# flyte-support
l
Hi team.. Ever since we upgraded to flyte backend 1.15, we’ve noticed that the `flytectl register` step fails if the image version is the same as the last run. This was not the case earlier. Is this the new/expected behavior?
r
Have run into a similar issue but for `pyflyte run`, I get a lot of "duplicate task / workflow" errors when the image changes (for ContainerTask) but everything else stays the same. (in my case, I'm using image with sha256, which does change, so the error is odd)
l
Ping.. can someone please confirm?
a
hey @little-cricket-84530 what error are you seeing? is flytekit also >= 1.15?
l
yes
If everything shows “Already exists” then in the end we end up with a 502
| Failed  | Error registering file due to rpc error: code =     |
04:00:33  |                                                                                                                                                         |         | Unavailable desc = unexpected HTTP status code      |
04:00:33  |                                                                                                                                                         |         | received from server: 502 (Bad Gateway); transport: |
04:00:33  |                                                                                                                                                         |         | received unexpected content-type "text/html"        |
if it’s a new version, then no problem
@average-finland-92144?
Happy to pull up any logs as needed…
I see `--continueOnError` as an option but I don’t want to use it so that I don’t mask any real errors
a
@little-cricket-84530 could you get logs from the flyteadmin and flytepropeller pods? (Assuming this is flyte-core)
l
on it
propeller logs appear to have data for ongoing runs.. the only other thing I see is this
I0411 18:05:59.538566       1 trace.go:236] Trace[1036750268]: "DeltaFIFO Pop Process" ID:my-namespace/ar7n7bnx7qnqwf25hgjh-fc5lxjwq-0,Depth:4327,Reason:slow event handlers blocking the queue (11-Apr-2025 18:05:59.247) (total time: 290ms):
Trace[1036750268]: [290.401036ms] [290.401036ms] END
I0411 18:23:19.037934       1 trace.go:236] Trace[445950959]: "DeltaFIFO Pop Process" ID:my-namespace/ar7n7bnx7qnqwf25hgjh-n5-0-n4655-0,Depth:3457,Reason:slow event handlers blocking the queue (11-Apr-2025 18:23:18.147) (total time: 890ms):
Trace[445950959]: [890.165301ms] [890.165301ms] END
admin logs…
2025/04/11 18:23:05 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/task_repo.go:59
[0.273ms] [rows:1] SELECT * FROM "tasks" WHERE "tasks"."project" = 'project' AND "tasks"."domain" = 'production' AND "tasks"."name" = 'flyte.workflows.my_file.map_task_839fbfe6bae4d6b30fb4f0dc51b2577e-arraynode' AND "tasks"."version" = 'a7522df3cf462b3af7d3fcbd4e5d31fce455515f' LIMIT 1

2025/04/11 18:23:05 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/task_repo.go:59
[0.306ms] [rows:1] SELECT * FROM "tasks" WHERE "tasks"."project" = 'project' AND "tasks"."domain" = 'production' AND "tasks"."name" = 'flyte.workflows.my_file.map_task_bd1acc9b03893350ca99f96f9365b599-arraynode' AND "tasks"."version" = 'a7522df3cf462b3af7d3fcbd4e5d31fce455515f' LIMIT 1

2025/04/11 18:23:05 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/workflow_repo.go:53
[0.351ms] [rows:1] SELECT * FROM "workflows" WHERE "workflows"."project" = 'project' AND "workflows"."domain" = 'production' AND "workflows"."name" = 'flyte.workflows.my_file.my_wf' AND "workflows"."version" = 'a7522df3cf462b3af7d3fcbd4e5d31fce455515f' LIMIT 1

2025/04/11 18:24:28 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/execution_repo.go:54
[1.330ms] [rows:1] SELECT * FROM "executions" WHERE "executions"."execution_project" = 'project' AND "executions"."execution_domain" = 'production' AND "executions"."execution_name" = 'axwbkzx4btckm822t597' LIMIT 1

2025/04/11 18:25:28 /go/src/github.com/flyteorg/flyteadmin/pkg/repositories/gormimpl/execution_repo.go:54
[1.085ms] [rows:1] SELECT * FROM "executions" WHERE "executions"."execution_project" = 'project' AND "executions"."execution_domain" = 'production' AND "executions"."execution_name" = 'axwbkzx4btckm822t597' LIMIT 1
For now I’m checking if the version exists for a given workflow and using that to determine whether to publish or not
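A minimal sketch of that check-then-register workaround, assuming the packaged archive is registered with `flytectl register files <archive> --archive`. The names `version_exists`, `build_register_command`, and `plan_registration` are hypothetical and not part of flytectl or flytekit. The lookup is injected as a callable, so it could be backed by whatever existence check is in use (for example flytekit's `FlyteRemote.fetch_workflow`, which is an assumption about the setup, not something confirmed in this thread).

```python
from typing import Callable, List, Optional

# Hypothetical lookup signature (an assumption, not a Flyte API):
# (project, domain, workflow_name, version) -> True if that version is already registered.
VersionLookup = Callable[[str, str, str, str], bool]


def build_register_command(project: str, domain: str, version: str, archive: str) -> List[str]:
    """Assemble a `flytectl register files` call for a packaged archive."""
    return [
        "flytectl", "register", "files", archive,
        "--archive",
        "--project", project,
        "--domain", domain,
        "--version", version,
    ]


def plan_registration(
    version_exists: VersionLookup,
    project: str,
    domain: str,
    workflow: str,
    version: str,
    archive: str,
) -> Optional[List[str]]:
    """Return the command to run, or None when the version is already registered."""
    if version_exists(project, domain, workflow, version):
        # Same git SHA as the last nightly run: skip instead of re-registering.
        return None
    return build_register_command(project, domain, version, archive)
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Because `--continueOnError` is not used, a genuine registration failure still fails the job, and only the "nothing changed" case is skipped.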
a
@little-cricket-84530 coming back to this. are you using ImageSpec? does this happen after you make a change in the code and you trigger the process of building the protobuf (`pyflyte package`) and then registering a new wf version?
l
this happens ONLY when the image doesn’t change. We have a nightly job that builds and registers the version.
so let’s say if there is no change in the code in 24 hours, the version stays the same (we use git sha)
a
got it could you share the full output of the `flytectl register` operation? I tried same flytekit version and same pattern and cannot repro this behavior
l
I’ll send it over in a bit