I noticed in the flyte UI that dynamics are groupe...
# ask-the-community
l
I noticed in the Flyte UI that dynamics are grouped in “attempts”. What does this mean?
d
Can you provide a little more context? I'm not sure I fully understand.
l
When running a dynamic at a very large scale, we see these “attempts”.
y
do the earlier attempts fail?
l
A few.
Not the entire run, just a few long-running jobs that hit a timeout we need to fix. But a few attempts had no failures, and another attempt would still happen afterwards.
y
dan - retries for dynamic, do they re-compute the spec?
d
So what seems to be happening here is that you have some dynamic task (e.g. `n0`) that generates a number of subtasks (e.g. `n0-0`, `n0-1`, and `n0-2`). If, when executing the dynamic task, subtasks `n0-0` and `n0-1` succeed but `n0-2` fails, then the top-level dynamic task (i.e. `n0`) fails. This is then retried, which causes all of the subtasks to be executed again (even those that previously succeeded), hence multiple attempts for each subtask. Does this sound correct? I think it would make sense to not retry the top-level dynamic task after the subtask workflow closure has been generated. If one of the subtasks fails, then the dynamic task should fail. Is this what you would expect? I know this is an abstraction that we use in map tasks.
@Yee, yes it does look like the spec is recomputed. This is obviously an inefficiency, but it should not break correctness because none of the subtask executions are recovered.
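To make the cascade concrete, here is a rough sketch in plain Python (not flytekit, and not how Flyte is actually implemented). The names `run_dynamic`, `fails_on`, and `max_retries` are hypothetical, and it assumes that every subtask is executed again on each attempt of the parent, even if a sibling fails:

```python
from collections import Counter


def run_dynamic(subtasks, fails_on, max_retries):
    """Simulate a parent task whose subtasks are all re-run on every retry.

    fails_on maps a subtask name to the set of parent attempts (0-based)
    on which that subtask fails. This is a toy model, not Flyte's logic.
    """
    executions = Counter()
    for attempt in range(max_retries + 1):
        failed = False
        for name in subtasks:
            # Assumption: nothing is recovered from earlier attempts,
            # so every subtask is executed again.
            executions[name] += 1
            if attempt in fails_on.get(name, set()):
                failed = True
        if not failed:
            return attempt + 1, executions
    raise RuntimeError("dynamic task exhausted its retries")


# n0-2 fails only on the first attempt of the parent n0.
attempts, executions = run_dynamic(
    ["n0-0", "n0-1", "n0-2"], {"n0-2": {0}}, max_retries=1
)
print(attempts)          # 2
print(dict(executions))  # {'n0-0': 2, 'n0-1': 2, 'n0-2': 2}
```

In this toy model a single failing subtask doubles the number of executions for every subtask, which matches the repeated “attempts” seen in the UI.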
l
This makes a lot of sense, so even if one retry happens it will cascade to multiple jobs.
d
Louis, would you mind filing an issue for this? I certainly think it is functionality that should be updated. Since it doesn't affect correctness it probably won't happen in the next few days, but it would be nice to have on the roadmap 😄.
l
Ya no problem!
I'll do it later today.