I noticed in the flyte UI that dynamics are grouped in attem Flyte #flyte-support

# flyte-support

nice-zebra-99977

12/06/2022, 5:53 PM

I noticed in the Flyte UI that dynamics are grouped into “attempts”. What does this mean?

hallowed-mouse-14616

12/06/2022, 6:54 PM

Can you provide a little more context? I'm not sure I fully understand.

nice-zebra-99977

12/06/2022, 6:56 PM

When running a dynamic at a very large scale we see these “attempts”

thankful-minister-83577

12/06/2022, 8:39 PM

do the earlier attempts fail?

nice-zebra-99977

12/06/2022, 8:40 PM

A few

nice-zebra-99977

12/06/2022, 8:40 PM

Not the entire run, just a few long-running jobs, from a timeout we need to fix. But a few attempts had no failures, and another attempt would still happen after.

thankful-minister-83577

12/06/2022, 8:41 PM

dan - retries for dynamic, do they re-compute the spec?

hallowed-mouse-14616

12/07/2022, 5:00 PM

So what seems to be happening here is that you have some dynamic task (e.g. `n0`) that generates a number of subtasks (e.g. `n0-0`, `n0-1`, and `n0-2`). If, when executing the dynamic task, subtasks `n0-0` and `n0-1` succeed but `n0-2` fails, then the top-level dynamic task (i.e. `n0`) fails. This is then retried, which causes all of the subtasks to be executed again (even those that previously succeeded), hence multiple attempts for each subtask. Does this sound correct? I think it would make sense to not retry the top-level dynamic task after the subtask workflow closure has been generated. If one of the subtasks fails, then the dynamic task should fail. Is this what you would expect? I know this is an abstraction that we use in map tasks.
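To make the cascade described above concrete, here is a toy simulation in plain Python. It is not Flyte code and does not call any Flyte API; `run_dynamic` and `fails_on` are hypothetical names invented for this sketch. It assumes that every retry of the parent re-runs all subtasks, as described in the message above.

```python
from collections import Counter


def run_dynamic(subtasks, fails_on, max_attempts):
    """Toy model of a dynamic node whose retry re-runs every subtask.

    subtasks: list of subtask names, e.g. ["n0-0", "n0-1", "n0-2"].
    fails_on: maps a subtask name to the set of parent attempts
              (0-based) on which that subtask fails (hypothetical input).
    max_attempts: total number of parent attempts allowed.
    Returns (succeeded, attempts_used, executions_per_subtask).
    """
    executions = Counter()
    for attempt in range(max_attempts):
        any_failed = False
        for name in subtasks:
            # Every subtask runs again on every parent attempt,
            # including the ones that succeeded last time.
            executions[name] += 1
            if attempt in fails_on.get(name, set()):
                any_failed = True
        if not any_failed:
            return True, attempt + 1, executions
    return False, max_attempts, executions


# n0-2 fails only on the first parent attempt.
ok, attempts, executions = run_dynamic(
    ["n0-0", "n0-1", "n0-2"], {"n0-2": {0}}, max_attempts=3
)
print(ok, attempts, dict(executions))
```

In this model a single failure in `n0-2` doubles the execution count of `n0-0` and `n0-1` as well, which is the per-subtask "attempts" grouping being asked about.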

hallowed-mouse-14616

12/07/2022, 5:03 PM

@thankful-minister-83577, yes it does look like the spec is recomputed. This is obviously an inefficiency, but it should not break correctness because none of the subtask executions are recovered.

nice-zebra-99977

12/07/2022, 6:12 PM

This makes a lot of sense, so even if one retry happens it will cascade to multiple jobs.

hallowed-mouse-14616

12/07/2022, 6:56 PM

Louis, would you mind filing an issue for this? I certainly think it is functionality that should be updated. Since it doesn't affect correctness it probably won't happen in the next few days, but it would be nice to have on the roadmap 😄.

nice-zebra-99977

12/07/2022, 6:56 PM

Ya no problem!

nice-zebra-99977

12/07/2022, 6:57 PM

I'll do it later today.
