#ask-the-community

Art Gillespie

12/15/2023, 5:59 PM
👋 New to flyte. I’m developing a flyte workflow that downloads ~millions of 3D files from a list of URLs, stores them in blob storage, then runs other render/convert tasks on the files in a custom container pre-loaded with headless Blender. Everything works in small test runs, but I’m running into issues with how to handle failed tasks in `map_task` (e.g., if the download task fails permanently with a 404), and I’m starting to wonder if this is the wrong design approach for this kind of workflow. I’ve built a small test using `@dynamic` for fanout instead of `map_task`, but I note that even if I set the parent workflow’s `failure_policy` to `flytekit.WorkflowFailurePolicy.FAIL_AFTER_EXECUTABLE_NODES_COMPLETE`, the fanned-out tasks all complete but the `@dynamic` workflow still fails and doesn’t return partial results. Curious how others in the community approach these kinds of jobs with flyte?

Samhita Alla

12/18/2023, 10:19 AM
have you tried setting `min_success_ratio` in your map task? it determines the minimum fraction of total jobs that must complete successfully before terminating the map task and marking it as successful.

Art Gillespie

12/18/2023, 6:37 PM
Thanks for the pointer! Looking at the docs, it sounds like setting `min_success_ratio` will terminate the `map_task` once the minimum ratio of successful tasks is reached. Am I misreading?
min_success_ratio (float) – If specified, this determines the minimum fraction of total jobs which can complete successfully before terminating this task and marking it successful.
What I think I want is for `map_task` to run the mappable task for all of the inputs, but only fail the overall task if some threshold of mapped inputs led to an error.
Ok, I wrote a small test and `min_success_ratio` appears to do what I’m looking for. Thanks! 🎉
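The behavior described above (run the task for every input, then fail the overall job only if too few inputs succeeded) can be modeled in plain Python. This is only a sketch of the semantics, not flytekit's implementation: `run_all_with_threshold` and `fake_download` are hypothetical names, and the loop is sequential where a real map task runs in parallel.

```python
from typing import Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_all_with_threshold(
    fn: Callable[[T], R],
    inputs: Sequence[T],
    min_success_ratio: float = 1.0,
) -> List[Optional[R]]:
    """Run fn on every input; raise only if too few calls succeeded."""
    results: List[Optional[R]] = []
    successes = 0
    for item in inputs:
        try:
            results.append(fn(item))
            successes += 1
        except Exception:
            # A failed input becomes None instead of aborting the whole run.
            results.append(None)

    # The threshold is checked once, after every input has been attempted.
    if inputs and successes / len(inputs) < min_success_ratio:
        raise RuntimeError(
            f"only {successes}/{len(inputs)} inputs succeeded, "
            f"below min_success_ratio={min_success_ratio}"
        )
    return results


def fake_download(url: str) -> str:
    # Hypothetical stand-in for a download task: a URL containing "404"
    # simulates a permanent failure.
    if "404" in url:
        raise FileNotFoundError(url)
    return f"blob://{url}"
```

With four URLs where one fails, a ratio of 0.75 returns the three successful results plus a `None` placeholder, while a ratio of 0.9 raises after all four have been attempted.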
I’m a little confused, though: the docstrings for both `core.map_task` and `models.ArrayJob` are pretty explicit that `map_task` will stop and compute results once the number of successful mapped tasks satisfies `min_success_ratio` or `min_successes`, but that’s not the behavior I’m seeing.
map_task.py:57
:param min_success_ratio: If specified, this determines the minimum fraction of total jobs which must complete
            successfully before terminating this task and marking it unsuccessful
ArrayJob.py:18
:param int min_successes: An absolute number of the minimum number of successful completions of subtasks. As
            soon as this criteria is met, the array job will be marked as successful and outputs will be computed.
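For contrast, here is a plain-Python sketch of what the docstrings appear to describe when read literally: stop as soon as `min_successes` is reached and leave the remaining inputs unprocessed. It is a hypothetical, sequential model (`run_until_min_successes` is not a flytekit function) and is included only to show how this reading differs from the run-everything behavior reported in the thread.

```python
from typing import Callable, List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_until_min_successes(
    fn: Callable[[T], R],
    inputs: Sequence[T],
    min_successes: int,
) -> Tuple[List[Optional[R]], int]:
    """Literal reading of the docstring: stop once min_successes is met.

    Returns the results (None for failed or never-attempted inputs) and
    the number of inputs that were actually attempted.
    """
    results: List[Optional[R]] = [None] * len(inputs)
    successes = 0
    attempted = 0
    for i, item in enumerate(inputs):
        if successes >= min_successes:
            # Under this reading, the remaining inputs are never run.
            break
        attempted += 1
        try:
            results[i] = fn(item)
            successes += 1
        except Exception:
            # Failed inputs keep their None placeholder.
            pass
    return results, attempted
```

Under this reading, five inputs with `min_successes=2` would leave the last three untouched, which does not match what the small test in the thread showed.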

Samhita Alla

12/19/2023, 7:28 AM
looks like we need to update the docstrings and the docs! would you mind filing an issue? [flyte-docs]

Samhita Alla

12/19/2023, 7:29 AM
also, array node serves as a replacement for map tasks. although it's still in the experimental phase, we highly encourage you to use it.