# ask-the-community
r
Hi Community! We are testing dynamic workflows and found an interesting system behaviour that I would like to discuss. We have a dynamic workflow that runs dynamic tasks, e.g.
n0
├── n0-0-dn0
├── n0-0-dn1
├── n0-0-dn2
├── n0-0-dn3
├── n0-0-dn4
├── n0-0-dn5
├── n0-0-dn6
├── n0-0-dn7
├── n0-0-dn8
└── n0-0-dn9
The nodes `n0-0-dn0 ... n0-0-dn9` run in parallel. If one of them (e.g. `n0-0-dn9`) fails, all the other nodes (`n0-0-dn0 ... n0-0-dn8`) are aborted by Flyte. Is this the intended behavior? Is it configurable? Re-running all the nodes because of a small intermittent issue in one of them could generate extra computational cost. @GF @anantharaman janakiraman @Aarthi Vellingiri
y
this is the intended behavior but it’s also customizable.
are you setting the failure mode?
there’s one called `FAIL_AFTER_EXECUTABLE_NODES_COMPLETE`
r
@Yee thanks for the hint, pretty useful! No, we are not setting the failure mode, we just went with the default settings. Let's give it a try! Thanks again!
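For anyone landing on this thread later, here is a toy sketch of the difference between the two behaviours discussed above: aborting the sibling nodes as soon as one fails, versus letting every executable node finish before the failure is reported. It uses plain `asyncio` and is not Flyte's scheduler; the helper names (`node`, `make_tasks`, `run_fail_immediately`, `run_fail_after_complete`) and the delays are made up for illustration.

```python
import asyncio


async def node(name, delay, fail=False):
    # Stand-in for one dynamic node; "delay" models its run time.
    await asyncio.sleep(delay)
    if fail:
        raise RuntimeError(f"{name} failed")
    return name


def make_tasks():
    # Must be called from inside a running event loop.
    # dn0..dn8 succeed after 0.05 s; dn9 fails early, after 0.01 s.
    tasks = [asyncio.create_task(node(f"n0-0-dn{i}", 0.05)) for i in range(9)]
    tasks.append(asyncio.create_task(node("n0-0-dn9", 0.01, fail=True)))
    return tasks


async def run_fail_immediately():
    # Models the default behaviour from the thread: the first failure
    # aborts every node that is still running.
    tasks = make_tasks()
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
    for t in pending:
        t.cancel()
    await asyncio.gather(*pending, return_exceptions=True)  # let cancellations settle
    succeeded = sorted(t.result() for t in done if t.exception() is None)
    return succeeded, len(pending)  # (finished nodes, number of aborted nodes)


async def run_fail_after_complete():
    # Models FAIL_AFTER_EXECUTABLE_NODES_COMPLETE: siblings keep running,
    # and the failure is only reported once they are all done.
    tasks = make_tasks()
    results = await asyncio.gather(*tasks, return_exceptions=True)
    succeeded = [r for r in results if not isinstance(r, Exception)]
    failed = [r for r in results if isinstance(r, Exception)]
    return succeeded, failed


if __name__ == "__main__":
    print(asyncio.run(run_fail_immediately()))
    print(asyncio.run(run_fail_after_complete()))
```

In the first mode the nine healthy nodes are cancelled and their work is lost; in the second they complete and only `n0-0-dn9` is reported as failed. In flytekit itself, as far as I know, the policy is selected by passing `failure_policy=WorkflowFailurePolicy.FAIL_AFTER_EXECUTABLE_NODES_COMPLETE` to the `@workflow` decorator; please check the docs for your version before relying on that.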