Does Flyte have concept for handling errors at wor...
# ask-the-community
f
Does Flyte have a concept for handling errors at workflow execution time? For example, it would be nice to be able to react to a failing pod:
• by trying something else (schedule a pod with twice as much memory to train on)
• by creating a Datadog / New Relic incident
• in general, just being able to run a task as a reaction would open up a world of use cases
k
You can listen / subscribe to Flyte external event egress and do anything really. By default it does have an emailing system
f
At that point we are out of the workflow context though. It would be great to be able to react within the workflow if pods fail. This has been a feature on Kubeflow for many years. I’m happy to create a feature request as well. Edit: Looks like this exists and is the second most upvoted issue on GitHub. But it has been open since 2021 😢 and was removed from the 1.4 milestone: https://github.com/flyteorg/flyte/issues/1506
k
Ya, this exists in the backend, just not in flytekit. We will get to it in summer. On the other hand, an even more powerful but experimental feature will be coming soon. Cc @Niels Bantilan. We call it eager mode.
f
Could you explain what “eager” mode is? 😅
n
here’s the RFC draft for eager mode @Ferdinand von den Eichen: https://github.com/flyteorg/flyte/discussions/3396 In summary, eager workflows allow you to write workflows using native Python `async` syntax, where you trade off static compilation and analysis for complete dynamism in the execution graph. See the table below that compares `@workflow`, `@dynamic`, and `@eager`. See the code example in the RFC to see how you can use `try..except`, `if… elif… else`, `for` and `while` loops, etc. in your workflows.
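To make the idea concrete, here is a rough sketch of the "retry with twice as much memory, and raise an incident on failure" pattern from the original question, written with plain Python `asyncio` and `try..except`. It does not use flytekit or the `@eager` API from the RFC: `train`, `notify`, `train_with_fallback`, `OutOfMemory`, and the 8 GB threshold are all hypothetical stand-ins, and only the control flow is meant to carry over.

```python
import asyncio

# Hypothetical in-memory incident log; a real setup might call an alerting API instead.
incidents: list[str] = []


class OutOfMemory(Exception):
    """Hypothetical error standing in for a pod that ran out of memory."""


async def train(memory_gb: int) -> str:
    # Stand-in for a training task: pretend it needs at least 8 GB (assumed value).
    if memory_gb < 8:
        raise OutOfMemory(f"training failed with {memory_gb} GB")
    return f"trained with {memory_gb} GB"


async def notify(message: str) -> None:
    # Stand-in for creating an incident in an external system.
    incidents.append(message)


async def train_with_fallback(memory_gb: int, max_attempts: int = 3) -> str:
    # React to a failure inside the workflow: record it, double the memory, retry.
    for _ in range(max_attempts):
        try:
            return await train(memory_gb)
        except OutOfMemory as err:
            await notify(str(err))
            memory_gb *= 2
    raise RuntimeError(f"giving up after {max_attempts} attempts")


if __name__ == "__main__":
    print(asyncio.run(train_with_fallback(2)))
```

Starting from 2 GB, the sketch fails twice, records two incidents, and succeeds on the third attempt with 8 GB. In an eager workflow the awaited calls would be Flyte tasks rather than local coroutines, so how resources are actually overridden per attempt would depend on the flytekit API.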
k
@James Sutton had an opinion that we should call it `@unsafe`
n
instead of `@eager`?
k
ya haha - borrowing from rust
j
kind of, eager still makes a ton of sense
I think the rust analogy works
but obviously we don't want to steal a term that ends up adding more confusion
n
yeah, not sure we want to use terms that have a specific meaning in one language that doesn’t quite fit the Flyte programming paradigm.
j
eager borrows (imo) from the tf 2.0 release, when they launched "eager" mode; it definitely is apt imo
n
`@async_dynamic` and a few others have also been thrown into the hat; we can have further discussions about naming in the RFC draft as well
`async_workflow` would also make sense
f
Very interesting, thanks for sharing! Excited to try this as I suppose it really addresses our needs