Flyte workflows do not support standard Python try/except error handling at the workflow level. The workflow body is a domain-specific language (DSL) for building an execution graph, not regular Python, so an exception raised inside a task cannot be caught with try/except in the workflow body. Instead, Flyte provides workflow-level error handling through the failure node feature: you specify an on_failure task or workflow that runs if any node fails. This does not let you recover and continue the workflow after a failure. The execution is still marked as failed, and the failure handler is meant for cleanup or notification, not for resuming normal workflow logic (Flyte failure node docs, Flyte core concepts).
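To make the "graph, not regular Python" point concrete, here is a toy model in plain Python. It is not Flyte's implementation; `Graph`, `Node`, `flaky`, and `workflow_body` are hypothetical names used only for illustration. It shows that when calling a task merely records a node, a try/except around that call has nothing to catch, because the exception is raised later, when the graph runs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """Placeholder recorded when a task is called in the workflow body."""
    name: str
    fn: Callable[[], int]


class Graph:
    """Toy execution graph: tasks are registered first and run later."""

    def __init__(self) -> None:
        self.nodes: List[Node] = []

    def task(self, fn: Callable[[], int]) -> Callable[[], Node]:
        # Calling the decorated task only records a node; fn is NOT run here.
        def register() -> Node:
            node = Node(name=fn.__name__, fn=fn)
            self.nodes.append(node)
            return node

        return register

    def run(self) -> List[int]:
        # Task bodies execute here, long after the workflow body has returned.
        return [node.fn() for node in self.nodes]


graph = Graph()


@graph.task
def flaky() -> int:
    raise ValueError("task failed")


caught_in_body: List[bool] = []


def workflow_body() -> None:
    try:
        flaky()  # records a node, raises nothing
    except ValueError:
        caught_in_body.append(True)  # never reached in this model


workflow_body()  # "compile" step: builds the graph

failed_at_run = False
try:
    graph.run()  # "execute" step: the exception surfaces here
except ValueError:
    failed_at_run = True
```

In this model the except branch in `workflow_body` never fires, while the error still surfaces at run time, outside the workflow body.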
Continuing a workflow after a task fails (that is, "catching" the error and proceeding) is not natively supported in Flyte's workflow DSL. The on_failure handler is the only supported mechanism, and it covers cleanup or notification only. The community is discussing more flexible error handling, such as try/catch or allowing some nodes to fail without failing the workflow, but this is not available yet (Flyte GitHub issue #5513, Flyte GitHub issue #878).
Would you like more detail or example code for using the failure node/on_failure handler?
Sources:
- Flyte failure node docs
- Flyte GitHub issue #5513
- Flyte GitHub issue #878
- Flyte core concepts