Flyte #flyte-support

square-carpet-13590

05/10/2024, 6:24 PM

Hello Flyte Community, I hope you're all doing well. I have some questions regarding subworkflows in Flyte, and I'd appreciate your guidance on the following topics:

When to use a subworkflow: Could you please clarify when it is more beneficial to use a subworkflow by importing a workflow versus using a reference workflow (via reference_launch_plan / reference_workflow)? Are there any best practices or specific cases where one approach is preferred over the other? Are there any risks in using reference_workflow at large scale compared to an imported workflow?

Handling subworkflow failures: Is it possible to handle subworkflow failures from the parent workflow at the node level? We want to handle the subworkflow failure gracefully in the parent workflow and still have its dependent tasks executed.

Thank you in advance for your help! cc @glamorous-rainbow-77959 @clever-exabyte-82294

tall-lock-23197

05/16/2024, 9:02 AM

hi!
1. reference workflow is useful when your workflow is present in a different project-domain; if you're able to import a workflow, then you can use it as a subworkflow directly
2. i believe so: https://docs.flyte.org/en/latest/api/flytekit/generated/flytekit.WorkflowFailurePolicy.html

glamorous-rainbow-77959

05/17/2024, 7:48 AM

@tall-lock-23197 `FAIL_AFTER_EXECUTABLE_NODES_COMPLETE` can allow all other nodes to complete, but how would we catch a sub-workflow error inside the parent workflow? We need to act differently based on what happened there, and depending on which step has failed.

tall-lock-23197

05/17/2024, 10:04 AM

if the subworkflow is in the same context as that of the current workflow, meaning if it isn't an external execution, then it should be possible.

square-carpet-13590

05/17/2024, 11:16 AM

Untitled

square-carpet-13590

05/17/2024, 11:25 AM

@tall-lock-23197 here is the example that we've tried, where a failure is simulated for subworkflow_a, so it fails. task_g depends on subworkflow_a, but in some cases the user might still want to execute task_g without the outputs of subworkflow_a and b. We couldn't find a way to handle the failure of subworkflow_a from main_workflow. Attached are the screenshot and workflow code for reference.

Failure handling from the main/parent workflow can provide flexibility and more control, as one might not always have control over subworkflows from different teams/modules while orchestrating large and complex workflows.

tall-lock-23197

05/17/2024, 1:54 PM

why would you want to run `task_g` when a node output that it depends on isn't available?

square-carpet-13590

05/20/2024, 6:26 AM

So for our use case, task_a and task_b could be model inference workflows, and task_g could be post-processing on the generated recommendations. Even if task_a fails, task_g can proceed; failed subworkflows can be grouped and re-triggered later.

tall-lock-23197

05/20/2024, 7:06 AM

okay, i think this has to work when `task_a`'s output is annotated with `Optional` in `task_g`. @high-accountant-32689 is this something we should support?

glamorous-rainbow-77959

05/20/2024, 9:35 AM

In case this is not supported, it would also be great to get advice on whether it's possible to work around this in the current Flyte version.

glamorous-rainbow-77959

05/24/2024, 8:35 AM

Hi @tall-lock-23197 @high-accountant-32689. Could someone guide us on error handling for subworkflows? Just reiterating the question in case anyone has done something similar: we have a set of sub-workflows that can fail internally, but the parent workflow should gracefully handle errors and continue its execution. The parent workflow orchestrates the execution of multiple inference pipelines with different execution logic. Some of them are parallelizable, some are linear. And, as stated, any of them can fail. The parent workflow then needs to understand which ones failed and which completed successfully, and act based on that.

freezing-airport-6809

05/24/2024, 2:00 PM

So failures are failures. Failures will eventually fail the workflow unless you use map tasks, which have a failure threshold.

One option is to not make it a failure: return None?

There is an on-failure handler, but it will purposely not make the workflow succeed today. Should we make it succeed? If so, I need help understanding the use case, maybe a rough prototype.

The last option is to use @eager (experimental), which allows arbitrary code and is an escape hatch.
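The "don't make it a failure, return None" option can be sketched roughly as below. This is plain Python with no flytekit imports, so it only illustrates the pattern, not Flyte's API. The names `subworkflow_a`, `subworkflow_b`, `task_g` and `main_workflow` come from the thread; `StepResult` and `run_safely` are hypothetical helpers made up for this sketch. In Flyte, the `try/except` would presumably have to live inside a task body, since a workflow body only wires nodes together.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class StepResult:
    """Outcome of one sub-pipeline: a value on success, an error message on failure."""
    name: str
    value: Optional[float] = None
    error: Optional[str] = None

    @property
    def ok(self) -> bool:
        return self.error is None


def run_safely(name: str, fn: Callable[[], float]) -> StepResult:
    """Hypothetical helper: run fn and turn any exception into data instead of raising."""
    try:
        return StepResult(name=name, value=fn())
    except Exception as exc:  # deliberately broad: the failure becomes a return value
        return StepResult(name=name, error=f"{type(exc).__name__}: {exc}")


def subworkflow_a() -> float:
    # Stand-in for an inference pipeline that fails.
    raise RuntimeError("simulated failure")


def subworkflow_b() -> float:
    # Stand-in for an inference pipeline that succeeds.
    return 0.75


def task_g(results: List[StepResult]) -> Dict[str, List[str]]:
    """Post-process whatever succeeded and report what should be re-triggered later."""
    return {
        "processed": [r.name for r in results if r.ok],
        "retry_later": [r.name for r in results if not r.ok],
    }


def main_workflow() -> Dict[str, List[str]]:
    results = [
        run_safely("subworkflow_a", subworkflow_a),
        run_safely("subworkflow_b", subworkflow_b),
    ]
    return task_g(results)


if __name__ == "__main__":
    print(main_workflow())
```

Because the failure travels as an ordinary return value, the downstream step always has an input to run on, and the parent can tell which pipelines failed without relying on the workflow's failure policy.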

freezing-airport-6809

05/24/2024, 2:01 PM

Happy to jump on a call to understand in more detail

glamorous-rainbow-77959

05/24/2024, 2:44 PM

@freezing-airport-6809 we would be glad to jump on a call and discuss our use case. Would Monday or Tuesday work? Which timezone are you in?

freezing-airport-6809

05/24/2024, 4:14 PM

I am in Pacific time. Let me DM you.

clever-exabyte-82294

07/09/2024, 9:24 PM

@freezing-airport-6809 I'm trying to build a POC of Eager. However, I'm unable to create the FlyteRemote needed for Eager. Our setup uses an unauthenticated, internal HTTPS Flyte endpoint on AWS, with no public access. Based on the documentation, I see that we need client_secret_key. However, we don't have this key, as we don't need authentication. https://docs.flyte.org/en/latest/user_guide/advanced_composition/eager_workflows.html#remote-flyte-cluster-execution Is there a way to handle the case where the task is running in a Flyte cluster and needs to access the cluster itself?

clever-exabyte-82294

07/09/2024, 10:36 PM

Nevermind.. got it working.
