# flyte-support
We have a Flyte setup with a number of different workflows, some dependent on each other. For now, we build a separate image for each workflow and register them one by one. When we want to run the whole pipeline, we start the workflows one by one from the UI and monitor progress; when one succeeds, we launch the next. Now we want to orchestrate the workflows programmatically. Our initial idea is to use FlyteRemote to start the latest production version of each workflow and, upon success, start the next one. This would be built into a higher-level "orchestration" Flyte workflow. The issue with this approach is that it takes a lot of boilerplate to launch each subworkflow. Additionally, monitoring at the orchestration-workflow level is not great: to see which tasks are running, one must find the subworkflow execution and monitor from there. We are wondering if there is a more "native" way of doing this. Some requirements for the setup:
• We must be able to use different images per workflow.
• It must be frictionless to develop the subworkflows as separate entities, and then, upon release, use the latest version in the orchestration flow.
The main issue we are running into: how do we serialize and register all of the subworkflows at the same time (so we can import them in the orchestration workflow) and use a different image for each (because we have different requirements for each)? Any ideas on the topic are welcome, thanks in advance!
---
Hi there! I'm on the product side at Union. In Flyte, I believe you can serialize/register everything in your source root directory by using the `--copy` flag in the Pyflyte CLI (docs). However, I don't think you actually need to register everything at the same time you run it: you should be able to create a new execution of an existing workflow (and the same for each subworkflow you want to execute) using `FlyteRemote` (docs). Note that if you want to run FlyteRemote inside a task (as in your higher-level Flyte workflow), you need to make sure the task is authenticated with Flyte. In terms of observability, agreed, this is somewhat limited, since Flyte doesn't have a notion that the workflows are related to each other. You can definitely use `FlyteRemote` to fetch the execution status of each of your subworkflows, but there is no meta-DAG in the UI. For your image management, I would look into ImageSpec, which lets you define your requirements in code and specify images at the task level, so you have lots of flexibility there. If you'd be open to looking at Union, we have built a more native way for workflows to trigger each other called Reactive Workflows (launch blog and docs).
---
Hi John, thanks a lot for your thoughts! Will check `--copy` and whether it could be useful 🙂 As mentioned, `FlyteRemote` is our initial thought as well. It's just a shame that we would lose out on observability and have to write a lot of custom logic to orchestrate the workflows. To me, this feels like exactly the kind of thing an orchestration tool should enable: building and running workflows separately and being able to chain them together in a neat way. I'll have a look at what you have built, looks cool!
---
> To me, it feels like this kind of thing would be exactly what should be enabled by an orchestration tool. Building and running workflows separately and being able to chain them together in a neat way.

Taking a step back, do you actually need to separate out the subworkflows? It is possible to run workflows of workflows in Flyte: https://docs.flyte.org/en/latest/user_guide/advanced_composition/subworkflows.html
Basically, if you are okay keeping everything in a single execution ID (which it seems like you want to do), you should be able to nest workflows as necessary. Also check out dynamic workflows: https://docs.flyte.org/en/latest/user_guide/advanced_composition/dynamic_workflows.html#dynamic-workflow
And, sorry to overload with information, but if you need to register those workflows separately, you can check out reference workflows, which let you use an entity that is already registered: https://docs.flyte.org/en/latest/api/flytekit/generated/flytekit.reference_workflow.html
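If reference workflows fit, the orchestrator could look roughly like this sketch; the project, domain, names, and versions are placeholders, and each stub's signature must match the registered workflow:

```python
from flytekit import reference_workflow, workflow

# Stubs pointing at already-registered subworkflows; the body is never run,
# only the reference (project/domain/name/version) matters.
@reference_workflow(
    project="my-project",        # placeholder
    domain="production",
    name="workflows.subwf1.wf",  # placeholder registered name
    version="v1",
)
def subwf1(run_date: str) -> str:
    ...

@reference_workflow(
    project="my-project",
    domain="production",
    name="workflows.subwf2.wf",
    version="v1",
)
def subwf2(dataset_uri: str) -> str:
    ...

# The orchestrator chains them; Flyte orders the nodes by data dependency.
@workflow
def orchestrator(run_date: str) -> str:
    dataset_uri = subwf1(run_date=run_date)
    return subwf2(dataset_uri=dataset_uri)
```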
---
Did not know about reference workflows!! Appreciate that, might just work for us. And good point, in theory we could just slam everything into one workflow. The issues with that would be (in my mind):
• Multiple devs work on different parts of the pipeline > versioning gets painful when everything is re-registered even though you actually changed only one part.
• We have very different requirements for different parts of the flow (e.g. GPU/CPU), so we would still need separate images linked to different parts of the orchestrated workflow. It seems a bit much to have to specify the image on each task in the flow.
Running "workflows of workflows" is exactly what we need, but we don't want all of the workflows to use the same image, and we want to keep being able to easily build and run the workflows separately.
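If I understand ImageSpec right, something like this sketch would let each subworkflow module carry its own image next to its tasks, so the orchestrator never has to know about it (the registry and packages here are made up):

```python
# workflows/subwf1/subwf1_wf.py  (hypothetical module from this thread)
from flytekit import ImageSpec, task, workflow

# Built and pushed automatically by pyflyte at registration time.
gpu_image = ImageSpec(
    name="subwf1",
    registry="ghcr.io/my-org",   # hypothetical registry
    packages=["torch==2.2.0"],   # hypothetical requirements
)

@task(container_image=gpu_image)
def train() -> str:
    return "s3://bucket/model"   # placeholder output

@workflow
def wf() -> str:
    return train()
```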
---
Flyte does let you set resources at the workflow level (as well as per domain, per project, and per task): https://docs.flyte.org/en/latest/deployment/configuration/customizable_resources.html
I would need to check whether the same is possible for images.
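For the per-task end of that, requests/limits can also be declared directly in code, independent of the platform-level overrides above; a small sketch (values illustrative):

```python
from flytekit import Resources, task

# Per-task resource requests/limits set in code rather than platform config.
@task(requests=Resources(cpu="2", mem="4Gi"), limits=Resources(cpu="4", mem="8Gi"))
def heavy_step(n: int) -> int:
    return n * 2
```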
---
We do our infra deployment through Argo CD and I'm not sure we can mix the resource configs with the application code 🤔 Do you know if this is possible: a project structure like
```
workflows/
  orchestration/
    orchestrator_wf.py
    Makefile
    Dockerfile
  subwf1/
    subwf1_wf.py
    Makefile
    Dockerfile
  subwf2/
    subwf2_wf.py
    Makefile
    Dockerfile
```
In orchestrator_wf.py we would import the workflows from `subwf1_wf.py` and `subwf2_wf.py`, and thus we would get the full DAGs in the UI. In orchestration/Makefile we would:
1. Build three images based on the Dockerfiles in `orchestration`, `subwf1`, and `subwf2`.
2. Package and register everything in a way where each workflow uses its own image when launched.
This is how I would ideally build this, but not sure if we can register in that manner. Also, this would enable developing the individual subworkflows without worrying about the main workflow.