# ask-the-community
v
Hey Folks! I have a use-case where I'd like to backfill/rerun from middle of the workflow. A simple example:
Say I have one workflow (A) that has three subworkflows (X, Y, Z).
- Y depends on X
- Z depends on Y

Essentially, X -> Y -> Z

Let's assume X, Y & Z were successful. However, subworkflow Y generated bad data, i.e. it has bugs and needs to be re-run.

My goal is to re-run Y and its downstreams, i.e. I want Z to pick up the latest partitions of Y.
This would be similar to the "Clear Downstreams" option in Airflow. Does something like this exist in Flyte? I looked at recover, but this only seems to be for Failed tasks.
cc: @Andrés Gómez Ferrer
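For reference, the "re-run Y and its downstreams" selection described above can be sketched in plain Python. This is only a minimal illustration of the idea, not a Flyte or Airflow API: the `DEPENDS_ON` map and the `nodes_to_rerun` helper are hypothetical names made up for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map for workflow A: each node lists what it depends on.
DEPENDS_ON = {"X": [], "Y": ["X"], "Z": ["Y"]}

def nodes_to_rerun(depends_on, start):
    """Return `start` plus all of its transitive downstream nodes, in run order."""
    affected = {start}
    order = []
    # static_order() yields every node after all of its dependencies,
    # so by the time we see a node, its upstream nodes are already classified.
    for node in TopologicalSorter(depends_on).static_order():
        if node == start or any(dep in affected for dep in depends_on.get(node, [])):
            affected.add(node)
            order.append(node)
    return order

print(nodes_to_rerun(DEPENDS_ON, "Y"))  # ['Y', 'Z']
```

Starting from Y, X is left alone and only Y and Z are selected, which is the behaviour being asked for here.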
k
Did you use caching? If not, then simply use reference workflows to construct a new workflow on the fly and run it.
Open up a Jupyter notebook, use remote.fetch_launch_plan, and then simply compose and register from Jupyter, and off you go.
v
Right, so I can potentially invalidate the cache and run it, but it's still quite a bit of manual work if you have 200+ tasks in a workflow and you want to run it from a task in between. Ideally, I'd like to avoid using a Jupyter notebook and find an alternative for a production setting, something similar to how Airflow offers it with the click of a button?
k
This is a feature request
I guess you could run one task (as a single task) and then run the entire workflow.
This would cause the subsequent tasks to run automatically.
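A toy model of why this could work when caching is enabled: if the cache key of a task depends on its input values, then re-running Y alone (overwriting its cached output) changes Z's inputs, so Z misses the cache on the next full run. The sketch below is plain Python and not Flyte's actual cache implementation; `run_cached`, `cache_key`, the task functions, and the date value are all hypothetical.

```python
import hashlib
import json

cache = {}     # cache key -> stored output
executed = []  # names of tasks that actually ran (cache misses or overwrites)

def cache_key(name, version, inputs):
    # Assumption: the key is derived from task name, cache version, and input values.
    payload = json.dumps([name, version, inputs], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def run_cached(name, version, fn, inputs, overwrite=False):
    key = cache_key(name, version, inputs)
    if overwrite or key not in cache:
        executed.append(name)
        cache[key] = fn(**inputs)
    return cache[key]

def x_task(day): return f"x/{day}"
def y_buggy(x): return f"{x}/y-bad"
def y_fixed(x): return f"{x}/y-good"
def z_task(y): return f"{y}/z"

def workflow(day, y_fn):
    x = run_cached("X", "1", x_task, {"day": day})
    y = run_cached("Y", "1", y_fn, {"x": x})
    return run_cached("Z", "1", z_task, {"y": y})

workflow("2024-01-01", y_buggy)  # first run: X, Y, and Z all execute
# Re-run the fixed Y on its own, overwriting its cached output.
run_cached("Y", "1", y_fixed, {"x": "x/2024-01-01"}, overwrite=True)

executed.clear()
result = workflow("2024-01-01", y_fixed)  # full re-run of the workflow
print(executed)  # ['Z']
```

In this model X and Y are cache hits on the second full run, and only Z executes because its input changed. The caveat is that if Y's new output value is identical to the old one (for example, the same path with data overwritten in place), Z would still hit the cache, so this is worth verifying on a real workflow.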
v
ahh got it, thank you for the support! 🙂
k
Can you please come back and tell us if it works? I will also try to reproduce it, just to make sure it works (it should). One problem I see is that ReRun task does not automatically copy the inputs to the launch form 😞 - please file an issue.
v
"I guess you could run one task (as a single task) and then run the entire workflow"
For the above, you're expecting that running a single task invalidates the subsequent tasks as well, right? And when I re-run the workflow, it should run all the downstreams as well?
k
Yup