# ask-the-community
Hi all. A while back I mentioned some pain points I feel related to scheduled workflow executions. Some of these can be filed under "things I miss about Airflow", but they are more generally related to backfills and scheduled launch plans. I was asked to write up a list and put it in a feature request, but I thought I'd start here to gather any other ideas before getting to a specific set of feature requests. Here are a few thoughts (not at the level of specificity of a design doc):
• Problem: When creating a new scheduled launch plan, I often want to backfill to some given date.
  ◦ Potential approach: Include an optional argument that indicates how far back in the past the time-based view should go. All past workflow executions will be scheduled when this is registered, perhaps sequentially.
  ◦ Defaults to a value based on the time the schedule was activated. That is, by default no backfill is done.
  ◦ This determines the number of backfill "slots" to be executed.
• Problem: The workflow UI shows the list of workflow executions based on the time they were run. But for scheduled launch plans, I want to see a view based on the scheduled times. How do I know if a particular failed scheduled workflow execution ever had a subsequent success?
  ◦ Potential approach: Have a "schedule time" based view for each schedule.
  ◦ Visually display the status of each execution slot in the backfill by the status of the last execution, e.g., success, failure, running, etc.
  ◦ Click on an execution slot to relaunch a failed or successful task, or to launch a task that has not yet run.
• Problem: Running backfills for a scheduled launch plan is tedious. Is there a better way than just a lot of clicking in the UI?
  ◦ Potential approach: Provide backfill capabilities in the UI for scheduled launch plans. For example, select a launch plan with a schedule, select a start datetime and end datetime, then run.
  ◦ Provide a few concurrency and backfill settings (borrowed from Airflow).
  ◦ One setting caps the number of executions running concurrently. Setting it to 1 would backfill sequentially, one at a time.
• Problem: In the workflow UI, in the section "All Executions in the Workflow", there is no easy way to distinguish between scheduled runs and manually triggered runs. And the inputs to the execution are not readily available in this view.
  ◦ Potential approach: Enable a filtered view of only those executions that were initiated by a schedule (rather than by the user launching via the UI).
  ◦ Filter by the name of the launch plan used to trigger the workflow execution.
  ◦ Include columns for the value of the time-based argument passed to the workflow, and/or the name of the launch plan used to trigger the execution.
I'd be happy to file these as feature requests, but I'd welcome any feedback before I get started.
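To make the "slot" idea a bit more concrete, here is a minimal sketch in plain Python. It assumes a fixed-interval schedule and a simple record per execution; the names `Execution`, `backfill_slots`, `slot_statuses`, and the status strings are hypothetical and are not part of Flyte or Airflow.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Execution:
    # Hypothetical record of one workflow execution (not a Flyte type).
    scheduled_at: datetime  # the time-based argument passed to the workflow
    started_at: datetime    # when the execution actually ran
    status: str             # e.g. "SUCCEEDED", "FAILED", "RUNNING"


def backfill_slots(start, end, interval):
    """Return every scheduled time in [start, end) at a fixed interval."""
    slots = []
    current = start
    while current < end:
        slots.append(current)
        current += interval
    return slots


def slot_statuses(slots, executions):
    """Map each slot to the status of its most recent execution."""
    latest = {}
    for ex in sorted(executions, key=lambda e: e.started_at):
        latest[ex.scheduled_at] = ex.status  # later runs overwrite earlier ones
    return {slot: latest.get(slot, "NOT_RUN") for slot in slots}


# Three daily slots; Jan 1 failed first and was rerun later, Jan 3 never ran.
slots = backfill_slots(datetime(2023, 1, 1), datetime(2023, 1, 4), timedelta(days=1))
executions = [
    Execution(datetime(2023, 1, 1), datetime(2023, 1, 1, 0, 5), "FAILED"),
    Execution(datetime(2023, 1, 1), datetime(2023, 1, 5, 9, 0), "SUCCEEDED"),
    Execution(datetime(2023, 1, 2), datetime(2023, 1, 2, 0, 5), "SUCCEEDED"),
]
statuses = slot_statuses(slots, executions)
```

Keying on the scheduled time rather than the run time is what answers the "did this failed slot ever succeed later?" question: the rerun of the Jan 1 slot replaces its earlier failure.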
@Dennis O'Brien thank you for the feedback. A while back we were asking about how we can improve. We did add one thing and want to get your take on it. It's not yet documented, so it's early, but it's a very simple feature: 'pyflyte backfill' https://github.com/flyteorg/flytekit/pull/1420 We are also working on Flyte execution tags to allow arbitrary groupings, like grouping by schedule tags. Maybe we should show the executions view under the launch plan, already filtered? https://github.com/flyteorg/flyte/pull/3320
But cc @Niels Bantilan
I do want to update the UI a little bit. The problem is holes. How do holes even happen? Is it because you turned off a specific time instance? More on this: for scheduled launch plans, should the UI simply interpolate the possible previous time ranges and then fill them with something like grayed-out bars?
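One way to think about the holes, as a rough sketch only: compare the times the schedule should have fired against the scheduled times that actually have an execution, and group the consecutive misses into ranges that the UI could gray out. The function name `find_holes` and the daily interval are assumptions for illustration, not anything that exists in Flyte.

```python
from datetime import datetime, timedelta


def find_holes(expected, observed):
    """Group consecutive expected times with no execution into (first, last) ranges."""
    holes = []
    run_start = None  # first missing slot of the hole currently being built
    previous = None   # the slot seen in the previous iteration
    for slot in expected:
        if slot in observed:
            if run_start is not None:
                holes.append((run_start, previous))  # close the open hole
                run_start = None
        elif run_start is None:
            run_start = slot  # open a new hole
        previous = slot
    if run_start is not None:
        holes.append((run_start, previous))  # hole runs to the end of the range
    return holes


# Six daily slots, of which only Jan 1 and Jan 4 have an execution.
expected = [datetime(2023, 1, 1) + timedelta(days=i) for i in range(6)]
observed = {datetime(2023, 1, 1), datetime(2023, 1, 4)}
holes = find_holes(expected, observed)
```

Here the result is two ranges, Jan 2 to Jan 3 and Jan 5 to Jan 6, each of which would map to one grayed-out bar.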
@Rahul Parundekar ^
I really love this conversation too. Please help us shape the data engineering part of the product.
I think we can organize these efforts in an epic, "Improving Launchplans and Backfill UX". @Dennis O'Brien I created a stub RFC discussion here, please go ahead and edit it as you see fit! Your initial thoughts are great, but feel free to add more detail, ideas, or solution proposals to the problems you outlined.