# flyte-support
c
I am evaluating Flyte, but we have some challenging scheduling requirements. Are these supported? We have a hundred to a few thousand tasks that need to update a database. Each task updates multiple partitions of the database, and the list of partitions (each partition has a clear id) is known when the task is added to the task graph. Only one task may update a partition at a time. Each task needs about half an hour. We want to run as many tasks in parallel as possible.
• Can Flyte schedule tasks such that the mutually exclusive tasks do not run at the same time?
• Is it possible to optimize scheduling by scheduling tasks with the most in-common partitions first?
a
Hey @clever-shampoo-31949
• Can Flyte schedule tasks such that the mutually exclusive tasks do not run at the same time?
IIUC you'd need to set concurrency limits at the launchplan or workflow level. This is being spec'd out as we speak (see RFC and feel free to comment there).
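What exists today at the workflow/execution level is `max_parallelism` on a launch plan, which caps how many task nodes of a single execution run at once; limits across executions are what the RFC covers. A rough sketch, assuming flytekit (the workflow and task names here are just placeholders):

```python
from typing import List

from flytekit import LaunchPlan, task, workflow


@task
def update_partitions(partition_ids: List[str]) -> None:
    ...  # write to the listed database partitions


@workflow
def update_all_partitions() -> None:
    update_partitions(partition_ids=["p1", "p2"])
    update_partitions(partition_ids=["p2", "p3"])


# Cap how many nodes of one execution of this workflow run at the same time.
lp = LaunchPlan.get_or_create(
    workflow=update_all_partitions,
    name="update_all_partitions_lp",
    max_parallelism=25,
)
```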
f
@average-finland-92144 @clever-shampoo-31949 I think for making sure only one task of a kind runs at a time you will have to use cache serialization
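Roughly, cache serialization is enabled per task alongside caching, something like this sketch (the task name and `cache_version` value are arbitrary):

```python
from typing import List

from flytekit import task


@task(cache=True, cache_serialize=True, cache_version="1.0")
def update_partitions(partition_ids: List[str]) -> None:
    # With cache_serialize=True, identical concurrent invocations are
    # serialized: one instance runs, the others wait for it to finish
    # and then reuse the cached result.
    ...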
c
Thanks @average-finland-92144 and @freezing-airport-6809. We want to run as many tasks in parallel as compute allows, so setting a concurrency limit is not what we're after. But the cache serialization thing looks like what we need. (The name is strange though, will read the docs a bit more to understand.)
From the cache serializing docs:
Using this mechanism, Flyte ensures that during multiple concurrent executions of a task only a single instance is evaluated and all others wait until completion and reuse the resulting cached outputs.
So unfortunately, this is not what we need. All our tasks need to run, even if they touch the same database partition.
@average-finland-92144 This could work with a concurrency limit per database partition. I will add a comment to the RFC. (There are many partitions though.)
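In the meantime I could probably express the mutual exclusion myself by chaining tasks that share a partition, so only independent tasks run in parallel. A sketch of the idea, assuming a @dynamic workflow and flytekit's `>>` chaining operator (the greedy "run after the previous writer of each partition" rule is just an illustration, not an optimized schedule):

```python
from typing import List

from flytekit import dynamic, task


@task
def update_partitions(task_id: str, partition_ids: List[str]) -> str:
    ...  # the ~30 minute database update
    return task_id


@dynamic
def schedule_updates(tasks: List[List[str]]) -> None:
    # tasks[i] is the list of partition ids that task i touches.
    last_writer = {}  # partition id -> promise of the last task that touched it
    for i, partitions in enumerate(tasks):
        node = update_partitions(task_id=str(i), partition_ids=partitions)
        for p in partitions:
            if p in last_writer:
                last_writer[p] >> node  # run after the previous writer of p
            last_writer[p] = node
```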
f
That is at the workflow level
There is another mechanism in Flyte - but not documented