# ask-the-community

Nicholas Roberson

05/10/2023, 7:02 PM
Question for the community: is it possible, or has anyone been able, to limit parallel task executions across multiple workflows? For example, `Workflow A` runs `Task B` 10 times. If we run `Workflow A` 100 times, it will run a total of 1,000 `Task B`s. Is there a way to ensure that at most 500 `Task B`s run at any single time? Also, is there a limit to how long a task can sit and wait for a spot to run before the job fails or is dropped? For example, if an instance of `Task B` is the thousandth one, how long can it wait for the ones before it to finish and a spot to open up before it is kicked out, or is there even a limit?

Xinzhou Liu

05/10/2023, 11:07 PM
Similar to my previous question

Samhita Alla

05/11/2023, 6:54 AM
> Is it possible or has anyone been able to achieve limiting of parallel task executions across multiple workflows?
Should be possible with `maxParallelism`: https://docs.flyte.org/en/latest/deployment/configuration/performance.html#worst-case-workflows-poison-pills-max-parallelism
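A minimal flytekit sketch of what setting this on a launch plan might look like (the task and workflow are hypothetical stand-ins for `Task B` / `Workflow A`, and as I understand it `max_parallelism` caps concurrent nodes within a single execution rather than across executions):
```python
from flytekit import LaunchPlan, task, workflow


# Hypothetical stand-ins for "Task B" and "Workflow A".
@task
def task_b(x: int) -> int:
    return x * 2


@workflow
def workflow_a(x: int = 1) -> int:
    return task_b(x=x)


# Launch plan that limits how many task nodes of a single execution of
# workflow_a may be evaluated in parallel.
workflow_a_lp = LaunchPlan.get_or_create(
    workflow=workflow_a,
    name="workflow_a_capped",
    max_parallelism=10,
)
```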
> Also is there a limit to how long a task can sit and wait for a spot to run before the job fails or is dropped?
This is configurable: https://docs.flyte.org/en/latest/deployment/configuration/generated/flyteadmin_config.html#config-defaultdeadlines. As per the https://github.com/flyteorg/flyte/issues/2933 issue, `node-active-deadline` defaults to 48h, but the docs state that the default value is 0s. @Dan Rammer (hamersaw), can you shed some light on this?
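On the client side, flytekit also exposes a per-task `timeout`; a sketch with a hypothetical task is below. This bounds how long the node is allowed to run, and I'm not certain whether time spent waiting in the queue counts toward it, so the admin-side deadlines linked above are still the relevant knobs for that.
```python
from datetime import timedelta

from flytekit import task


# Hypothetical task: fail it if it has not finished within 2 hours.
# Whether queued/wait time counts toward this deadline depends on the
# propeller/admin deadline settings discussed above.
@task(timeout=timedelta(hours=2))
def long_running_step(x: int) -> int:
    return x + 1
```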

Dan Rammer (hamersaw)

05/11/2023, 1:33 PM
Thanks for the ping @Samhita Alla! To limit parallel execution of tasks across multiple workflows, the only construct we currently have is `cache_serialize`, which, given a cacheable task, ensures that only a single instance of the task runs at a time. If two instances are started simultaneously, one will wait until the other is finished and reuse the cached results. It doesn't sound like this is quite what you're looking for, so we would have to think on this - but I suspect an elegant solution could be a bit of work. Regarding the `node-active-deadline`, the previous default value was `48h`, but users with long-running tasks expressed some concern over unexpected terminations, so we decided that this default should be `0s` (meaning unlimited).
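For reference, a minimal sketch of enabling `cache_serialize` in flytekit (the task body and `cache_version` are hypothetical); it only applies to tasks that also have caching enabled:
```python
from flytekit import task


# With cache=True and cache_serialize=True, concurrent invocations of this
# task with identical inputs (and the same cache_version) are serialized:
# one instance runs while the others wait and then reuse its cached output.
@task(cache=True, cache_serialize=True, cache_version="1.0")
def task_b(x: int) -> int:
    return x * 2
```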