Hi all, I'm launching a large number of external subworkflows using reference launch plans. This is causing cluster load issues because each subworkflow gets its own max_parallelism. Is there a way to force the workflows to share one resource limit, or alternatively to override the max parallelism for the subworkflows when I launch them with the reference launch plan?
freezing-airport-6809
05/15/2024, 4:02 PM
I would love to understand what type of load issues you are seeing. Is it just that you have too many pods and the cluster can't keep up because you're using reference launch plans? They are meant to be distinct and isolated units, though you can control the parallelism through the launch plan.
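A rough sketch of what controlling parallelism through the launch plan could look like, assuming the cap is set where the sub-workflow's launch plan is registered. The helper names, the budget arithmetic, and the launch plan name are hypothetical and only for illustration; `LaunchPlan.get_or_create` and its `max_parallelism` argument come from flytekit, which is imported lazily so the budget helper runs without it.

```python
def per_launch_plan_parallelism(total_budget: int, n_subworkflows: int) -> int:
    """Split one overall concurrency budget evenly across sub-workflows.

    Hypothetical helper: each sub-workflow execution applies its own
    max_parallelism, so the combined ceiling is roughly
    n_subworkflows * max_parallelism.
    """
    if total_budget < 1 or n_subworkflows < 1:
        raise ValueError("total_budget and n_subworkflows must be >= 1")
    # Never return less than 1. If there are more sub-workflows than budget,
    # the combined ceiling will exceed total_budget.
    return max(1, total_budget // n_subworkflows)


def build_capped_launch_plan(workflow_fn, name: str, limit: int):
    """Create a launch plan with a max_parallelism cap (requires flytekit)."""
    from flytekit import LaunchPlan  # imported lazily; assumes flytekit is installed

    return LaunchPlan.get_or_create(workflow_fn, name=name, max_parallelism=limit)


# Example: an assumed budget of 200 concurrent task nodes over 40 sub-workflows.
limit = per_launch_plan_parallelism(total_budget=200, n_subworkflows=40)
print(limit)
```

In the project that owns the sub-workflow, something like `build_capped_launch_plan(child_wf, "child_wf_capped", limit)` would then be registered and pointed at by the reference launch plan, where `child_wf` stands in for the actual workflow.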
handsome-airline-36833
05/15/2024, 4:03 PM
Ah it's just some rate limit for fetching container images IIUC
handsome-airline-36833
05/15/2024, 4:04 PM
Unrelated to Flyte as such
freezing-airport-6809
05/15/2024, 4:10 PM
Yes, this can happen. There are a few ways to deal with it:
- Increase the container registry throttle limits.
- We did some tricks to deploy a base image everywhere.
- Also, if you are launching lots of small tasks, Union has a cool way to reuse containers.