<@U06RKAMK0UC> Is there guidance on the maximum am...
# ask-ai
b
@ancient-wolf-19325 Is there guidance on the maximum amount a workflow should scale out using map_task? I'm seeing issues in the 500-1000 task range, and want to know if the team has put out recommendations on maximum scale
a
There is no official hard limit documented for Flyte's map_task fanout, but community reports and issues indicate that scaling beyond 1,000 mapped tasks can lead to failures or panics, largely because of backend storage and Kubernetes array job plugin limitations. For example, attempts to fan out to 10,000 tasks have resulted in runtime errors and system panics, and even 1,000+ tasks can cause issues depending on cluster configuration and etcd storage (Flyte Issue #2739, Flyte Issue #3709). The default maxArrayJobSize and etcd storage constraints are common bottlenecks. The Flyte team recommends map_task for large-scale parallelism but suggests keeping the fanout to a few thousand at most and using dynamic workflows or chunking if you need to go higher. If you are seeing issues at 500-1,000, check your cluster's resource limits and the maxArrayJobSize setting. For very large fanouts, consider breaking the workload into smaller batches or using dynamic workflows as a workaround (Flyte Map Tasks Guide, Flyte Blog). Would you like more detail on configuration tuning or workarounds for large fanouts?
Sources:
- Flyte Issue #2739
- Flyte Issue #3709
- Flyte Map Tasks Guide
- Flyte Blog
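A minimal sketch of the chunking idea mentioned above, in plain Python so it runs without a Flyte installation. The helper name `chunk_inputs` and the batch size of 50 are illustrative assumptions, not Flyte APIs or recommended values. The idea is that each mapped task receives one batch (a list) rather than one item, so 1,000 inputs become 20 mapped tasks.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunk_inputs(items: Sequence[T], batch_size: int) -> List[List[T]]:
    """Split items into consecutive batches of at most batch_size elements.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [
        list(items[start:start + batch_size])
        for start in range(0, len(items), batch_size)
    ]


# 1,000 inputs grouped into batches of 50 (an arbitrary, assumed size).
inputs = list(range(1000))
batches = chunk_inputs(inputs, batch_size=50)

# Each batch would become the input of a single mapped task, which then
# loops over its batch internally.
print(len(batches), len(batches[0]))
```

In a real workflow, the task passed to map_task would accept a list and iterate over it, trading some per-item visibility and retry granularity for a much smaller fanout.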
b
The issues I'm seeing don't appear to be related to the k8s cluster or storage limits - rather the Flyte UI (React Array Buffer issues). Is this consistent with what the community has found? @ancient-wolf-19325
a
Yes, UI issues (such as React Array Buffer errors or unresponsiveness) when displaying large map_task fanouts are a known problem in the Flyte community. The Flyte Console can become laggy or even crash when rendering workflows with hundreds or thousands of nodes, because of the way it loads and displays execution data for each mapped task (Flyte Issue #4636, Flyte Issue #4531). This is separate from backend or Kubernetes resource limits. Recent UI updates have improved performance, and the team has discussed implementing pagination or lazy loading, but for now very large map_task executions can still overwhelm the UI. Would you like more detail on UI workarounds or the current state of Flyte Console improvements?
Sources:
- Flyte Issue #4636
- Flyte Issue #4531