Flyte provides production-grade orchestration for machine learning workflows and data processing, built to take local workflows to production quickly.

Flyte

<https://github.com/flyteorg/flyte/issues/1230|#1230 [Core Feature] Use 'parallelism' in array tasks (and use it to improve performance)>
Issue created by <https://github.com/katrogan|katrogan>
*Motivation: Why do you think this is important?*  
Users can request parallelism to limit the number of concurrent array tasks. It would be nifty if we could use the statically defined parallelism value at run-time to batch up processing of inputs, instead of creating and tearing down a new pod for every invocation of a map task.

*Goal: What should the final outcome look like, ideally?*  
For example, with parallelism == 4 and inputs [1, 2, 3, ..., 15], array task execution could be broken up so that only 4 pods are created, with the inputs batched semi-equally amongst them like so:

pod 0: [1, 5, 9, 13]  
pod 1: [2, 6, 10, 14]  
pod 2: [3, 7, 11, 15]  
pod 3: [4, 8, 12]  
(using a mod distribution)

or  
pod 0: [1, 2, 3, 4]  
pod 1: [5, 6, 7, 8]  
pod 2: [9, 10, 11, 12]  
pod 3: [13, 14, 15]  
(as an alternate allocation)
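
The two allocations above can be sketched in a few lines of Python. This is only an illustration of the batching arithmetic, not flytekit or plugin code: the helper names `batch_round_robin` and `batch_contiguous` are hypothetical, and the sketch assumes each pod would receive its batch as a plain list.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def batch_round_robin(inputs: Sequence[T], parallelism: int) -> List[List[T]]:
    """Mod distribution: the item at index i goes to pod i % parallelism."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return [list(inputs[pod::parallelism]) for pod in range(parallelism)]


def batch_contiguous(inputs: Sequence[T], parallelism: int) -> List[List[T]]:
    """Alternate allocation: contiguous chunks of ceil(len / parallelism) items."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    size = -(-len(inputs) // parallelism)  # ceiling division
    return [list(inputs[pod * size:(pod + 1) * size]) for pod in range(parallelism)]


# Hypothetical example mirroring the issue: parallelism == 4, inputs 1..15.
inputs = list(range(1, 16))
round_robin = batch_round_robin(inputs, 4)
contiguous = batch_contiguous(inputs, 4)

for pod, batch in enumerate(round_robin):
    print(f"pod {pod}: {batch}")
```

One difference worth noting: the mod distribution never differs by more than one item between pods, while the ceiling-sized contiguous chunks can leave trailing pods short or empty (for instance 9 inputs across 4 pods gives batches of 3, 3, 3 and 0).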

*[Optional] Propose: Link/Inline OR Additional context*  
This will require modifying the command line for array tasks in flytekit and pod construction in plugins to handle batching the inputs appropriately.
<https://github.com/flyteorg/flyte|flyteorg/flyte>

<https://github.com/flyteorg/flyte/issues/1230|#1230 [Core Feature] Use 'parallelism' in array tasks (and use it to improve performance)>
Issue closed as not planned by <https://github.com/apps/github-actions|github-actions[bot]>
<https://github.com/flyteorg/flyte|flyteorg/flyte>