# flytekit
## #1230 [Core Feature] Use 'parallelism' in array tasks (and use it to improve performance)

Issue created by katrogan

### Motivation: Why do you think this is important?

Users can request parallelism to limit the number of concurrent array tasks. It would be useful to apply the statically defined parallelism value at run time to batch the processing of inputs, instead of creating and tearing down a new pod for every invocation of a map task.

### Goal: What should the final outcome look like, ideally?

For example, with parallelism == 4 and inputs [1, 2, 3, ..., 15], array task execution could be broken up so that only 4 pods are created, with the inputs batched semi-equally among them like so (using a mod distribution):

- pod 0: [1, 5, 9, 13]
- pod 1: [2, 6, 10, 14]
- pod 2: [3, 7, 11, 15]
- pod 3: [4, 8, 12]

or (as an alternate allocation):

- pod 0: [1, 2, 3, 4]
- pod 1: [5, 6, 7, 8]
- pod 2: [9, 10, 11, 12]
- pod 3: [13, 14, 15]

### Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

### [Optional] Propose: Link/Inline OR Additional context

This will require modifying the command line for array tasks in flytekit and the pod construction in plugins so that the inputs are batched appropriately.

flyteorg/flyte
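
The two allocations described in the goal above can be sketched in a few lines of Python. This is only an illustration of the batching arithmetic, not flytekit or plugin code: the function names `batch_mod` and `batch_contiguous` are hypothetical, and the sketch assumes the inputs are available as an in-memory sequence.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def batch_mod(inputs: Sequence[T], parallelism: int) -> List[List[T]]:
    """Hypothetical helper: mod distribution, item i goes to pod i % parallelism."""
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")
    # Slicing with a step picks every parallelism-th item starting at `pod`.
    return [list(inputs[pod::parallelism]) for pod in range(parallelism)]


def batch_contiguous(inputs: Sequence[T], parallelism: int) -> List[List[T]]:
    """Hypothetical helper: contiguous chunks of ceil(len(inputs) / parallelism) items."""
    if parallelism < 1:
        raise ValueError("parallelism must be >= 1")
    size = -(-len(inputs) // parallelism)  # ceiling division
    return [list(inputs[pod * size:(pod + 1) * size]) for pod in range(parallelism)]


if __name__ == "__main__":
    inputs = list(range(1, 16))  # [1, 2, ..., 15]
    for pod, batch in enumerate(batch_mod(inputs, 4)):
        print(f"mod        pod {pod}: {batch}")
    for pod, batch in enumerate(batch_contiguous(inputs, 4)):
        print(f"contiguous pod {pod}: {batch}")
```

With parallelism == 4 and inputs 1 through 15, both helpers reproduce the pod assignments listed in the issue. One difference worth noting: the mod distribution keeps batch sizes within one item of each other, while ceiling-sized contiguous chunks can leave trailing pods empty when the input count is small relative to the parallelism (for example, 5 inputs across 4 pods).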