Len Strnad
11/02/2023, 1:11 PM
map_task. Sometimes we have many many outputs from a previous task as inputs to map_task, which we have found to be slow on occasion and I think we sometimes see limitations in max number of inputs/outputs (is there a limit?). I have been enjoying using dataframes as outputs/inputs for tasks and was wondering if it would ever make sense to add dataframe input/output support for map_task?
Followup thought: would it ever make sense to create batch support for map_task? For example, a batch size of 100 would mean that a single pod would stay up and iterate over 100 input elements. I suppose this can already easily be accomplished by constructing the tasks/inputs accordingly.
Samhita Alla
map_task
Do you mean a list of dataframe inputs?
> For example, a batch size of 100 would mean that a single pod would stay up and iterate over 100 input elements.
You should be able to accomplish this within a single task, just by looping over the batch.
Len Strnad
11/03/2023, 12:47 PM
Samhita Alla
> I think a list of dataframes works at the moment no?
It has to, yes.
> I mean using a row in a df as an input.
Gotcha. I'm not sure if that's something we can consider as a high priority item. If you're willing to contribute, please feel free to create an issue, and the team will let you know what they think of it.
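A minimal sketch of the batching idea discussed above: split the inputs into batches of 100 and loop over each batch inside one function call. This is plain Python with no Flyte imports. The names make_batches, process_one, and process_batch are hypothetical and are not part of flytekit. In a real workflow, process_batch would presumably be the task handed to map_task, so that one pod handles one batch.

```python
from typing import List


def make_batches(items: List[int], batch_size: int) -> List[List[int]]:
    """Split items into consecutive batches of at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def process_one(x: int) -> int:
    """Hypothetical per-element work; stands in for the real task body."""
    return x * x


def process_batch(batch: List[int]) -> List[int]:
    """Handle a whole batch in one call by looping over its elements.

    Assumption: in Flyte this would be the task that gets mapped, so a
    single pod works through the whole batch instead of one pod per element.
    """
    return [process_one(x) for x in batch]


inputs = list(range(250))
batches = make_batches(inputs, batch_size=100)  # batch sizes: 100, 100, 50

# Plain-Python stand-in for mapping the batch function over the batches,
# then flattening the per-batch results back into one list.
results = [y for batch in batches for y in process_batch(batch)]
```

With this shape, the number of mapped inputs drops from one per element to one per batch, which is the trade-off raised in the question about many inputs being slow.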
Len Strnad
11/03/2023, 1:03 PM