Is there a way to run multiple map tasks in a row without a coalesce/resplit in the middle? My thought is to construct a dataset, chunk it up, build a data module out of each chunk (map task 1), and then run ML inference on each data module (map task 2).
average-finland-92144
04/29/2025, 7:05 PM
AFAICT you should be able to chain map_task 1 to map_task 2
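A rough sketch of what that chaining could look like, assuming flytekit's `task`, `workflow`, and `map_task`. The task names, the chunking logic, and the "data module" and "inference" bodies are hypothetical placeholders, not anything from your project. The `except ImportError` branch is only a stand-in so the sketch runs without Flyte installed; it is not part of flytekit.

```python
from typing import List

try:
    from flytekit import map_task, task, workflow
except ImportError:
    # Stand-ins (NOT flytekit): plain-Python equivalents for local experiments.
    def task(fn):
        return fn

    def workflow(fn):
        return fn

    def map_task(fn):
        def run(**kwargs):
            # Expect exactly one keyword argument holding the list to map over.
            (name, values), = kwargs.items()
            return [fn(**{name: value}) for value in values]
        return run


@task
def make_chunks(n: int, chunk_size: int) -> List[List[int]]:
    # Hypothetical dataset: the integers 0..n-1, split into fixed-size chunks.
    data = list(range(n))
    return [data[i:i + chunk_size] for i in range(0, n, chunk_size)]


@task
def build_module(chunk: List[int]) -> List[int]:
    # Placeholder for "build a data module from one chunk".
    return [x * 2 for x in chunk]


@task
def run_inference(module: List[int]) -> int:
    # Placeholder for "run ML inference on one data module".
    return sum(module)


@workflow
def pipeline(n: int, chunk_size: int) -> List[int]:
    chunks = make_chunks(n=n, chunk_size=chunk_size)
    # Map task 1: one data module per chunk.
    modules = map_task(build_module)(chunk=chunks)
    # Map task 2: consumes the list from map task 1 directly,
    # with no explicit coalesce/resplit task in between.
    return map_task(run_inference)(module=modules)
```

Calling `pipeline(n=6, chunk_size=2)` locally runs both maps over the same three chunks. Note that the second map still waits for the whole first map to finish; the chaining only removes the need for a regrouping task between them.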
average-finland-92144
04/29/2025, 7:07 PM
does that help?
Also, I guess your inference use case can tolerate some latency? Flyte MapTasks, in their current version, still pay a container boot-up penalty.