If your Flyte task is OOM-killed when returning a Pydantic BaseModel object to a dynamic workflow, even with 55Gi of memory requested, the cause may be serialization overhead or a bug in how Flyte handles large or complex objects in dynamic workflows. There are known issues with memory growth and OOMs in dynamic tasks, especially when iterating over or returning large objects, or when using complex types such as Pydantic models or dictionaries as dynamic task inputs or outputs. These can cause excessive memory usage or even infinite loops, leading to OOM kills regardless of how much memory is requested (see "[BUG] Infinite loop and memory growth in dynamic task" and "[BUG] nested dynamic won't bind pydantic models or dictionaries as inputs").
A recommended workaround is to serialize the BaseModel to a JSON string before returning it from the task, and then deserialize it in the downstream dynamic workflow. This sidesteps the issues Flyte's type system has with complex objects in dynamic contexts. If you are already serializing and still see OOMs, the problem may be the sheer size of the object or a bug in Flyte's dynamic workflow handling. Also make sure you are not iterating over promises or over large lists of files or objects inside the dynamic task, as this can cause runaway memory usage (see "[BUG] Infinite loop and memory growth in dynamic task").
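Here is a minimal sketch of the serialization workaround, assuming Pydantic v2 (`model_dump_json` / `model_validate_json`; on Pydantic v1 the equivalents would be `.json()` and `parse_raw()`). The `Report` model and the two function names are hypothetical placeholders, and the functions are left undecorated so the snippet runs without a Flyte installation. In a real project the producer would carry flytekit's `@task` decorator and the consumer would be the `@dynamic` workflow.

```python
from typing import List

from pydantic import BaseModel


class Report(BaseModel):
    # Hypothetical model standing in for the large object being returned.
    name: str
    scores: List[float]


def produce_report() -> str:
    # In Flyte this would be a @task. The declared output type is a plain
    # str, so Flyte only has to pass a string between nodes.
    report = Report(name="run-1", scores=[0.1, 0.2, 0.3])
    return report.model_dump_json()


def consume_report(payload: str) -> int:
    # In Flyte this would be the @dynamic workflow (or a task it calls).
    # The model is rebuilt from the JSON string on the receiving side.
    report = Report.model_validate_json(payload)
    return len(report.scores)


payload = produce_report()
count = consume_report(payload)
```

Since only a `str` crosses the task boundary, Flyte never has to bind the model type itself. If the JSON payload is very large, memory can still be a problem, so this addresses the type-handling issues rather than the object size.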
Would you like more detail or example code for the serialization workaround?
Sources:
- https://github.com/flyteorg/flyte/issues/5164
- https://github.com/flyteorg/flyte/issues/5473