# flyte-support
g
Hey, I've got a workflow where each task is relatively light in compute/data needs, but the DAG itself is heavy. I have a workflow inside a dynamic inside a dynamic: the outer level is a dynamic that creates 50 dynamics, and each of those middle-level dynamics creates 35 workflows. The inner workflow itself fits a relatively simple ML model (think XGBoost). When I run this at larger scales, I often see
[1/1] currentAttempt done. Last Error: USER::Pod was rejected: The node had condition: [DiskPressure…
I've tried bumping the disk space on the node pool to something large, but that does not help. Using a lower max-parallelism helps to some extent, but I'd like these to execute in parallel at scale. Is this a known issue with nested dynamics? Is there something I can improve in my Flyte deployment? Is this something that won't be an issue in Flyte 2.0? This post by @clean-glass-36808 about deserializing dynamic workflows massively increasing CPU usage is possibly related: https://flyte-org.slack.com/archives/CP2HDHKE1/p1753231403202179
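For reference, the nesting described above looks roughly like this (a minimal sketch, assuming flytekit; the task and workflow names are hypothetical and the fan-out counts are illustrative):

```python
import typing
from flytekit import dynamic, task, workflow


@task
def fit_model(partition_id: int) -> float:
    # Placeholder for fitting a relatively simple model (e.g. XGBoost)
    # on one partition of the data.
    return 0.0


@workflow
def fit_wf(partition_id: int) -> float:
    # Inner workflow: fits one model.
    return fit_model(partition_id=partition_id)


@dynamic
def middle_dynamic(group_id: int) -> typing.List[float]:
    # Middle level: each dynamic launches ~35 inner workflows.
    return [fit_wf(partition_id=i) for i in range(35)]


@dynamic
def outer_dynamic() -> typing.List[typing.List[float]]:
    # Outer level: one dynamic fans out into ~50 middle-level dynamics,
    # so the compiled DAG ends up with ~1750 subworkflows.
    return [middle_dynamic(group_id=g) for g in range(50)]
```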
c
The issue that I ran into was purely on the Flyte Propeller (state machine) side of things. I don't think it caused disk pressure since I'm pretty sure everything is done in-memory with no buffering to disk.
g
Interesting. Is there any part of dynamic workflows that would cause disk pressure? The tasks themselves read data in and out, but I've given each task plenty of resources (much more storage than the size of the data). But there might be something else on the Flyte side that is writing to disk.
f
It should not cause disk pressure at all
disk pressure is because of not using ephemeral storage and downloading a lot of data across many containers
I think your node or your Kubernetes node configuration may also be wrong, or you have a very small root volume and are using the root volume for containers - could be many things
Also @gorgeous-caravan-46442, I don't know if the disk pressure can be improved, but your entire nested-dynamics setup can be greatly simplified with Flyte 2
g
Hmm, interesting. Given the workflow, it might be that I'm downloading a lot of data across many containers. Can you suggest what part of the cluster I should bump, so I can see if that works?
Great to hear Flyte 2.0 simplifies it
f
You will have to see how your cluster and container disks are configured
I recommend setting an ephemeral-storage specification on your tasks
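On the task side, that would look roughly like this (a minimal sketch, assuming a recent flytekit where Resources accepts ephemeral_storage; the values are placeholders, not recommendations):

```python
from flytekit import Resources, task


@task(
    requests=Resources(cpu="1", mem="2Gi", ephemeral_storage="10Gi"),
    limits=Resources(cpu="2", mem="4Gi", ephemeral_storage="20Gi"),
)
def fit_model(partition_id: int) -> float:
    # With an ephemeral-storage request set, the scheduler only places this
    # pod on nodes with enough free local disk, and the kubelet can evict
    # this pod instead of the whole node hitting DiskPressure.
    return 0.0
```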
g
is there some specific kubectl setting I can check to see what is currently being used?
f
sadly kubectl does not show disk utilization (even in kubectl top)
but sometimes it does - if you configure it
you can try
kubectl top node <node-name>