Hello everyone! We've been scaling up our workflows recently - Flyte #flyte-support

wide-soccer-37846

11/29/2024, 10:31 AM

Hello everyone! We've been scaling up our workflows recently and running into an ongoing issue with memory usage in the flyte-binary (v1.13.3), so I've been investigating. As the workflows run, the memory of the flyte-binary pod steadily increases, sometimes exceeding its limits and crashing. I think this is expected, and we can try to increase the memory available to mitigate the crashes. However, I noticed that when the workflow finishes, it looks like some memory isn't released, which means no matter how much memory we allocate to the pod, it will eventually crash. If anyone has any workarounds or fixes for this I'd be grateful.

freezing-airport-6809

11/29/2024, 5:16 PM

Is that 40GB?

freezing-airport-6809

11/29/2024, 5:16 PM

That does not make sense

wide-soccer-37846

11/29/2024, 5:37 PM

Yes, this was the other thing I was unsure about: the memory usage seems very high, but I don't really have any context for what it should be.

freezing-airport-6809

12/02/2024, 4:52 AM

Something is wrong; possibly the Prometheus metrics are bloating memory

freezing-airport-6809

12/02/2024, 4:52 AM

Can you turn off metrics and see?

wide-soccer-37846

12/04/2024, 10:45 AM

Does flyte-binary have metrics enabled by default? I haven't enabled them explicitly - the plots above just come from running top pods.
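For anyone wanting to reproduce the measurement, here is a minimal sketch of turning that kind of output into numbers that can be compared before and after a workflow run. It assumes the figures come from `kubectl top pods` and its usual three-column layout (NAME, CPU, MEMORY); the function names and the pod name in the usage note are hypothetical, not anything from Flyte.

```python
import re

# Binary unit suffixes used in Kubernetes memory quantities.
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' to a number of bytes."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti)?", quantity.strip())
    if match is None:
        raise ValueError(f"unrecognised memory quantity: {quantity!r}")
    value, unit = match.groups()
    # A missing suffix means the value is already in bytes.
    return int(value) * _UNITS.get(unit, 1)


def parse_top_pods(output: str) -> dict[str, int]:
    """Map pod name -> memory in bytes (assumes NAME / CPU / MEMORY columns)."""
    usage = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3:
            usage[fields[0]] = parse_memory(fields[2])
    return usage
```

Sampling this once before a workflow starts and again some time after it finishes, then subtracting the two values for the flyte-binary pod (for example a hypothetical `flyte-binary-abc`), gives a rough figure for how much memory was not released.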

freezing-airport-6809

12/05/2024, 12:59 AM

I think it does have them enabled by default

freezing-airport-6809

12/05/2024, 12:59 AM

we should probably turn them off
