We're attempting to debug a Flyte workflow on EKS where our pods complain there isn't enough ephemeral storage. Is it possible to get the pod spec that is issued to Kubernetes? The details on the "Task Details" link look close to what the pod spec would contain, but I wondered whether that's everything.
freezing-airport-6809
06/10/2025, 1:52 PM
You can add more ephemeral storage
freezing-airport-6809
06/10/2025, 1:52 PM
In resources add it
crooked-holiday-38139
06/10/2025, 2:30 PM
We've done that now, and that's gotten us past one hurdle. I suppose the basic iteration loop is this: if something fails on Kubernetes, knowing the pod spec means I can reproduce the pod invocation manually, which is helpful for debugging. Re-running the workflow also works, but having a pod spec makes it easier to hand off to one of my DevOps engineers.
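For the manual-reproduction loop described above, a minimal sketch of a hand-written pod manifest might look like the following. It is not the spec Flyte itself generates; the function name `build_debug_pod`, the image name, the namespace, and the `50Gi` figure are all hypothetical placeholders. The only Kubernetes-specific detail it relies on is the standard `ephemeral-storage` key under a container's `resources.requests` and `resources.limits`.

```python
import json


def build_debug_pod(name, image, command, ephemeral_storage, namespace="default"):
    """Return a minimal Pod manifest (as a dict) for re-running a container by hand.

    This is a hypothetical helper, not part of Flyte or the Kubernetes client.
    """
    # Request and limit are set to the same value so the scheduler only places
    # the pod on a node that can actually provide that much scratch space.
    resources = {
        "requests": {"ephemeral-storage": ephemeral_storage},
        "limits": {"ephemeral-storage": ephemeral_storage},
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Never restart, so a failed container stays around for inspection.
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    "command": list(command),
                    "resources": resources,
                }
            ],
        },
    }


if __name__ == "__main__":
    pod = build_debug_pod(
        name="flyte-debug",                       # placeholder pod name
        image="example.com/pytorch-cuda:latest",  # placeholder image
        command=["sleep", "3600"],                # keep the pod alive to poke around
        ephemeral_storage="50Gi",                 # placeholder size
        namespace="myproject-development",        # placeholder namespace
    )
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pod, indent=2))
```

Saving the printed JSON to a file and passing it to `kubectl apply -f` gives a pod that can be handed to someone else without re-running the whole workflow.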
freezing-airport-6809
06/10/2025, 2:30 PM
Yes this is definitely something that’s coming
freezing-airport-6809
06/10/2025, 2:31 PM
In Flyte v1, sadly, the storage is etcd, which is limited to 1.5 MB per object, so storing things is hard
crooked-holiday-38139
06/10/2025, 2:34 PM
Fair enough. Ultimately we've now become more familiar with intercepting the pod scheduling, so we've been able to determine roughly how it works, but part of the game is catching it before the resource gets deleted 🙂
The basic issue we were having was trying to launch a PyTorch with CUDA image that weighs about 8 GB on ephemeral storage that was too small (~20 GB).
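One way to catch the pod spec before the resource is deleted is to dump it with `kubectl` while the task is still running. The sketch below only wraps standard `kubectl get` flags. It assumes, without confirmation from this thread, that the task pods carry an `execution-id` label and live in a namespace you already know; the helper names and the example values are hypothetical.

```python
import subprocess


def pod_spec_command(namespace, execution_id, watch=False):
    """Build a kubectl command that prints matching pods as YAML.

    Assumption: pods are labelled `execution-id=<id>`; adjust the selector
    to whatever labels your pods actually have.
    """
    cmd = [
        "kubectl", "get", "pods",
        "--namespace", namespace,
        "--selector", f"execution-id={execution_id}",
        "--output", "yaml",
    ]
    if watch:
        # Keep streaming pod objects as they change instead of returning once.
        cmd.append("--watch")
    return cmd


def capture_pod_specs(namespace, execution_id, out_path):
    """Run kubectl once and save the YAML so it survives pod deletion."""
    result = subprocess.run(
        pod_spec_command(namespace, execution_id),
        check=True,           # raise if kubectl exits non-zero
        capture_output=True,
        text=True,
    )
    with open(out_path, "w") as fh:
        fh.write(result.stdout)
    return out_path


if __name__ == "__main__":
    # Placeholder values; printing the command avoids needing cluster access here.
    print(" ".join(pod_spec_command("myproject-development", "abc123", watch=True)))
```

Calling `capture_pod_specs` in a loop, or running the printed command with `--watch` in a terminal, keeps a copy of the spec around after the pod itself has gone.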