We're running into `terminated with exit code (137...
# ask-the-community
We're running into `terminated with exit code (137). Reason [OOMKilled].` pod failures when trying to run simple hello world tasks. We have flyte-binary deployed via its helm chart. I see the task pod is being spun up with a default memory limit of 200mb. I'm having trouble tracking down how to set the default pod resource spec to adjust the memory limit of the pod.
you can use pod template yeah but i think that’s not the root cause here - how is your flyte deployment configured? you’re using the flyte-binary helm chart?
the old charts had task resources specified
but we removed that in the new chart
basically those would’ve ended up here in the admin config
but we should’ve removed that in the new chart.
you can add that back in the “inline” section of the new chart which shows up here
since for single binary we now have to specify which of the flyte components config entries are for
Hi @Yee, yes we're using the flyte-binary chart. I tried adding the task_resources object to the inline object in the values.yaml:
  inline:
    task_resources:
      defaults:
        cpu: 100m
        memory: 100Mi
        storage: 100Mi
      limits:
        memory: 1Gi
this appears to have set the defaults appropriately, but the memory limit doesn't appear to be set to 1Gi and instead just gets set to 100Mi to match the default request.
Ideally I would be able to set lower CPU/Memory requests compared to their limits
actually a bit confused… you didn’t have to add the […] key to the config?
No, when I did that it didn't work. Only worked when added like I showed above
i see
so about the other issue…
Default values get injected as the task requests and limits when a task definition omits a specific resource. Limit values are only used as validation. Neither a task request nor limit can exceed the limit for a resource type.
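To make the rule above concrete, here is a rough sketch of the container resources one would expect for a task that declares no resources of its own, assuming the platform defaults from this thread (`cpu: 100m`, `memory: 100Mi`) and a platform memory limit of `1Gi`. It is an illustration of the described behavior, not output captured from a real pod.

```yaml
# Hypothetical resulting container spec for a task with no resources declared.
# Assumes platform defaults of cpu: 100m / memory: 100Mi and a memory limit of 1Gi.
resources:
  requests:
    cpu: 100m       # injected from the platform default
    memory: 100Mi   # injected from the platform default
  limits:
    cpu: 100m       # also the default, since the task omitted it
    memory: 100Mi   # also the default; the 1Gi value only caps what a task may ask for
```

Under this reading, getting a request lower than the limit means the task definition itself has to declare both values (in flytekit, through the `requests` and `limits` arguments of `@task`), each at or below the platform limit.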
AH I see.
On a side note, any idea how I can reduce task latency? After resolving the memory request issue, these very simple hello world tasks take ~2 mins to complete. I can see the pod gets spun up and completes, but Flyte appears to have some built-in latency in marking a task complete.
i’m gonna queue @Dan Rammer (hamersaw) on this one.
dan is there a beginner’s guide to flyte tuning somewhere?
i've tried lowering both workflow-reeval-duration and downstream-eval-duration, but a task, even run on its own, still takes ~2 minutes no matter what
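For anyone following along, the two settings mentioned here are FlytePropeller options. A minimal sketch of where they might sit in the flyte-binary chart values, assuming they go under a `propeller` key inside the same `inline` section used earlier in this thread; the `5s` values are placeholders, not recommendations:

```yaml
# Sketch only: placement under inline.propeller is an assumption,
# and the durations are placeholder values.
inline:
  propeller:
    workflow-reeval-duration: 5s
    downstream-eval-duration: 5s
```

As noted in the message above, lowering these did not change the ~2 minute runtime in this case, so this only shows where the settings would live.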
I'm going to spin this q up in another thread since it veered off the original ask. Thank you for the help!
@Taylor Stout can you direct me to the new thread? Would be happy to discuss!
Sure. I just tagged you in a new thread.