Hi there, my pods are getting killed with OOM. I t...
# flyte-support
s
Hi there, my pods are getting killed with OOM. I tried increasing the limits, but it still defaults to the presets. I am trying this toy example:
from flytekit import task, Resources, workflow


@task(
    requests=Resources(
        cpu="2",
        mem="0.5Gi",
    ),
    limits=Resources(
        cpu="2",
        mem="0.5Gi",
    ),
)
def foo():
    print('task')

@workflow
def my_wf():
    foo()
    foo().with_overrides(
        requests=Resources(
            cpu="1",
            mem="2Gi",
        ),
        limits=Resources(
            cpu="1",
            mem="4Gi",
        ),
    )
Here is the output of limits on the pod:
NAMESPACE                 POD                                             CONTAINER                      MEM_REQ   MEM_LIM   CPU_REQ   CPU_LIM
flyte                     flyte-backend-flyte-binary-548f5d59fc-ln6q8     flyte                          <none>    <none>    <none>    <none>
flytesnacks-development   azpd24c4qpc2w2jlqvhz-n0-0                       azpd24c4qpc2w2jlqvhz-n0-0      512Mi     512Mi     2         2
flytesnacks-development   azpd24c4qpc2w2jlqvhz-n1-0                       azpd24c4qpc2w2jlqvhz-n1-0      1Gi       1Gi       1         1
Here is the task resource config:
task_resources:
      defaults:
        cpu: 500m
        memory: 10Gi
As you can see, I am setting a memory default of 10Gi, but the pod is only allocated a maximum of 1Gi, which I believe is the default. Is there somewhere else I need to update?
d
did you use the latest propeller?
it was fixed 4 months ago, I think
upgrade your propeller's image to the latest version.
s
Oh. let me check. Thanks
Han-Ru, I am using flyte-binary, and I have installed the latest version, using Helm Chart, as specified in the “Hard Way”.
Screen Shot 2024-09-17 at 8.49.25 AM.png
d
cc @average-finland-92144, can you help him use the latest flytepropeller image? I haven't had experience with the "Hard Way".
I think you have to update your propeller's deployment to the latest
after you set your config, did you restart your propeller?
s
Hi Han-Ru, I believe I solved it. The issue was that in the resource config we need to specify both the defaults and the limits; otherwise the setting is ignored. Like so:
task_resources:
      defaults:
        cpu: 500m
        memory: 10Gi
      limits:
        cpu: 500m
        memory: 100Gi
Thanks for your help.
a
@salmon-flower-36598 uh good catch. Time to update that section of the guide. In the task resource configuration itself, I don't think specifying limits should be needed, as by default Flyte should try to do `requests=limits`, and that's better for the K8s scheduler
s
David, I tried that, but Flyte is not respecting `requests=limits` and is reverting to the defaults. Without the override, it says flyteadmin needs to adjust the limits
a
right, I mean, under `task_resources` it seems you need to set both. After doing so, what's the behavior of just setting `requests` in the `@task` decorator?
s
It simply throws this error: Details: Requested CPU limit [2] is greater than current limit set in the platform configuration [500m]. Please contact Flyte Admins to change these limits or consult the configuration
a
got it, thanks 🙂
d
you have to make the CPU limit in the admin's config at least as large as the limit you request
that's how it works, as far as I remember
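The check behind that error can be pictured roughly as follows. This is a minimal sketch in plain Python, not flyteadmin's actual code: `parse_quantity` and `check_limit` are hypothetical helpers, and the parser only handles the suffixes that appear in this thread (`m`, `Mi`, `Gi`).

```python
from decimal import Decimal

# Multipliers for the Kubernetes-style suffixes used in this thread.
# Hypothetical helper table; real Kubernetes quantities support more suffixes.
_SUFFIXES = {
    "Gi": Decimal(2) ** 30,
    "Mi": Decimal(2) ** 20,
    "m": Decimal(1) / Decimal(1000),
}


def parse_quantity(text: str) -> Decimal:
    """Convert strings such as "500m", "2" or "0.5Gi" to a plain number."""
    for suffix, factor in _SUFFIXES.items():
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * factor
    return Decimal(text)


def check_limit(name: str, requested: str, platform_limit: str) -> None:
    """Raise if the requested limit exceeds the platform-wide limit."""
    if parse_quantity(requested) > parse_quantity(platform_limit):
        raise ValueError(
            f"Requested {name} limit [{requested}] is greater than current "
            f"limit set in the platform configuration [{platform_limit}]."
        )


# Values from the thread: the task asks for 2 CPUs, the platform allows 500m.
try:
    check_limit("CPU", "2", "500m")
except ValueError as err:
    print(err)

# A platform limit at least as large as the requested value passes the check.
check_limit("CPU", "2", "2")
check_limit("memory", "4Gi", "100Gi")
```

Under this model, a task limit equal to the platform limit is accepted; only a strictly greater value is rejected, which matches the "greater than" wording of the error.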
a
I think we need to document this better. Looking at this old thread, seems like the difference between `defaults`, `requests` and `limits` is not well explained
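One way to picture the three settings, as they come across in this thread: platform `defaults` fill in for a task that requests nothing, `requests` and `limits` on the task describe what a single container asks for, and the platform `limits` are a ceiling on what any task may ask for. The sketch below is a toy model under those assumptions, not Flyte's actual resolution logic; `resolve` is a hypothetical function, and memory is given as plain GiB numbers to keep it short.

```python
from typing import Optional, Tuple


def resolve(
    task_request: Optional[float],
    task_limit: Optional[float],
    platform_default: float,
    platform_limit: float,
) -> Tuple[float, float]:
    """Return the (request, limit) pair a container would end up with."""
    # No request on the task: fall back to the platform default.
    request = task_request if task_request is not None else platform_default
    # No limit on the task: assume limit == request, as suggested in the thread.
    limit = task_limit if task_limit is not None else request
    # The platform limit is a ceiling; anything above it is rejected.
    if limit > platform_limit:
        raise ValueError(f"limit {limit} exceeds platform limit {platform_limit}")
    if request > limit:
        raise ValueError(f"request {request} exceeds limit {limit}")
    return request, limit


# Memory in GiB, using the config from the thread (default 10, limit 100).
print(resolve(None, None, 10, 100))  # task sets nothing
print(resolve(2, 4, 10, 100))        # task sets both, as in the override
print(resolve(2, None, 10, 100))     # task sets only the request
```

In this model, leaving the platform `limits` unset is not represented at all; the thread suggests that in that case the configured `defaults` were ignored, which is the gap the documentation would need to cover.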