# ask-the-community
r
Hey folks, anyone know how to override task default and limit resources using the flyte-binary helm chart with eks-starter.yaml?
I am getting this error pretty much
"Requested MEMORY limit [2Gi] is greater than current limit set in the platform configuration [1Gi]. Please contact Flyte Admins to change these limits or consult the configuration"}"
k
The defaults are in the backend config and in the helm chart. But you can also update them for a project using flytectl: https://docs.flyte.org/projects/flytectl/en/latest/gen/flytectl_update_task-resource-attribute.html
r
I think there is a bug with that https://github.com/flyteorg/flyte/issues/3065
I ran it but the limit still shows at 1Gi
flytectl get task-resource-attribute -p flytesnacks -d development
{"project":"flytesnacks","domain":"development","defaults":{"cpu":"2","memory":"2Gi"},"limits":{"cpu":"4","memory":"8Gi"}}
Then using pyflyte run --remote ...
debug_error_string = "UNKNOWN:Error received from peer ipv6:%5B::1%5D:8089 {grpc_message:"Requested MEMORY limit [2Gi] is greater than current limit set in the platform configuration [1Gi]. Please contact Flyte Admins to change these limits or consult the configuration", grpc_status:3, created_time:"2023-02-24T15:51:00.19866+01:00"}"
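For context on the error above: the requested memory limit (2Gi) appears to be checked against a platform-wide limit (1Gi in the message), not against the project/domain attribute (8Gi). Here is a minimal Python sketch of that comparison, assuming Kubernetes-style quantity suffixes. `parse_quantity` and `exceeds` are hypothetical helpers written for illustration, not flyteadmin's actual code.

```python
import json
from decimal import Decimal

# Multipliers for Kubernetes-style quantity suffixes (binary and decimal).
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20,
    "Gi": Decimal(2) ** 30, "Ti": Decimal(2) ** 40,
    "m": Decimal("0.001"), "k": Decimal(10) ** 3,
    "M": Decimal(10) ** 6, "G": Decimal(10) ** 9, "T": Decimal(10) ** 12,
}


def parse_quantity(text):
    """Convert a quantity such as '2Gi', '500Mi', '200m' or '4' to a Decimal."""
    text = str(text).strip()
    # Check two-letter suffixes before one-letter ones.
    for suffix in sorted(_SUFFIXES, key=len, reverse=True):
        if text.endswith(suffix):
            return Decimal(text[: -len(suffix)]) * _SUFFIXES[suffix]
    return Decimal(text)


def exceeds(requested, limit):
    """Return True when the requested quantity is larger than the limit."""
    return parse_quantity(requested) > parse_quantity(limit)


# Output of `flytectl get task-resource-attribute` quoted in the thread.
attribute = json.loads(
    '{"project":"flytesnacks","domain":"development",'
    '"defaults":{"cpu":"2","memory":"2Gi"},'
    '"limits":{"cpu":"4","memory":"8Gi"}}'
)

requested = "2Gi"        # memory limit requested by the task
platform_limit = "1Gi"   # value reported in the error message
project_limit = attribute["limits"]["memory"]

print("exceeds project/domain limit:", exceeds(requested, project_limit))
print("exceeds platform limit:", exceeds(requested, platform_limit))
```

The request fits within the project/domain limit but not within the platform limit, which matches the behaviour reported here: changing the attribute with flytectl alone does not clear the error.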
k
Please delete the default
r
flytectl get task-resource-attribute -p flytesnacks -d development
{"project":"flytesnacks","domain":"development","limits":{"cpu":"4","memory":"8Gi"}}
Still getting the same error as before
looks like project/domain resources don't overwrite the default that lives somewhere I can't find
k
Did you restart the backend pod?
r
I can try, but is that just a workaround?
Yep, I did. It isn't working.
Please let me know if there is a way to control the global default limits from the flyte-binary chart
k
I meant in the Flyte-binary helm chart
r
This is what I used
kubectl -n flyte scale deployment flyte-backend-flyte-binary --replicas=0
# Wait 1 minute
kubectl -n flyte scale deployment flyte-backend-flyte-binary --replicas=1
I don't think flytectl changes anything in the helm chart itself...
k
You are right, give me a moment, I am trying to find it. The problem is that it is not shown in the default values. It is absolutely possible, but I want to show you an example
r
Yes, please, anything that would work
k
r
Is there a way for me to change it for the flyte-binary chart?
k
There is. Since I have not played much with the binary helm chart, bear with me
r
absolutely, no worries, I appreciate you working with me on this either way
k
No we should - I actually wanted no defaults - by default 😀
r
Yeah, default limits are limiting...
k
Now you got it 😝
Jokes aside sorry for the trouble
r
no worries
So would that look something like this
configuration:
  task_resource_defaults:
    task_resources:
      defaults:
        cpu: 1
        memory: 2Gi
      limits:
        cpu: 4
        memory: 8Gi
?
k
I think so - not on my computer else would have tried
r
ok one sec I can try now
k
It’s still early for me - have not had my morning coffee
Go for it
If not then someone better will help in a bit
r
no worries at all I can still test it out and let you know then find out later, just testing things out not in a hurry
j
hi @Reda Oulbacha
it should be under configuration.inline
r
sweet, I'm about to try, trying to fix a few issues along the way
j
now that im at a computer:
configuration:
  inline:
    task_resource_defaults:
      task_resources:
        defaults:
          cpu: 1
          memory: 2Gi
        limits:
          cpu: 4
          memory: 8Gi
r
yep I have the same thing here... ran a helm upgrade flyte-backend ... but the changes weren't reflected, even though the deployment restarted
j
you still have the same error?
ah my bad
r
so I'm looking at the logs of the service and I see this
2023/02/24 16:39:13 /go/pkg/mod/gorm.io/driver/postgres@v1.2.3/migrator.go:106
[0.495ms] [rows:1] SELECT count(*) FROM pg_indexes WHERE tablename = 'artifacts' AND indexname = 'artifacts_dataset_uuid_idx' AND schemaname = CURRENT_SCHEMA()
{"metrics-prefix":"flyte:","certDir":"/var/run/flyte/certs","localCert":true,"listenPort":9443,"serviceName":"flyte-backend-flyte-binary-webhook","servicePort":443,"secretName":"flyte-backend-flyte-binary-webhook-secret","secretManagerType":"K8s","awsSecretManager":{"sidecarImage":"docker.io/amazon/aws-secrets-manager-secret-sidecar:v0.1.4","resources":{"limits":{"cpu":"200m","memory":"500Mi"},"requests":{"cpu":"200m","memory":"500Mi"}}},"vaultSecretManager":{"role":"flyte","kvVersion":"2"}}
j
configuration:
  inline:
    task_resources:
      defaults:
        cpu: 1
        memory: 2Gi
      limits:
        cpu: 4
        memory: 8Gi
try that
r
oh ok one sec
j
guilty of copy pasta 🙂
r
configuration:
...
  inline:
    task_resources:
      defaults:
        cpu: 1
        memory: 2Gi
      limits:
        cpu: 4
        memory: 8Gi
I have that but...
one moment
r
It looks like it worked
When I inspect the pod I see
Limits:
  cpu:     1
  memory:  2Gi
Requests:
  cpu:     1
  memory:  2Gi
thank you!!!
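To recap the working layout: the block that took effect sits at configuration.inline.task_resources, not under task_resource_defaults. Below is a small Python sketch that renders that override as YAML text using only the standard library. `to_yaml` is a hypothetical helper for illustration and only handles nested dicts of plain scalars.

```python
def to_yaml(mapping, indent=0):
    """Render a nested dict of scalars as block-style YAML lines."""
    lines = []
    pad = " " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            # Nested mapping: emit the key, then its children two spaces deeper.
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 2))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Same values as the override that worked in the thread.
overrides = {
    "configuration": {
        "inline": {
            "task_resources": {
                "defaults": {"cpu": 1, "memory": "2Gi"},
                "limits": {"cpu": 4, "memory": "8Gi"},
            }
        }
    }
}

values_text = "\n".join(to_yaml(overrides)) + "\n"
print(values_text)
```

The printed text can be saved to a values file and passed to helm upgrade with -f, alongside the rest of the chart values.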
j
👍
r
I tried to run the hello world workflow after deploying flyte-binary as per the doc, and ran into OOMKilled, then tried to increase the resources with the @task decorator, and then ran into this resource limit
I feel like everyone trying to go the same path as me will end up here
j
indeed. i think the fix here is to not have flyte inject a default value in code if not specified in the configuration?
r
Yes that would be amazing!
k
Yes this was indeed the idea - cc @Yee / @Eduardo Apolinario (eapolinario) we should do this.
y
yeah this was the idea. we will look into this
k
@Yee currently flyteadmin has some hardcoded defaults
@Ketan (kumare3) can you take a look?
the only thing is that when you launch a pod manually via kubectl and you don’t pass resources, you get resources: {}
I’m only like 99% sure that 0s are equivalent to {}
b
@jeev https://flyte-org.slack.com/archives/CP2HDHKE1/p1677257099888829?thread_ts=1677247518.480459&cid=CP2HDHKE1 <- this no longer works for me. i had this in a shell script whenever i startup a flytectl demo sandbox, and i’m now getting:
Error from server (NotFound): configmaps "sandbox-flyte-binary-config" not found
and when checking:
(flyte) briant@Ashriels-MacBook-Pro flyte-test % k get cm -n flyte                  
NAME                                             DATA   AGE
flyte-sandbox-cluster-resource-templates         1      16m
flyte-sandbox-config                             5      16m
flyte-sandbox-docker-registry-config             1      16m
flyte-sandbox-extra-cluster-resource-templates   0      16m
flyte-sandbox-extra-config                       0      16m
flyte-sandbox-proxy-config                       1      16m
kubernetes-dashboard-settings                    0      16m
kube-root-ca.crt                                 1      16m
(flyte) briant@Ashriels-MacBook-Pro flyte-test % k get deployment -n flyte
NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
flyte-sandbox-proxy                  1/1     1            1           17m
flyte-sandbox-kubernetes-dashboard   1/1     1            1           17m
flyte-sandbox-docker-registry        1/1     1            1           17m
flyte-sandbox-minio                  1/1     1            1           17m
flyte-sandbox                        1/1     1            1           17m
i suppose once this gets merged and released, we wouldn’t have to manually increase the resource limits anymore
j
right. it’s just a name change @Brian Tang. you want flyte-sandbox-config, which is the new name for sandbox-flyte-binary-config. but also, if you are using the latest sandbox and flytectl, there is a much easier way now to do this. you can keep a config file at ~/.flyte/sandbox/config.yaml that will get loaded at sandbox initialization. basically move the contents of 100-overrides.yaml from the snippet above into ~/.flyte/sandbox/config.yaml, then run “flytectl demo reload” once if your sandbox is already up, or “flytectl demo start” otherwise