#ask-the-community

Uria Franko

09/12/2023, 8:59 AM
Hey guys, following the deployment guide I got to the point of having a proper cluster with the Flyte console running. The nodes (created using the guide) have a flyte.org/node-role=worker taint, but the pods scheduled by Flyte don't tolerate it… How can I make the Flyte pods include that toleration?

Ketan (kumare3)

09/12/2023, 1:12 PM
Is this for execution pods?
Which guide are you using?

Uria Franko

09/12/2023, 1:13 PM
I'm using this guide: https://github.com/unionai-oss/deploy-flyte/tree/main/environments/aws
Fixed it by adding:
default-tolerations:
          - key: 'flyte.org/node-role'
            operator: 'Equal'
            value: 'worker'
            effect: 'NoSchedule'

Quentin Chenevier

10/02/2023, 1:15 PM
@Uria Franko I'm facing the same issue: jobs requesting a GPU do not trigger autoscaling, with messages such as:
pod didn't trigger scale-up: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {flyte.org/node-role: worker}, that the pod didn't tolerate
I'm trying to figure out where I should put this additional config. Is it in the Helm chart values? Did you use the syntax suggested in this doc page (which uses resources-toleration instead of default-tolerations)? It would be awesome if you could share a bigger YAML snippet 🙂
Found it:
configuration:
  inline:
    plugins:
      k8s:
        default-tolerations:
          - key: 'flyte.org/node-role'
            operator: 'Equal'
            value: 'worker'
            effect: 'NoSchedule'
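
For anyone landing here later, a rough sketch of what this should look like on the Kubernetes side, assuming the worker nodes carry the taint from the scale-up message above (flyte.org/node-role=worker with effect NoSchedule). With default-tolerations set, the execution pods Flyte creates should end up with a matching entry in their spec. The fragment below is illustrative only, not copied from a real cluster:

```yaml
# Illustrative fragment of an execution pod spec (assumed, not real output).
# The toleration has to match the node taint on key, value and effect
# for the scheduler and the cluster autoscaler to consider those nodes.
spec:
  tolerations:
    - key: flyte.org/node-role
      operator: Equal
      value: worker
      effect: NoSchedule
```

One way to check is `kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.tolerations}'` on a task pod (pod name and namespace are placeholders), and `kubectl describe node <node-name>` to see the taints on the node side.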