Hey guys, Following deployment guide I got to a po...
# ask-the-community
u
Hey guys, following the deployment guide I got to a point with a proper cluster and the Flyte console running, but the nodes (created by the guide) carry a flyte.org/node-role=worker taint and the pods scheduled by Flyte don't include a matching toleration… How can I make the Flyte pods include that toleration?
k
Is this for execution pods?
Which guide are you using?
u
I'm using this guide: https://github.com/unionai-oss/deploy-flyte/tree/main/environments/aws. Fixed it by adding:
default-tolerations:
  - key: 'flyte.org/node-role'
    operator: 'Equal'
    value: 'worker'
    effect: 'NoSchedule'
q
@Uria Franko I'm facing the same issue: jobs requesting a GPU do not trigger autoscaling, with messages such as:
pod didn't trigger scale-up: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {flyte.org/node-role: worker}, that the pod didn't tolerate
I'm trying to figure out where I should put this additional config. Is it in the Helm chart values? Did you use the syntax suggested in this doc page (which uses resource-tolerations instead of default-tolerations)? It would be awesome if you could share a bigger YAML snippet 🙂
Found it:
configuration:
  inline:
    plugins:
      k8s:
        default-tolerations:
          - key: 'flyte.org/node-role'
            operator: 'Equal'
            value: 'worker'
            effect: 'NoSchedule'
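For anyone wondering why the snippet above gets rid of the "had taint ... that the pod didn't tolerate" message, here is a rough Python sketch of the Kubernetes taint/toleration matching rule. It is only an illustration, not Flyte or scheduler code: the function names `tolerates` and `pod_tolerates_node` are made up, and the taint's `NoSchedule` effect is an assumption inferred from the toleration in this thread.

```python
def tolerates(toleration: dict, taint: dict) -> bool:
    """Return True if one toleration matches one taint (Kubernetes rules)."""
    operator = toleration.get("operator", "Equal")  # Equal is the default operator
    key = toleration.get("key", "")
    effect = toleration.get("effect", "")

    # An empty effect on the toleration matches any taint effect.
    if effect and effect != taint.get("effect"):
        return False
    # An empty key is only valid with Exists and then matches every key.
    if not key:
        return operator == "Exists"
    if key != taint.get("key"):
        return False
    if operator == "Exists":
        return True
    return operator == "Equal" and toleration.get("value", "") == taint.get("value", "")


def pod_tolerates_node(tolerations: list, taints: list) -> bool:
    """A pod can be scheduled only if every hard taint is tolerated."""
    # PreferNoSchedule is a soft preference, so it never blocks scheduling.
    hard = [t for t in taints if t.get("effect") in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, taint) for tol in tolerations) for taint in hard)


# Taint from the error message; the NoSchedule effect is assumed.
worker_taint = {"key": "flyte.org/node-role", "value": "worker", "effect": "NoSchedule"}
# Toleration injected by default-tolerations in the config above.
default_toleration = {
    "key": "flyte.org/node-role",
    "operator": "Equal",
    "value": "worker",
    "effect": "NoSchedule",
}

print(pod_tolerates_node([], [worker_taint]))                    # pod without the toleration
print(pod_tolerates_node([default_toleration], [worker_taint]))  # pod with the toleration
```

With `operator: 'Equal'` the key, value, and effect all have to line up with the node's taint, so a typo in the value (for example `workers` instead of `worker`) would leave the pods unschedulable.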