Hey guys,
Following the deployment guide, I got to the point of having a proper cluster with the Flyte console running. However, the nodes (set up per the guide) carry the flyte.org/node-role=worker taint, but the pods scheduled by Flyte don't include a matching toleration… How can I make the Flyte pods include that toleration?
@hallowed-autumn-63270 I'm facing the same issue: jobs requesting a GPU do not trigger autoscaling, with messages such as:
pod didn't trigger scale-up: 3 node(s) didn't match Pod's node affinity/selector, 3 node(s) had taint {flyte.org/node-role: worker}, that the pod didn't tolerate
I'm trying to figure out where I should put this additional config. Is it in the Helm chart values?
Did you use the syntax suggested in this doc page (which uses resources-toleration instead of default-tolerations)?
It would be awesome if you could share a bigger yaml snippet 🙂
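For reference, here is a rough sketch of the kind of snippet I have in mind, not a confirmed working config. It assumes the flyte-core Helm chart, where (as far as I know) the propeller k8s plugin settings live under configmap.k8s.plugins.k8s, and it assumes the node taint effect is NoSchedule; check the actual effect with kubectl describe node. Also, as far as I know the per-resource key is spelled resource-tolerations.

```yaml
# Helm values sketch. Assumption: flyte-core chart layout; other charts may
# nest the k8s plugin config under a different path.
configmap:
  k8s:
    plugins:
      k8s:
        # Tolerations added to every pod that Flyte launches.
        default-tolerations:
          - key: flyte.org/node-role
            operator: Equal
            value: worker
            effect: NoSchedule  # assumption: must match the effect of the node taint
        # Alternative: resource-tolerations, keyed by resource name
        # (e.g. nvidia.com/gpu), adds tolerations only to pods that
        # request that resource.
```

Note that the message above also says the nodes didn't match the pod's node affinity/selector, so a toleration alone may not be enough for the GPU case.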