# flyte-support
b
Hey everyone. We've had an issue recently due to a large dropoff in availability of the main GPUs we were using on GCP, the L4s. We haven't been able to get any tasks that need them to start up. I haven't been able to find a good way to change which machine type to request for a given task without re-registering the workflow. Are there any features that could help dynamically change machine type requests if a task isn't starting due to availability issues? Or what resources are there for trying to do something like that?
f
this does not exist as a feature - you will have to re-register. Another option is to change the config on the backend
just change the node selectors etc. - for example from L4 to T4
b
Ok, gotcha. Yeah, we're using node-pool selectors right now. Sounds like we'll probably just have to use smaller GPUs until L4s are more available
f
ya just switch in propeller
much easier
b
What does that look like? That's a different way from using pod templates?
f
hmm there are pod templates, but also there is a gpu switch
are you using accelerator?
b
I tried using them, and I remember I had issues getting it to work correctly, so I moved to using
node_selector={"cloud.google.com/gke-nodepool": "<node-pool-name>"},
instead. I wish I remembered why it didn't work, I can't find my note on it right now
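
For reference, a minimal sketch of how the node-pool selector above could be chosen from one place, so that moving from L4 to T4 is a one-line change. The pool names (`gpu-l4-pool`, `gpu-t4-pool`), the `TASK_GPU_TYPE` environment variable, and the `gpu_node_selector` helper are all hypothetical and not part of Flyte. The value is resolved when the code is loaded, so as noted earlier in the thread a switch would still mean re-registering.

```python
import os
from typing import Dict, Optional

# Hypothetical node-pool names; replace with the pools in your own GKE cluster.
GPU_NODE_POOLS: Dict[str, str] = {
    "l4": "gpu-l4-pool",
    "t4": "gpu-t4-pool",
}


def gpu_node_selector(gpu_type: Optional[str] = None) -> Dict[str, str]:
    """Build a GKE node-pool selector for the requested GPU type.

    Falls back to the (made-up) TASK_GPU_TYPE environment variable,
    and to "l4" if neither is set.
    """
    chosen = (gpu_type or os.environ.get("TASK_GPU_TYPE", "l4")).lower()
    if chosen not in GPU_NODE_POOLS:
        raise ValueError(f"no node pool configured for GPU type {chosen!r}")
    return {"cloud.google.com/gke-nodepool": GPU_NODE_POOLS[chosen]}
```

The returned dict has the same shape as the `node_selector={...}` argument in the message above, so it could be passed in its place, e.g. `node_selector=gpu_node_selector("t4")`.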