New RunLLM response generated:
# Question
Hi Flyte community, we are moving our installation to a multi-cluster setup and utilising GPUs. In this case, not all clusters may be built equally for a variety of reasons. How would Flyte work in these instances:
1. Is Flyte able to discern which cluster a workflow should be executed on without explicit configuration? For example, when only 1 of n clusters has GPUs enabled or available.
2. Is it possible to configure a preferred cluster for a specific resource? For example, GPUs are available on multiple clusters and one cluster is preferred for scheduling, but if that cluster fails or is unavailable, the workload can be scheduled on another cluster.
# Answer
1. Flyte can discern which cluster a workflow should be executed on using execution cluster labels. These labels can be set to direct specific workflows or tasks to clusters with the required resources, such as GPUs. If no label matches, a cluster is randomly chosen unless a default execution label is configured (source).
2. You can configure a preference for a specific cluster by setting execution cluster labels with weights. This allows you to specify a preferred cluster for certain resources, and if that cluster is unavailable, the workload can be scheduled on another cluster (source).
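To make the idea of weighted labels concrete, here is a minimal Python sketch of weighted cluster selection with fallback. It is a toy model of the behavior described above, not Flyte's implementation: the `LABEL_CLUSTER_MAP` dictionary, the cluster IDs, the weights, and the `pick_cluster` helper are all hypothetical names chosen for illustration, and whether Flyte skips unavailable clusters automatically should be verified against the linked sources.

```python
import random

# Hypothetical label-to-cluster mapping: each label lists candidate
# clusters with a relative weight (higher weight = stronger preference).
LABEL_CLUSTER_MAP = {
    "gpu": [
        {"id": "gpu-cluster-a", "weight": 0.9},
        {"id": "gpu-cluster-b", "weight": 0.1},
    ],
    "cpu": [{"id": "cpu-cluster", "weight": 1.0}],
}


def pick_cluster(label, available, rng=random):
    """Weighted random pick among the label's clusters that are available.

    `available` is a set of cluster IDs currently considered healthy
    (an assumption of this sketch). Raises LookupError if none match.
    """
    candidates = [
        c for c in LABEL_CLUSTER_MAP.get(label, []) if c["id"] in available
    ]
    if not candidates:
        raise LookupError(f"no available cluster for label {label!r}")
    weights = [c["weight"] for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]["id"]
```

With both GPU clusters available, `pick_cluster("gpu", {"gpu-cluster-a", "gpu-cluster-b"})` returns `gpu-cluster-a` most of the time; if only `gpu-cluster-b` is available, it is always returned.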
Would you like more detail on how to implement these configurations?
Sources:
- GitHub Issue #2882
Slack Link: https://.slack.com/archives/CP2HDHKE1/p1729161807.411369
💡 Hint: Mention @runllm in the thread for followups.