# flyte-deployment
m
Is there a way to make Flyte use nodeAffinity instead of nodeSelector for spot nodes (when `interruptible=True`)? And if not, is this something that would be considered? My justification: with nodeSelector, it's easy to make a task unschedulable. For example, we do not run GPUs on Spot nodes, so a GPU task with `interruptible=True` will never be schedulable. By using nodeAffinity instead of nodeSelector, the user can configure this behaviour with either `requiredDuringSchedulingIgnoredDuringExecution` or `preferredDuringSchedulingIgnoredDuringExecution`.
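To make the difference concrete, here is a minimal sketch of the two affinity shapes as plain Python dicts that mirror the Kubernetes pod spec. The helper name `spot_affinity` and the `node-lifecycle=spot` label are hypothetical placeholders, not anything Flyte sets; substitute whatever label your spot node pool actually carries.

```python
def spot_affinity(label_key, label_value, required=False, weight=100):
    """Build the `affinity` section of a pod spec for a labelled node pool.

    required=True  -> hard constraint (behaves like nodeSelector)
    required=False -> soft preference (scheduler may fall back to other nodes)
    """
    # A node selector term matching nodes whose label has the given value.
    term = {
        "matchExpressions": [
            {"key": label_key, "operator": "In", "values": [label_value]}
        ]
    }
    if required:
        node_affinity = {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [term]
            }
        }
    else:
        node_affinity = {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {"weight": weight, "preference": term}
            ]
        }
    return {"nodeAffinity": node_affinity}


# Hypothetical label; use the one your spot nodes are tagged with.
soft = spot_affinity("node-lifecycle", "spot")
hard = spot_affinity("node-lifecycle", "spot", required=True)
```

With the soft form, a GPU task marked interruptible could still land on an on-demand GPU node when no spot node fits, which is the behaviour I'm after.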
r
We wound up adding a layer of indirection between our Flyte workloads & scheduling with Kyverno (specifically mutate rules). The idea is you tag your workloads w/ a well-known annotation, and the policy can then apply an arbitrary transformation to the pod spec if the annotation matches. We use this right now to keep a level of indirection between model pipeline developers and decisions about what types of nodes we schedule on (on-demand vs spot, etc). It's certainly possible to use this to manage `nodeAffinity` as well.
The Kyverno helm chart is pretty straightforward to deploy as well.
m
Ah this is a good idea! We’re already using kyverno actually, but using mutations for this hadn’t crossed my mind!
r
Cool! Feel free to hit me up with questions, we're using kyverno for this sort of indirection between workloads and cloud provider/autoscaler implementation details pretty extensively
o
@Rahul Mehta Sorry for resurrecting this old thread, what kind of fields did you manage to mutate with kyverno? We're getting a lot of this in flytepropeller: "error syncing Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations`".
r
So, just last week I had to rip out the kyverno pieces exactly due to this finalizer issue.
o
shucks thanks
r
I switched to creating higher-order decorators that resolve to a specific `PodTemplate`, so we still have the separation of concerns between pipeline authors and lower-level infra details (they're separated by codeowners in our monorepo), but unfortunately we couldn't fully handle this on the cluster side.
That said, `PodTemplate` seems much more functional than the previous hand-rolled PodTask we'd written, and you can specify a full podspec properly there (including nodeAffinity terms, podAntiAffinity, etc).
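Roughly, the decorator pattern looks like the sketch below. It is stdlib-only so it runs without flytekit, and every name in it (`SCHEDULING_PROFILES`, `scheduled_on`, the `node-lifecycle` label, the profile names) is made up for illustration rather than taken from our codebase. In a real setup the resolved spec would presumably be wrapped in a flytekit `PodTemplate` and passed to the task decorator instead of being attached as an attribute; treat that wiring as an assumption.

```python
import copy
import functools

# Infra-owned mapping from a profile name to a pod spec fragment.
# Pipeline authors only ever see the profile names.
SCHEDULING_PROFILES = {
    "spot-preferred": {
        "affinity": {
            "nodeAffinity": {
                "preferredDuringSchedulingIgnoredDuringExecution": [
                    {
                        "weight": 100,
                        "preference": {
                            "matchExpressions": [
                                {
                                    "key": "node-lifecycle",  # hypothetical label
                                    "operator": "In",
                                    "values": ["spot"],
                                }
                            ]
                        },
                    }
                ]
            }
        }
    },
    "on-demand": {},
}


def scheduled_on(profile):
    """Higher-order decorator: resolve a profile name to a pod spec."""
    if profile not in SCHEDULING_PROFILES:
        raise ValueError(f"unknown scheduling profile: {profile!r}")
    # Copy so a task cannot mutate the shared profile definition.
    pod_spec = copy.deepcopy(SCHEDULING_PROFILES[profile])

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            return fn(*args, **kwargs)

        # Stand-in for handing the spec to the task framework.
        wrapper.pod_spec = pod_spec
        return wrapper

    return decorator


@scheduled_on("spot-preferred")
def train(epochs):
    return epochs * 2
```

The task body stays unaware of node pools; changing what "spot-preferred" means is a one-file change on the infra side.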