curved-petabyte-84246
12/08/2024, 2:06 PM
average-finland-92144
12/09/2024, 3:13 PM
> 1. It means I cannot run more than 1 replica.
That depends on the topologyKey used. In this case it is at the node level, so this is true per worker node, but not at the cluster level, where the scheduler will spread out the Deployment replicas.
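A minimal sketch of the topologyKey point above, assuming a required pod anti-affinity rule that matches the Deployment's own pods and assuming every node carries the label in question. The function name `max_replicas` and the sample node labels are hypothetical; this is not scheduler code, only the counting argument: at most one replica fits per distinct value of the topology key.

```python
def max_replicas(nodes, topology_key):
    """Upper bound on schedulable replicas under required self anti-affinity.

    `nodes` maps a node name to its labels. Each distinct value of
    `topology_key` is one topology domain, and each domain can hold at
    most one replica. Nodes missing the label are ignored here, which is
    a simplification made for this sketch.
    """
    domains = {
        labels[topology_key]
        for labels in nodes.values()
        if topology_key in labels
    }
    return len(domains)


# Hypothetical three-node cluster spread over two zones.
nodes = {
    "node-a": {"kubernetes.io/hostname": "node-a", "topology.kubernetes.io/zone": "zone-1"},
    "node-b": {"kubernetes.io/hostname": "node-b", "topology.kubernetes.io/zone": "zone-1"},
    "node-c": {"kubernetes.io/hostname": "node-c", "topology.kubernetes.io/zone": "zone-2"},
}

per_node = max_replicas(nodes, "kubernetes.io/hostname")
per_zone = max_replicas(nodes, "topology.kubernetes.io/zone")
print(per_node, per_zone)
```

With a node-level key the limit is one replica per worker node, so the cluster as a whole can still run several replicas; with a zone-level key the same cluster would be capped at one replica per zone.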
average-finland-92144
12/09/2024, 3:20 PM
> 1. Upgrade doesn't work! because the new pod won't be scheduled before the existing pod is terminated... so it's a dead lock.
This can also be the case, especially depending on cluster size and available resources. The flyteadmin Deployment, for example, doesn't set a rollout `strategy`, so it uses the default parameters (`maxSurge: 25%`, `maxUnavailable: 25%`). Maybe making this configurable would help?
curved-petabyte-84246
12/10/2024, 6:44 AM
average-finland-92144
12/10/2024, 10:22 AM
> Just for the sake of spreading the instances, or is there another reason?
It's about resiliency, yes. There's not really an admin-specific conflict with having them colocated.
average-finland-92144
12/10/2024, 12:02 PM
> If you have a single k8s node (which isn't that far-fetched), you can only upgrade (given the anti-affinity) if MaxUnavailable==0, right? which isn't ideal.
I think maxUnavailable: 0 would block you even further. If you have a single k8s node, you'd be better off with flyte-binary, which uses a much simpler `recreate` strategy, with the drawback of causing temporary downtime during upgrades.
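To make the `maxUnavailable: 0` point concrete, here is a rough sketch of how a rolling update's limits resolve for a given replica count. It assumes the Kubernetes behavior of rounding a percentage `maxSurge` up and a percentage `maxUnavailable` down, with a fallback to one unavailable pod when both resolve to zero. The helper names `resolve`, `rollout_bounds`, and `can_progress_with_node_anti_affinity` are hypothetical, and the last one is a simplified model of a single node with required node-level anti-affinity, not the actual controller logic.

```python
def resolve(value, replicas, round_up):
    """Turn an absolute count or a 'N%' string into a pod count."""
    if isinstance(value, str) and value.endswith("%"):
        scaled = int(value[:-1]) * replicas
        # Integer ceiling or floor of scaled / 100, avoiding float error.
        return -(-scaled // 100) if round_up else scaled // 100
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return (surge, unavailable) as pod counts for a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    if surge == 0 and unavailable == 0:
        # Assumed fallback so that the rollout can make progress at all.
        unavailable = 1
    return surge, unavailable


def can_progress_with_node_anti_affinity(replicas, **strategy):
    """Simplified single-node model: a surge pod can never be scheduled
    next to the old one, so an old pod must be allowed to go away first."""
    _, unavailable = rollout_bounds(replicas, **strategy)
    return unavailable >= 1


print(rollout_bounds(1))                                             # (1, 0)
print(can_progress_with_node_anti_affinity(1))                       # False
print(can_progress_with_node_anti_affinity(1, max_unavailable=1))    # True
```

Under these assumptions a single replica with percentage-based limits resolves to one surge pod and zero unavailable pods, which is the combination that stalls when the surge pod cannot be placed. Allowing one unavailable pod, or recreating the pod outright, trades that stall for a short window of downtime.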
curved-petabyte-84246
12/10/2024, 7:37 PM
average-finland-92144
12/30/2024, 1:37 PM