Hi all. I’m Stan from Expedia, new to this channel 🙂 nice to be here!
We’ve been looking into extending Flyte to support cluster failover and we’d like to explore options to contribute back to open source.
The rough idea is to define failover rules: server-side in the label-cluster-map, and client-side in ECLs. FlyteAdmin would check cluster health/availability before submitting a workflow, and fail over to a different cluster (label) if the requested one isn’t live.
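To make that a bit more concrete, here's a rough sketch of the selection logic in Python. This is not FlyteAdmin code: the rule format, the `resolve_label` function, and the `is_healthy` callback are all hypothetical names made up for illustration, and the real health check would presumably live in FlyteAdmin.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical failover rules: each label maps to an ordered list of
# fallback labels to try when the requested one is not live.
FAILOVER_RULES: Dict[str, List[str]] = {
    "primary": ["secondary", "tertiary"],
}


def resolve_label(
    requested: str,
    rules: Dict[str, List[str]],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the requested label if it is live, otherwise the first live
    fallback from the rules, otherwise None."""
    for label in [requested, *rules.get(requested, [])]:
        if is_healthy(label):
            return label
    return None


# Example: only the "secondary" label is live, so "primary" fails over to it.
live_labels = {"secondary"}
chosen = resolve_label("primary", FAILOVER_RULES, lambda label: label in live_labels)
```

Returning `None` when nothing is live leaves it to the caller to decide whether to reject or queue the submission, which is one of the things we'd want to discuss.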
Would there be interest in collaborating on such a contribution? We can give more details about the ideas we’ve had so far, if necessary.
cc. @adventurous-ability-21671 @elegant-nest-18216