I think that s a great improvement My only concern would be Flyte #flyte-connectors


dry-pizza-97077

09/30/2024, 8:47 AM

I think that’s a great improvement. My only concern would be around YAML sanity. As long as there is a ConfigMap merge operation, a single team uploading a bad YAML can break every other team’s config, so propeller should be given a good mechanism to prevent bad agent YAMLs from making it into the final ConfigMap. Ideally propeller should use the latest known good config for every single agent; one breaking config YAML for a given agent should be ignored and shouldn’t block other teams from updating some other agent’s config.
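The "latest known good config per agent" idea could be sketched roughly as below. This is only an illustration, not how propeller works today: the `LastKnownGoodStore` class, the required `endpoint` field, and the use of `json.loads` as a stand-in parser (to keep the sketch stdlib-only; a real version would plug in a YAML parser) are all assumptions.

```python
import json
from typing import Any, Callable, Dict


class LastKnownGoodStore:
    """Hypothetical per-agent store: a bad update never replaces a good config."""

    def __init__(self, parse: Callable[[str], Any] = json.loads) -> None:
        # json.loads is a stand-in parser; swap in a YAML loader in practice.
        self._parse = parse
        self._good: Dict[str, dict] = {}

    def update(self, agent: str, raw: str) -> bool:
        """Try to accept a new config for one agent; return True if accepted."""
        try:
            cfg = self._parse(raw)
        except (TypeError, ValueError):
            # Unparseable input: keep whatever was previously accepted.
            return False
        # 'endpoint' as a required field is an assumption for illustration.
        if not isinstance(cfg, dict) or "endpoint" not in cfg:
            return False
        self._good[agent] = cfg
        return True

    def effective(self) -> Dict[str, dict]:
        """Return the config that would actually be served, agent by agent."""
        return dict(self._good)
```

With this shape, a broken upload for one agent leaves that agent on its previous config and has no effect on updates to any other agent.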

freezing-airport-6809

09/30/2024, 2:32 PM

That’s hard to do; the merging happens at the YAML layer

dry-pizza-97077

09/30/2024, 2:54 PM

I guess we can use webhooks or any other yaml validation mechanism before applying the resulting big yaml. I think separation of concerns and decentralised management should also come with blast radius isolation and automated validation to prevent one single agent from making the propellers unable to start
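As a rough sketch of the webhook option: a Kubernetes validating admission webhook receives an `AdmissionReview` request and answers with `allowed` plus an optional `status.message`. The function below shows only that decision logic, with no HTTP server. The rule that every ConfigMap data entry must parse and contain an `endpoint` key is an assumption made for illustration, and `json.loads` stands in for a YAML parser to keep the example stdlib-only.

```python
import json


def review(admission_review: dict) -> dict:
    """Build an AdmissionReview response for a ConfigMap create/update."""
    request = admission_review["request"]
    configmap = request.get("object") or {}
    errors = []
    for key, raw in (configmap.get("data") or {}).items():
        try:
            parsed = json.loads(raw)  # stand-in for a YAML parser
        except (TypeError, ValueError) as exc:
            errors.append(f"{key}: not parseable ({exc})")
            continue
        # Hypothetical schema rule: each entry must be a mapping with 'endpoint'.
        if not isinstance(parsed, dict) or "endpoint" not in parsed:
            errors.append(f"{key}: missing 'endpoint'")
    response = {"uid": request["uid"], "allowed": not errors}
    if errors:
        # The message is shown to whoever tried to apply the ConfigMap.
        response["status"] = {"message": "; ".join(errors)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Rejecting at admission time means the bad YAML never lands in the cluster, so the owning team gets the error directly and no central team has to review anything by hand.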

dry-pizza-97077

09/30/2024, 2:55 PM

And I stress automated: even when config creation is decentralised, if a single Flyte-owning team has to validate all configs, that defeats the entire purpose of this idea

freezing-airport-6809

09/30/2024, 2:59 PM

There is a config validator in Flyte; using that would be a good idea

billowy-church-83438

10/01/2024, 6:50 AM

> the merging happens at yaml layer

It does not need to be YAML per se, right? Propeller should be able to discover the ConfigMap objects in the cluster based on some naming convention or labels.

billowy-church-83438

10/01/2024, 6:50 AM

Here are some potential details I am thinking about on decentralization without compromising system stability…

Discovery of Custom Agent ConfigMaps:
• Implement a mechanism that systematically discovers and collects custom agent ConfigMaps across namespaces or predefined locations. This ensures that any new custom agent configurations are properly detected and processed.

Validation of ConfigMaps:
• Before applying or loading any custom agent configuration, each ConfigMap should go through an automated validation process (via webhooks or other validation tools). If a ConfigMap fails validation, it would be excluded from being applied or loaded, without affecting other valid configurations.
• This ensures that invalid custom agent configurations do not propagate errors or make FlytePropeller unable to start.

Aggregation and Conflict Resolution:
• After validation, the valid ConfigMaps should be aggregated. This aggregation should handle potential *conflicts*: for example, if two agents have the same name or endpoint, a mechanism should resolve or flag the conflict.
• This ensures that only conflict-free and valid configurations are applied, maintaining isolation between configurations and preventing a bad agent config from affecting the entire Flyte setup.
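The three stages (discovery, validation, aggregation with conflict handling) could look roughly like the sketch below. It works on plain dicts shaped like ConfigMap objects, so no cluster access is needed. Everything specific here is an assumption for illustration: the label key `example.com/flyte-agent-config`, the `agent` data key, the required `name` and `endpoint` fields, `json.loads` as a stand-in for a YAML parser, and the policy of excluding every party to a conflict instead of picking a winner.

```python
import json
from collections import defaultdict

AGENT_LABEL = "example.com/flyte-agent-config"  # hypothetical label key


def discover(configmaps):
    """Keep only ConfigMaps that opt in through the (hypothetical) label."""
    def labels(cm):
        return (cm.get("metadata") or {}).get("labels") or {}
    return [cm for cm in configmaps if labels(cm).get(AGENT_LABEL) == "true"]


def validate(cm):
    """Return (config, None) when valid, or (None, reason) when not."""
    try:
        cfg = json.loads(cm["data"]["agent"])  # stand-in for a YAML parser
    except (KeyError, TypeError, ValueError) as exc:
        return None, f"unparseable: {exc}"
    if not isinstance(cfg, dict):
        return None, "config is not a mapping"
    for field in ("name", "endpoint"):
        if not isinstance(cfg.get(field), str) or not cfg[field]:
            return None, f"missing or empty field: {field}"
    return cfg, None


def aggregate(configmaps):
    """Merge valid, conflict-free configs; report everything that was excluded."""
    valid, rejected = [], {}
    for cm in discover(configmaps):
        meta = cm.get("metadata") or {}
        source = f'{meta.get("namespace")}/{meta.get("name")}'
        cfg, err = validate(cm)
        if err:
            rejected[source] = err
        else:
            valid.append((source, cfg))

    # Flag sources that share an agent name or an endpoint with another source.
    by_name, by_endpoint = defaultdict(list), defaultdict(list)
    for source, cfg in valid:
        by_name[cfg["name"]].append(source)
        by_endpoint[cfg["endpoint"]].append(source)
    conflicted = set()
    for owners in list(by_name.values()) + list(by_endpoint.values()):
        if len(owners) > 1:
            conflicted.update(owners)

    merged = {}
    for source, cfg in valid:
        if source in conflicted:
            rejected[source] = "conflicting name or endpoint"
        else:
            merged[cfg["name"]] = cfg
    return merged, rejected
```

Returning the `rejected` map alongside the merged result keeps failures visible to the owning teams while the valid configs still load.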
