#ask-the-community

Vinayak

04/01/2024, 7:29 PM
Hi team, we are trying to set up a multicluster deployment of Flyte, and we are wondering whether it is okay to set up an instance of datacatalog per cluster. Would there be any concerns, apart from some cache not being shared between clusters?

Paul Dittamo

04/01/2024, 7:42 PM
> Would there be any concerns, apart from some cache not being shared between clusters?
Retrieving stale data on cache hits would be the biggest concern, and cache misses to a lesser extent. Would a given workflow be running in multiple clusters? If you could have each workflow run in a dedicated cluster, that would avoid those issues. I don't believe setting up a shared datacatalog instance across multiple clusters is supported out of the box in open source.

Vinayak

04/01/2024, 7:47 PM
Ah, I see. So currently the expectation is one datacatalog per cluster? The multicluster setup documentation didn't specify anything about setting up a datacatalog per cluster, so I assumed it was shared.

Paul Dittamo

04/01/2024, 7:50 PM
Wait, apologies for the confusion, my mistake. By a multicluster setup, you're referring to a single control plane with multiple data planes (flytepropeller), right?

Vinayak

04/01/2024, 7:50 PM
yes

Paul Dittamo

04/01/2024, 7:57 PM
datacatalog is a stateless wrapper over the same Postgres instance that stores execution state, which the control plane uses as its source of truth. All the datacatalog instances would point to the same database, so there isn't a concern about weird cache behavior. Let me look into the multi-cluster deployment really quickly to confirm some things.
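[Editor's sketch] To illustrate why several stateless instances over one database do not cause inconsistent cache behavior, here is a toy model in Python. It is not Flyte's datacatalog API: `SharedStore`, `CatalogInstance`, the task name, the input hash, and the output URI are all hypothetical names made up for the illustration, and a plain dict stands in for the Postgres database.

```python
class SharedStore:
    """Hypothetical stand-in for the single shared database."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        # Returns None when the key has never been written.
        return self._rows.get(key)


class CatalogInstance:
    """Hypothetical stateless wrapper: it keeps no cache entries itself."""

    def __init__(self, name, store):
        self.name = name
        self._store = store  # every instance points at the same store

    def write_artifact(self, task_id, inputs_hash, output_uri):
        self._store.put((task_id, inputs_hash), output_uri)

    def lookup(self, task_id, inputs_hash):
        return self._store.get((task_id, inputs_hash))


# Two instances (for example, two replicas) backed by one store.
store = SharedStore()
catalog_a = CatalogInstance("replica-a", store)
catalog_b = CatalogInstance("replica-b", store)

# An entry written through one instance is visible through the other.
catalog_a.write_artifact("my_task", "abc123", "s3://example-bucket/output")
hit = catalog_b.lookup("my_task", "abc123")
miss = catalog_b.lookup("my_task", "other-hash")
```

Because neither instance holds state of its own, it does not matter which one serves a given request; under this assumption, adding instances changes capacity, not cache contents.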

Vinayak

04/01/2024, 7:58 PM
thanks a lot for the input! we were wondering the same thing and just wanted to make sure we don't miss anything

Paul Dittamo

04/01/2024, 8:15 PM
> if it was okay to setup an instance of datacatalog per cluster?
You shouldn't need to spin up a datacatalog instance per cluster, as it's separate from execution/propeller. Datacatalog is part of the control plane, similar to Flyteadmin. Setting up datacatalog with replicas would probably be a better way to approach this.

Vinayak

04/04/2024, 8:49 PM
Thanks a lot for the help! Let me take a look