# ask-ai
m
@ancient-wolf-19325 Deployment of a multicluster Flyte scenario: I have cluster A with Admin and cluster B with Propeller. Both have separate S3 storages defined in their Helm charts. Why does cluster B need access to the S3 from cluster A? Is there a way to avoid it?
a
Cluster B (Propeller) needs access to S3 from cluster A because Flyte stores workflow metadata (including references to raw data) in the control plane's S3 bucket. When Propeller in cluster B executes tasks, it reads input/output data paths from the metadata written by Admin (cluster A), which point to bucket A. This means Propeller must access bucket A to fetch or store workflow data, even if it has its own S3 configured locally. There is no documented way to avoid this cross-bucket access, as Flyte's architecture assumes a shared object store for workflow data across clusters. Using separate S3 buckets for each cluster without cross-access will result in errors or missing data during execution and UI inspection, as seen in this issue. To avoid this, you must either (1) ensure all clusters use the same S3 bucket for Flyte data, or (2) manually update all metadata references in the Flyte database to point to the new bucket after migration, which is not automated and can be error-prone. There is no built-in support for fully isolated S3 buckets per cluster in a multicluster Flyte deployment according to the available documentation and issues. Would you like more detail on possible migration or workaround strategies?
Sources:
- FlyteAdmin Continues Reading from Old S3 Bucket After Migration, Causing 500 Errors in UI
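For illustration, option (1) above would look roughly like this in the Helm values of both clusters, assuming the flyte-core chart's storage section (bucket name and region are placeholders, not values from this thread):

```yaml
# Sketch only: values applied to BOTH cluster A (Admin) and cluster B (Propeller)
storage:
  type: s3
  bucketName: flyte-shared-data   # the same bucket, shared by both clusters
  s3:
    region: eu-west-1             # placeholder; use the bucket's actual region
```

With a single shared bucket, Propeller and the worker pods in cluster B can resolve every metadata reference written by Admin without any cross-bucket access.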
m
@ancient-wolf-19325 If my clusters are in separate regions, does most of the data end up in the S3 of the control plane or the S3 of the data plane? Can you point out what exactly is stored in which bucket? Things like the code executed, and the input and output data (from the workflow and from individual tasks).
a
It seems likely that most workflow data (inputs, outputs, artifacts) is stored in the S3 bucket configured for the control plane, not the data plane. Flyte stores metadata (including references to raw data) in the control plane's S3, and task input/output data paths are written there. The code executed (workflow definitions, task specs) and metadata are also stored in the control plane's S3. When a task runs in the data plane, it reads/writes data to the S3 bucket referenced in the metadata, which is typically the control plane's S3, unless you have customized the output prefix per execution (which is not the default and requires careful configuration). The data plane's S3 is not used unless specifically configured for raw data output, and doing so can cause issues with data accessibility and UI errors if not coordinated. Specifically:
- Code executed (workflow/task definitions): stored in the control plane's S3.
- Input/output data (workflow and tasks): by default, stored in the control plane's S3 and referenced by metadata.
- Metadata (references, small values): stored in the control plane's S3.
- Data plane S3: only used if you explicitly set output prefixes to point there; this is not the default and can break cross-cluster access and UI features if not handled carefully.
Would you like more detail on how to configure per-execution output prefixes or on Flyte's data flow?
Sources:
- https://flyte.org/blog/speed-up-time-to-production-with-flyte-on-eks-from-automation-to-multicloud#starCount
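To make that breakdown concrete, here is a hedged sketch of the config keys that decide where each kind of data lands; the exact key paths (`metadataStoragePrefix`, `metadata-prefix`) are assumptions based on the flyte-core chart and should be checked against your values.yaml:

```yaml
# Sketch only: which settings control where each kind of data is stored
storage:
  bucketName: <control-plane-bucket>   # default location for everything below
configmap:
  adminServer:
    flyteadmin:
      metadataStoragePrefix:           # registered workflow/task specs, execution metadata
        - metadata
        - admin
  core:
    propeller:
      metadata-prefix: metadata/propeller              # Propeller's execution metadata
      rawoutput-prefix: s3://<control-plane-bucket>/   # raw task inputs/outputs (default)
```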
m
@ancient-wolf-19325 How do I configure the data plane to use its own bucket, then? And AFAIU, even if I configure it, I still need to give Propeller access to the S3 on the control plane, correct?
a
To configure the data plane (Propeller) to use its own S3 bucket, set `rawoutput-prefix` in the Propeller configuration to point to the desired bucket (e.g., `rawoutput-prefix: s3://<data-plane-bucket>/`). This can be done in the Helm chart or ConfigMap for Propeller. However, even with this configuration, Propeller still needs access to the control plane's S3 bucket, because workflow metadata, code, and some input references are stored there by FlyteAdmin. Propeller must read these to execute workflows and tasks, as confirmed in this issue. So yes, you must grant Propeller access to both its own S3 bucket (for outputs) and the control plane's S3 bucket (for metadata and workflow artifacts). There is no supported way to fully isolate S3 access per cluster without breaking Flyte's data flow and UI features.
Sources:
- FlyteAdmin Continues Reading from Old S3 Bucket After Migration, Causing 500 Errors in UI
- Flyte Helm chart values: rawoutput-prefix
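A minimal sketch of that override in the data plane's (cluster B) Helm values, assuming the flyte-core chart layout; the bucket name is a placeholder:

```yaml
# Data plane (cluster B) values: write raw task outputs to a bucket local to this
# cluster, while metadata stays in the control plane's bucket. Sketch only.
configmap:
  core:
    propeller:
      rawoutput-prefix: s3://my-data-plane-bucket/
```

Propeller's (and the worker pods') role then needs read/write on this local bucket in addition to access to the control plane's bucket, as described above.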
m
@ancient-wolf-19325 In the official documentation this `rawoutput-prefix` seems to be "s3://{{ .Values.storage.bucketName }}". So AFAIU, if I define `storage.bucketName`, it will be used by default to keep the user data in the Propeller-defined storage?
a
Yes, that's correct. In the official Helm charts, `rawoutput-prefix` defaults to `s3://{{ .Values.storage.bucketName }}/`, so if you set `storage.bucketName`, both metadata and user data (raw outputs) will be stored in that bucket by default unless you override `rawoutput-prefix` separately. This means Propeller will use the bucket defined in `storage.bucketName` for user data storage unless explicitly configured otherwise. See flyte-core values.yaml and PR #6433. Would you like more detail on customizing these settings?
Sources:
- flyte-core values.yaml
- flyte-core: Default propeller `rawoutput-prefix` to use `storage.bucketName`
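Put differently, with the chart default you only need to set `storage.bucketName`; the sketch below shows how the default template quoted above resolves, assuming the flyte-core chart:

```yaml
# Sketch only: data plane values relying on the chart default
storage:
  bucketName: my-propeller-bucket   # placeholder
configmap:
  core:
    propeller:
      # Chart default, shown here only for clarity; it renders to the line below:
      #   rawoutput-prefix: s3://{{ .Values.storage.bucketName }}/
      rawoutput-prefix: s3://my-propeller-bucket/
```

Override `rawoutput-prefix` explicitly only if you want raw outputs in a different bucket than the one holding metadata.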
m
@ancient-wolf-19325 Having configured the following per-domain settings in both clusters (the one with Propeller and the one with Admin): development: projectQuotaCpu: value: "4", projectQuotaMemory: value: "3000Mi", defaultIamRole: value: "arn:aws:iam::{{ .Values.userSettings.accountNumber }}:role/{{ .Values.userSettings.workerRole }}". Which role will be assumed by the pod? Is it the role from Admin, or is it the role from the data plane deployment?
a
The IAM role assumed by the pod is the one configured in the data plane (Propeller) cluster, not the control plane (Admin) cluster. When a workflow task is executed, the pod is created in the data plane's Kubernetes cluster, and it uses the `defaultIamRole` annotation from the data plane's namespace/service account configuration. The control plane's IAM role is not used for pod execution. This is confirmed by the Flyte deployment and configuration logic, which applies the IAM role annotation to the service account in the data plane where the pod actually runs (source). Would you like more detail on how IAM roles are mapped to service accounts in Flyte?
Sources:
- Flyte on EKS: IAM Roles for Service Accounts
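Concretely, what the pod picks up is the IRSA annotation on the service account in the data plane namespace where it runs, roughly like this sketch (namespace and role ARN are placeholders; the annotation key is the standard EKS one):

```yaml
# Sketch of the service account in a data plane project-domain namespace.
# The worker pod runs with this service account and assumes the annotated role via IRSA.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: flytesnacks-development   # <project>-<domain> namespace (placeholder)
  annotations:
    eks.amazonaws.com/role-arn: "arn:aws:iam::<ACCOUNT_NUMBER>:role/<WORKER_ROLE>"
```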
m
@ancient-wolf-19325 Either the data plane is using the role defined in the control plane, or at least it is using templated values like `defaultIamRole` from the control plane? Because what I am seeing is that the worker pod clearly uses the identity defined in the control plane, not the one defined in the data plane.
a
You are correct that, in Flyte's default multi-cluster setup, `defaultIamRole` and other resource templates are typically managed and distributed from the control plane (Admin) to the data plane (Propeller) clusters. The control plane's configuration (including templated values like `defaultIamRole`) is used to create or update the Kubernetes resources (such as service accounts with IAM role annotations) in the data plane via the cluster resource sync mechanism. As a result, the worker pod in the data plane may end up using the IAM role defined in the control plane's configuration, not a value set independently in the data plane. This behavior is by design: the control plane's `cluster_resources.yaml` (or the equivalent Helm values) is the source of truth for project/domain resource quotas and IAM roles, and these are propagated to all data planes. If you want different IAM roles per data plane, you must customize the control plane's configuration to specify different roles for each cluster, which is not supported out of the box and would require manual intervention or custom automation (source). Would you like more detail on how to customize IAM roles per data plane or how the cluster resource sync works?
Sources:
- flyte-core values-eks.yaml: cluster_resources and defaultIamRole
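For reference, the per-domain block from your question is the kind of `customData` the control plane's cluster resource manager consumes and then syncs into every data plane; here is a sketch in flyte-core values form, reusing the values from your message:

```yaml
# Control plane (cluster A) values; the cluster resource sync renders the templates
# (namespaces, quotas, service accounts) with this data and applies them to each data plane.
cluster_resource_manager:
  enabled: true
  config:
    cluster_resources:
      customData:
        - development:
            - projectQuotaCpu:
                value: "4"
            - projectQuotaMemory:
                value: "3000Mi"
            - defaultIamRole:
                value: "arn:aws:iam::{{ .Values.userSettings.accountNumber }}:role/{{ .Values.userSettings.workerRole }}"
```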
m
@ancient-wolf-19325 But then does the worker pod still need to access the S3 managed by Propeller? Or only the one managed by Admin?
a
It seems likely that the worker pod (in the data plane) primarily needs access to the S3 bucket managed by the control plane (Admin), since workflow metadata and raw data paths are written there and referenced during execution. Even if the IAM role is defined by the control plane, the pod must access the S3 bucket where FlyteAdmin stores workflow data. The S3 bucket managed by Propeller (data plane) is only used if you explicitly configure output prefixes to point there, which is not the default. So, by default, the worker pod needs access to the S3 bucket managed by Admin, not the one managed by Propeller, unless you have customized the storage configuration. Would you like more detail on how to configure output prefixes for data plane buckets?
Sources:
- https://flyte.org/blog/speed-up-time-to-production-with-flyte-on-eks-from-automation-to-multicloud#starCount
- https://github.com/flyteorg/flyte/pull/3994
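To summarize the bucket access the worker pod's role needs in each setup (a plain summary of the answers above, not chart configuration; bucket names are placeholders):

```yaml
# Buckets the worker pod's IAM role must be able to reach (sketch)
default_setup:                        # rawoutput-prefix left pointing at the control plane's bucket
  - s3://<control-plane-bucket>       # metadata plus raw task inputs/outputs
with_data_plane_rawoutput_prefix:     # rawoutput-prefix overridden as discussed above
  - s3://<control-plane-bucket>       # metadata and workflow artifacts
  - s3://<data-plane-bucket>          # raw task inputs/outputs
```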