# ask-the-community
d
Hi everyone, I'm exploring the installation of Flyte in a multi-cluster setup with a specific requirement: one execution cluster in the cloud and another on-premises. The cloud cluster will run both the control and data planes, while the on-premises cluster will run only the data plane (flytepropeller). Ideally, the on-prem setup would use MinIO, and the cloud data plane would use S3. My understanding is that FlyteAdmin generates presigned URLs for client data uploads. Is it possible to configure FlyteAdmin to direct source code distributions (tar.gz files) and other uploads to the on-premises MinIO when scheduling a workflow there? Currently, if I schedule a workflow on the on-prem execution cluster, it's unable to pull data from S3, because it's configured to use MinIO for local object storage and is missing the AWS credentials. I am using the latest `flyte-core` helm chart, which I adjusted to my needs by following the Multiple K8s Cluster Deployment docs.
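For context, the multi-cluster wiring in my values override looks roughly like this — a trimmed sketch based on the multi-cluster deployment docs; the label, cluster id, endpoint, and credential paths are placeholders:

```yaml
configmap:
  clusters:
    labelClusterMap:
      onprem:                        # label used to route executions to a cluster
        - id: dataplane_onprem       # placeholder cluster id
          weight: 1
    clusterConfigs:
      - name: dataplane_onprem
        endpoint: https://onprem-cluster.example.com:443   # placeholder endpoint
        enabled: true
        auth:
          type: file_path
          tokenPath: "/var/run/credentials/token"
          certPath: "/var/run/credentials/cacert"
```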
y
you can’t have both… s3 and minio. at least not without some work.
you can have different buckets, but not different endpoints
can you use minio for everything?
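to illustrate: in `flyte-core` the object store is configured once, with a single endpoint that every bucket goes through. a rough sketch of the `storage` block, assuming the MinIO-style custom stow config (key names may vary by chart version, and the bucket name and credentials here are placeholders):

```yaml
storage:
  type: custom
  bucketName: my-flyte-bucket            # metadata bucket (placeholder name)
  custom:
    type: stow
    stow:
      kind: s3
      config:
        auth_type: accesskey
        access_key_id: minio             # placeholder credentials
        secret_key: miniostorage
        disable_ssl: true
        region: us-east-1
        endpoint: http://minio.flyte.svc.cluster.local:9000   # the one endpoint
```

note there's exactly one `endpoint` here — pointing some executions at MinIO and others at S3 just isn't expressible in this config.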
d
I could, I was just researching if this is possible. The problem is that if I go with the shared S3 object storage, executions on the on-prem cluster (because of its limited network bandwidth) will be slower compared to the same solution using MinIO. Is there a way to keep task outputs locally on the same execution cluster, or potentially in memory, and pass them to the next task?
I might just go with two separate control plane installations, one for on-prem and another for the cloud
y
wait i thought on prem == minio
d
That's right. MinIO on-prem, S3 for cloud. I am saying that if I use S3 as object storage for on-prem, it will be slower.
y
not really… not the primitive data at least. you understand that distinction right?
primitive i/o like floats and strings is what we call metadata; it goes into the metadata bucket (along with other things, like the offloaded objects admin uses).
and off-loaded data types like files and dataframes go into another bucket.
(think of this as stack and heap kinda)
so there’s a natural built-in distinction between the two buckets but not endpoints. user containers need access to both buckets. flyte itself only needs access to the metadata one.
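concretely, the two buckets map to two separate knobs in `flyte-core` — a sketch with config paths from memory, so verify against your chart version; bucket names are placeholders:

```yaml
# metadata bucket: primitives and other state flyte itself reads/writes
storage:
  bucketName: flyte-metadata                  # placeholder bucket name

# raw-data prefix: where offloaded user data (files, dataframes) lands
configmap:
  core:
    propeller:
      rawoutput-prefix: s3://flyte-user-data/   # placeholder bucket name
```

both still resolve through the single endpoint in the `storage` block, which is why you can split buckets but not endpoints. iirc the raw output data prefix can also be overridden per launch plan / execution, if you want different workflows to land in different buckets.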
d
Got it, thank you 🙏