S3 Storage: Separate Backends for Metadata and for User Data?

tall-furniture-2561

02/20/2023, 10:44 AM

Cheers everyone, we have the following problem: our Workflow & Task authors want to read & write files stored in an S3 backend, ideally via Flyte Files. However, we don't want to have Flyte Metadata in the same endpoint. Can I configure the Flyte Pods executing the Tasks to have access to the "user S3 backend", without Flyte trying to store all its metadata there as well?

Details: Locally on their laptops, we configured `~/.flyte/config.yaml`, providing user-backend values for `storage.connection.endpoint`, `access-key`, `secret-key`, etc. Executing the workflow via `python myworkflow.py` works like a charm: `FlyteFile("s3://my-bucket/my-file.csv").download()` accesses the user backend configured in `config.yaml` and downloads the file.

Now, regarding the Flyte deployment, I'm a bit lost as to which component (Admin, Propeller, ...) to configure so that the Pods have access to the S3 user backend without automatically trying to write their metadata there (e.g., `s3://flyte/metadata/propeller/myproject-development-ffba3b040e85d4801a9c/n1/data/0`).

thankful-minister-83577

02/20/2023, 3:49 PM

btw in the newer versions, that storage config section is no longer necessary in the flytectl config

thankful-minister-83577

02/20/2023, 3:49 PM

storage is determined by backend settings.

thankful-minister-83577

02/20/2023, 3:49 PM

which helm chart are you on btw?

thankful-minister-83577

02/20/2023, 3:56 PM

if you’re on the newer helm chart (for the `flyte-binary` chart) you can use these two values to configure the default buckets. they get used by the chart in the propeller config.

thankful-minister-83577

02/20/2023, 3:59 PM

in the `flyte`/`flyte-core` helm chart, that is specified here and used similarly here

thankful-minister-83577

02/20/2023, 3:59 PM

keep in mind these are default values.

thankful-minister-83577

02/20/2023, 3:59 PM

you can override these by project

thankful-minister-83577

02/20/2023, 3:59 PM

or by project & domain

thankful-minister-83577

02/20/2023, 3:59 PM

or by project & domain & workflow
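
To illustrate the layering described above (chart-level defaults, overridable by project, by project & domain, or by project & domain & workflow), here is a minimal Python sketch of a "most specific match wins" lookup. This is not Flyte's actual implementation: the override table, the bucket paths, and the `resolve_prefix` helper are all hypothetical, and the assumption that the most specific override takes precedence is mine rather than something stated in the thread.

```python
from typing import Optional

# Hypothetical deployment-wide default (stands in for the helm chart values).
DEFAULT_PREFIX = "s3://flyte-default/data"

# Hypothetical override table keyed by (project, domain, workflow).
# None acts as a wildcard; all names and bucket paths are illustrative only.
OVERRIDES = {
    ("myproject", None, None): "s3://user-data/myproject",
    ("myproject", "development", None): "s3://user-data/myproject-dev",
    ("myproject", "development", "my_wf"): "s3://user-data/myproject-dev-my-wf",
}


def resolve_prefix(project: str, domain: str, workflow: Optional[str] = None) -> str:
    """Return the most specific matching prefix, falling back to the default."""
    # Candidates are ordered from most specific to least specific.
    candidates = [
        (project, domain, workflow),
        (project, domain, None),
        (project, None, None),
    ]
    for key in candidates:
        if key in OVERRIDES:
            return OVERRIDES[key]
    return DEFAULT_PREFIX


if __name__ == "__main__":
    print(resolve_prefix("myproject", "development", "my_wf"))
    print(resolve_prefix("myproject", "production"))
    print(resolve_prefix("otherproject", "development"))
```

The point of the sketch is only that the defaults stay in place for anything without an override, so a single project (or project & domain, or single workflow) can be pointed at a different location without touching the rest of the deployment.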
