Fabian Rabe

02/20/2023, 10:44 AM
S3 Storage: Separate Backends for Metadata and for User Data?
Cheers everyone, we have the following problem: our workflow and task authors want to read and write files stored in an S3 backend, ideally via `FlyteFile`. However, we don't want Flyte's metadata stored in the same endpoint. Can I configure the Flyte Pods executing the Tasks to have access to the "user S3 backend" without Flyte trying to store all its metadata there as well?
Details: Locally, on the authors' laptops, we configured `~/.flyte/config.yaml` with user-backend values for `storage.connection.endpoint`, `access-key`, `secret-key`, etc. Executing the workflow via `python myworkflow.py` works like a charm: `FlyteFile("s3://my-bucket/my-file.csv").download()` accesses the user backend configured in `config.yaml` and downloads the file.
Now, regarding the Flyte deployment, I'm a bit lost as to which component (Admin, Propeller, ...) to configure so that the Pods have access to the user S3 backend without automatically trying to write their metadata there (e.g., `s3://flyte/metadata/propeller/myproject-development-ffba3b040e85d4801a9c/n1/data/0`).

Yee

02/20/2023, 3:49 PM
btw, in newer versions that storage config section is no longer necessary in the flytectl config; storage is determined by the backend settings.
which helm chart are you on?
if you’re on the newer helm chart (the `flyte-binary` chart), you can use these two values to configure the default buckets. the chart uses them in the propeller config.
in the `flyte`/`flyte-core` helm chart, that is specified here and used similarly here
keep in mind these are default values. you can override them by project, by project & domain, or by project & domain & workflow.
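
To make the three override levels above a bit more concrete, here is a minimal Python sketch of "most specific match wins" resolution. It is only an illustration and not Flyte's actual implementation: the bucket names, the `OVERRIDES` table, and the `resolve_prefix` helper are all hypothetical, and the assumption that a more specific override takes precedence over a broader one should be checked against the Flyte docs for your version.

```python
from typing import Dict, Optional, Tuple

# (project, domain, workflow); None means "not specified at this level".
Key = Tuple[str, Optional[str], Optional[str]]

# Hypothetical deployment-wide default for user data.
DEFAULT_PREFIX = "s3://default-user-data"

# Hypothetical override table, one entry per level mentioned in the thread.
OVERRIDES: Dict[Key, str] = {
    ("myproject", None, None): "s3://team-user-data",
    ("myproject", "development", None): "s3://team-user-data-dev",
    ("myproject", "development", "my_wf"): "s3://team-user-data-dev-wf",
}


def resolve_prefix(
    project: str,
    domain: str,
    workflow: str,
    overrides: Dict[Key, str] = OVERRIDES,
    default: str = DEFAULT_PREFIX,
) -> str:
    """Return the most specific matching prefix, else the default."""
    # Try the most specific key first, then fall back to broader ones.
    candidates = [
        (project, domain, workflow),
        (project, domain, None),
        (project, None, None),
    ]
    for key in candidates:
        if key in overrides:
            return overrides[key]
    return default


if __name__ == "__main__":
    print(resolve_prefix("myproject", "development", "my_wf"))
    print(resolve_prefix("myproject", "production", "my_wf"))
    print(resolve_prefix("otherproject", "development", "my_wf"))
```

With this table, a workflow in `myproject`/`development` without its own entry would fall back to the project & domain value, and any other project would get the default.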