I am trying to understand the Flyte data lifecycle. I understand the difference between metadata and raw data. My questions/assumptions are:
• Data lives permanently in the data stores; there is no cleanup. Is that correct?
• If we implemented a cleanup step, would something like S3 lifecycle policies be enough, or would we need to update/clean the entries in the database as well? Would this interfere with caching?
• Is there a way to use multiple S3 stores, e.g. different projects using different stores?
10/18/2023, 2:32 PM
1. Correct, there is no default cleanup.
2. Yes, lifecycle policies are OK, but do not delete metadata. Make the lifecycle expiration sufficiently large and set the global cache TTL (in propeller) to less than that, so a cache hit never points at raw data that has already been expired.
3. Absolutely, even per execution. See raw-output-prefix (pyflyte run), or set it in the project configuration.
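
To make point 2 concrete, here is a minimal sketch of the idea, not an official Flyte recipe. It builds an S3 lifecycle configuration that expires objects under a raw-data prefix and checks that the cache TTL is shorter than the expiration. The prefix, the 90-day expiration, and the 60-day TTL are made-up values, and the helper names are hypothetical. It assumes raw data and metadata sit under different prefixes, so the rule never touches metadata.

```python
from datetime import timedelta


def build_lifecycle_rule(prefix: str, expire_days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix`."""
    if expire_days <= 0:
        raise ValueError("expire_days must be positive")
    return {
        "Rules": [
            {
                "ID": f"expire-raw-data-after-{expire_days}d",
                # Scope the rule to raw data only; metadata must not match this prefix.
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": expire_days},
            }
        ]
    }


def cache_ttl_is_safe(cache_ttl: timedelta, expire_days: int) -> bool:
    """Return True when the cache TTL is strictly shorter than the object lifetime."""
    return cache_ttl < timedelta(days=expire_days)


# Hypothetical values, for illustration only.
RAW_DATA_PREFIX = "raw-data/"
EXPIRE_DAYS = 90
CACHE_TTL = timedelta(days=60)

lifecycle_config = build_lifecycle_rule(RAW_DATA_PREFIX, EXPIRE_DAYS)
print(lifecycle_config)
print("cache TTL safe:", cache_ttl_is_safe(CACHE_TTL, EXPIRE_DAYS))
```

The resulting dict has the shape that boto3's `put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=...)` accepts. The cache TTL itself still has to be set separately in the propeller configuration.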