Hi Team, we have been using Flyte for the past 1 year...
# flyte-support
h
Hi Team, we have been using Flyte for the past year to train and build our models, and we have realised that its S3 storage consumption has grown extensively, which has a direct impact on cost. So I would like to know whether there is a simpler way to manage data in the S3 bucket. I have set up some lifecycle rules, but we have to be very careful with deleting data from S3, since as far as I know Flyte stores metadata in S3 as well, and deleting some data might corrupt the Flyte system. Has anyone found a solution for doing this Flyte cleanup?
a
Hey Vipul, resource inventory and status are persisted to flyteadmin's database, so you should be able to define a cutoff date and use lifecycle rules to delete content from the bucket. If an execution looks for metadata (e.g. inputs/outputs) that was in the bucket, it should regenerate it.
for reference this is how it's handled at Union: https://www.union.ai/docs/byoc/deployment/data-retention-policy/
h
Thanks @average-finland-92144 for the help, I really appreciate your input. So what I understand is that if we set the S3 bucket's object lifecycle policy to expire objects after X days, Flyte workflows should not be impacted?
a
correct
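For anyone landing on this thread later, the cutoff-based lifecycle rule discussed above could be sketched roughly as follows with boto3. This is only an illustration under assumptions: the bucket name `my-flyte-bucket`, the 90-day retention, and the helper names `build_lifecycle_config` and `apply_lifecycle_config` are hypothetical and not taken from the thread, and the 7-day multipart cleanup is an extra nobody here asked for. Pick a retention that fits your own needs.

```python
from typing import Any, Dict


def build_lifecycle_config(retention_days: int, prefix: str = "") -> Dict[str, Any]:
    """Build an S3 lifecycle configuration that expires objects after a cutoff.

    retention_days and prefix are illustrative parameters (assumptions);
    an empty prefix makes the rule apply to every object in the bucket.
    """
    if retention_days < 1:
        raise ValueError("retention_days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{retention_days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                # Delete current object versions once they are older than the cutoff.
                "Expiration": {"Days": retention_days},
                # Also clean up unfinished multipart uploads (optional extra).
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
            }
        ]
    }


def apply_lifecycle_config(bucket: str, retention_days: int, prefix: str = "") -> None:
    """Apply the rule to a bucket. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above works without boto3

    s3 = boto3.client("s3")
    # Note: this call replaces any lifecycle configuration already on the bucket.
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=build_lifecycle_config(retention_days, prefix),
    )


# Hypothetical usage (bucket name and retention are assumptions):
# apply_lifecycle_config("my-flyte-bucket", retention_days=90)
```

Because `put_bucket_lifecycle_configuration` overwrites the existing configuration, any rules already set on the bucket would need to be merged into the `Rules` list before applying.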