Hey community! TLDR: can you delete FlyteExecutio...
# flyte-support
m
Hey community! TLDR: can you delete FlyteExecutions/results via flytekit or the Flyte backend, and if not, can you at least invalidate their cached status?
We store our task results on S3. For data management, it is sometimes useful to delete these S3-based results from Flyte executions (for failed jobs, for example, even though some tasks may have succeeded), or just for really old jobs, because we don't like our large S3 bill. The problem is that if you delete the S3 result data for a task that wrote to the cache, a later task will still behave as if it receives the cached result, but it will be "0", and the task will fail.
It would be best if you could tell flytekit to just delete an execution and all its results (also invalidating any cache status). I think I wrote about this before, and was told this was intentionally not implemented because it seemed too complex and too dangerous, with too many edge cases. But you can't expect people to keep data on S3 forever.
Should we just interact with the Flyte backend database and delete entries directly? That feels dangerous and brittle, likely to break when the schema changes, and I'd prefer an API for this. What is the best practice here?
h
can you delete...
No. However, there is a setting on Propeller, max-cache-age, that sets the cache expiry: when set, Propeller will stop reading cache entries older than that value. I would set it to half the retention period value on the policy on S3...
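As a rough sketch of the "half the retention period" rule of thumb, the arithmetic could look like the following. The helper name is hypothetical, and it assumes max-cache-age accepts a Go-style duration string such as 360h; where exactly the setting lives in the Propeller config should be checked against the Flyte docs for your version.

```python
from datetime import timedelta


def max_cache_age_for_retention(retention: timedelta) -> str:
    """Return half of the S3 retention period as a duration string in hours.

    Hypothetical helper: assumes the setting takes a Go-style duration
    such as "360h" (hours are the largest unit Go durations support).
    """
    half = retention / 2
    hours = int(half.total_seconds() // 3600)
    if hours <= 0:
        raise ValueError("retention period is too short to halve into whole hours")
    return f"{hours}h"


if __name__ == "__main__":
    # Example: an S3 lifecycle rule that expires objects after 30 days.
    print(max_cache_age_for_retention(timedelta(days=30)))  # 360h
```

Keeping the cache age well below the S3 retention period means a cache entry stops being read before the data it points to is removed by the lifecycle rule.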
m
Ok, thanks for this! It would still be nice to be able to invalidate the cache on a per-task or per-workflow-execution basis, but this is useful!