Hey community!
TLDR: can you delete FlyteExecutions and their results via flytekit or the Flyte backend, and if not, can you at least invalidate their cached status?
We store our task results on S3. For data management, it is sometimes useful to delete these S3-based results from Flyte executions: for failed jobs, for example (even though some of their tasks may have succeeded), or just for really old jobs, because we don't like our large S3 bill.
The problem is that if you delete the S3 result data for a task that wrote to the cache, Flyte does not notice. A later task will still behave as if it receives the cached result, but the value it gets will be "0", and the task will fail.
It would be best if you could tell flytekit to just delete an execution and all of its results (also invalidating any cache status). I think I wrote about this before, and was told this was intentionally not implemented because it seemed too complex and too dangerous, with too many edge cases. But you can't expect people to keep data on S3 forever.
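To make the request concrete, here is a toy model of what I mean. It is plain Python and not Flyte code: `ToyBackend`, `delete_blob_only` and `delete_with_invalidation` are hypothetical names I made up, the two dicts only stand in for S3 and the cache catalog, and the toy returns `None` where we see "0" in practice.

```python
# Toy model only: none of these names are Flyte or flytekit APIs.
from dataclasses import dataclass, field


@dataclass
class ToyBackend:
    blobs: dict = field(default_factory=dict)  # stands in for S3: uri -> data
    cache: dict = field(default_factory=dict)  # stands in for the cache catalog: cache key -> uri

    def run_cached_task(self, cache_key, uri, compute):
        """Return the task output, reusing the recorded URI on a cache hit."""
        if cache_key in self.cache:
            cached_uri = self.cache[cache_key]
            # The catalog only holds a pointer; it never checks that the data still exists.
            return self.blobs.get(cached_uri)  # None if the data was deleted out of band
        self.blobs[uri] = compute()
        self.cache[cache_key] = uri
        return self.blobs[uri]

    def delete_blob_only(self, uri):
        """What we do today: remove the data from storage and nothing else."""
        self.blobs.pop(uri, None)

    def delete_with_invalidation(self, uri):
        """What I'd like an API for: remove the data and every cache entry pointing at it."""
        self.blobs.pop(uri, None)
        for key in [k for k, v in self.cache.items() if v == uri]:
            del self.cache[key]


backend = ToyBackend()
backend.run_cached_task("train-v1", "s3://bucket/exec-a/out", lambda: b"model")

# Deleting only the data leaves a dangling cache entry: the next run is a "hit" with no data.
backend.delete_blob_only("s3://bucket/exec-a/out")
stale = backend.run_cached_task("train-v1", "s3://bucket/exec-b/out", lambda: b"model")

# Deleting with invalidation removes the cache entry too, so the next run recomputes.
backend.delete_with_invalidation("s3://bucket/exec-a/out")
fresh = backend.run_cached_task("train-v1", "s3://bucket/exec-b/out", lambda: b"model")
```

The second delete is the behaviour I'm asking for: data and cache status go away together, so nothing downstream ever sees a hit that points at deleted data.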
Should we just delete entries directly in the Flyte backend database? This feels dangerous and brittle, prone to breaking whenever the schema changes, and I'd prefer an API for this.
What is the best practice here?