acoustic-carpenter-78188
01/25/2023, 1:27 PM
Pods marked as a retryable failure are never cleaned up. This PR fast-fails only when aborting tasks in the Success phase, so that all resources are correctly cleaned up, at the risk of unnecessary API calls (e.g. the object does not exist).
Type
☑︎ Bug Fix
☐ Feature
☐ Plugin
Are all requirements met?
☑︎ Code completed
☑︎ Smoke tested
☐ Unit tests added
☐ Code documentation added
☐ Any pending items have an associated Issue
Complete description
Currently, the TaskHandler fast-fails when aborting resources in any terminal phase. As a result, resources are not cleaned up when the task is reported as failed (i.e. retryable failure or permanent failure) but the underlying resource is still running.
One such scenario is handling Pods in the "ImagePullBackOff" state. We mark these as failed, but because of the fast-fail during abort, these k8s resources are never deleted.
This may occur within other plugins as well. To be safe, we may want to attempt to abort all task phases other than Success.
Tracking Issue
fixes flyteorg/flyte#3239
Follow-up issue
NA
flyteorg/flytepropeller
GitHub Actions: Build & Push Flytepropeller Image
GitHub Actions: Goreleaser
GitHub Actions: Bump Version
✅ 11 other checks have passed
11/14 successful checks