#354 Improve demystifying GKE spot node preemption #patch
Pull request opened by bstadlbauer
TL;DR
Fixes a bug where propeller would incorrectly label spot node preemption as a user error.
Type
☑︎ Bug Fix
☐ Feature
☐ Plugin
Are all requirements met?
☐ Code completed
☐ Smoke tested
☑︎ Unit tests added
☐ Code documentation added
☐ Any pending items have an associated Issue
Complete description
We've had a lot of issues recently where tasks running on spot instances would not be re-scheduled onto regular instances after they've been preempted by GKE.
We've been able to consistently replicate this by:
1. Starting an interruptible task with some retries
2. As soon as it's up, going to the VM instances page and stopping (not deleting) the instance. According to the Google Cloud docs, this is equivalent to preemption.
With this procedure, none of the retry attempts were scheduled onto non-spot instances.
I've debugged this by running a local instance of `flytepropeller` with a local version of `flyteplugins`, in place of the one usually running in our cluster. I then set a breakpoint at `flyteplugins/go/tasks/pluginmachinery/flytek8s/pod_helper.go`, line 625 (commit 8a2f8ca),
and could see that the error code was "Terminated" instead of "Shutdown". Once I added "Terminated" to the handled codes, things worked as expected.
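For context, here is a minimal sketch of the classification this fix targets, assuming the demystify logic keys off the pod's status reason; the helper name below is hypothetical and not the actual flyteplugins code:

```go
package demystify

import v1 "k8s.io/api/core/v1"

// isNodeShutdownReason is a hypothetical helper illustrating the fix:
// a failed pod whose status reason signals that the node was shut down
// underneath it should be classified as a retriable system error, not
// a user error, so retries can be promoted to non-spot instances.
func isNodeShutdownReason(status v1.PodStatus) bool {
	switch status.Reason {
	case "Shutdown", // set by the kubelet's graceful node shutdown
		"Terminated": // observed on GKE when a spot node is preempted
		return true
	}
	return false
}
```

Because the pre-fix check matched "Shutdown" only, GKE preemptions fell through to the user-error path, which is why none of the retries were rescheduled onto non-spot instances.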
flyteorg/flyteplugins: ✅ All checks have passed (7/7 successful).