Viljem Skornik
04/27/2023, 7:28 PM
Samhita Alla
Dan Rammer (hamersaw)
04/28/2023, 11:45 AM
ImagePullBackoff? or something like task active timeout [%s] expired?
Viljem Skornik
04/28/2023, 3:58 PM
[1/1] currentAttempt done. Last Error: USER::[1/1] currentAttempt done. Last Error: USER::containers with unready status: [primary]|Back-off pulling image ...
At this point this node has failed and the whole workflow failed. But the actual pod will eventually start and do whatever it was supposed to do.
Dan Rammer (hamersaw)
04/28/2023, 4:34 PM
ImagePullBackoff error in this PR. Basically, if Flyte fails a task because of ImagePullBackoff, it will then delete the Pod. This was particularly troublesome when the ImagePullBackoff was because the image did not exist - then the Pod would never start (and therefore never complete) and would just stick around taking up resources.
Viljem Skornik
04/28/2023, 4:39 PM
Dan Rammer (hamersaw)
04/28/2023, 4:42 PM
Viljem Skornik
04/28/2023, 4:43 PM
Dan Rammer (hamersaw)
04/28/2023, 4:46 PM
Viljem Skornik
04/28/2023, 4:47 PM
Sam Eckert
06/08/2023, 11:58 PM
registryPullQPS and registryBurst being set too low.
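
For anyone reading this later: registryPullQPS and registryBurst are kubelet configuration fields that rate-limit image pulls on a node. Below is a rough Python sketch of why low values can show up as pull failures when many pods land on one node at once. It is a simplified token-bucket model, not the kubelet's actual code; the TokenBucket class, the count_rejected helper, and the 30-pod workload are hypothetical names and numbers made up for illustration. The values 5 and 10 are assumed to match the kubelet defaults.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TokenBucket:
    """Simplified token bucket: `qps` tokens refill per second, capped at `burst`."""

    qps: float
    burst: int
    tokens: float = 0.0
    last: float = 0.0

    def __post_init__(self) -> None:
        # The bucket starts full, so up to `burst` pulls can go through at once.
        self.tokens = float(self.burst)

    def try_accept(self, now: float) -> bool:
        # Refill for the elapsed time, never exceeding the burst size.
        self.tokens = min(float(self.burst), self.tokens + (now - self.last) * self.qps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        # In this model a rejected pull fails and is retried later with back-off.
        return False


def count_rejected(qps: float, burst: int, pull_times: Sequence[float]) -> int:
    """Count how many pull requests (given as timestamps in seconds) are rejected."""
    bucket = TokenBucket(qps=qps, burst=burst)
    return sum(0 if bucket.try_accept(t) else 1 for t in pull_times)


# Hypothetical workload: 30 pods scheduled onto one node at the same instant.
pulls = [0.0] * 30
print(count_rejected(qps=5, burst=10, pull_times=pulls))    # 20
print(count_rejected(qps=50, burst=100, pull_times=pulls))  # 0
```

With the lower limits, only the first 10 pulls fit in the burst and the remaining 20 are rejected. Raising both values lets the whole batch through, which is consistent with the observation above that the limits were set too low.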