Hello, my Flyte task seems to run forever. How do I troubleshoot this? Thanks!
hi I had a similar issue,
kubectl describe pods
told me the reason
I had this issue
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Warning  Failed   15m (x14 over 63m)   kubelet  (combined from similar events): Failed to pull image "ghcr.io/flyteorg/flytekit:py3.9-latest": rpc error: code = Unknown desc = failed to pull and unpack image "ghcr.io/flyteorg/flytekit:py3.9-latest": failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://ghcr.io/v2/flyteorg/flytekit/blobs/sha256:bc2f41c411c6383adc7eda6155671cc80599548541558279f5bec4500428f413": EOF
  Normal   BackOff  10m (x294 over 80m)  kubelet  Back-off pulling image "ghcr.io/flyteorg/flytekit:py3.9-latest"
Hmm, it failed to pull. Can you try pulling the image yourself?
I actually did that
and it resolved the issue
docker pull ghcr.io/flyteorg/flytekit:py3.9-latest
I'm not sure you are facing the same issue, but worth checking
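For anyone following along, here is a minimal sketch of how a pull failure like the one above could be spotted without describing every pod. It assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...). The helper name `stuck_pods` and the namespace in the usage comment are illustrative assumptions, not part of Flyte or kubectl.

```shell
# Hypothetical helper: print pods whose STATUS column is neither
# Running nor Completed. Reads the table printed by `kubectl get pods`
# from stdin. With `-A` the STATUS column shifts to $4.
stuck_pods() {
  awk 'NR > 1 && $3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Possible use against a live cluster (the namespace is an assumption;
# Flyte usually runs task pods in a <project>-<domain> namespace):
#   kubectl get pods -n flytesnacks-development | stuck_pods
```

A pod listed with ImagePullBackOff or ErrImagePull is a good candidate for `kubectl describe pod <pod-name> -n <namespace>`, which prints the events table shown earlier.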
I think I have a different reason. It looks like I managed to pull the image successfully.
@Albert Wibowo can you tell me more? Something seems wrong. Is the Flyte binary running? I guess your UI is being served. It seems like the engine was disabled for some reason. We might have to look at the logs of the Flyte binary.
Yes the UI is being served. How do I check if Flyte binary is running?
Thank you @Robert Ambrus! @Albert Wibowo as Robert said, you'd need to:
kubectl get po -n flyte
(where flyte is the namespace where you deployed Flyte) and check the status. It should be
1/1 Running
Otherwise, check logs
kubectl logs <flyte-pod-name> -n flyte
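The two checks above can be sketched as small filters, in case the pod list or the logs are long. This is only a rough sketch: the helper names `not_ready` and `log_errors` are made up for illustration, and the keyword match on logs is a crude assumption about how errors are worded, not a description of Flyte's log format.

```shell
# Hypothetical helper: print pods from `kubectl get po` output (stdin)
# whose READY count is incomplete, e.g. 0/1 instead of 1/1.
# Note: pods that finished normally (Completed) also show 0/1.
not_ready() {
  awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) print $1, $2 }'
}

# Hypothetical helper: keep only log lines that mention an error,
# panic, or fatal condition (case-insensitive keyword match).
log_errors() {
  grep -iE 'error|panic|fatal'
}

# Possible use against a live cluster:
#   kubectl get po -n flyte | not_ready
#   kubectl logs <flyte-pod-name> -n flyte | log_errors
```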
Yup just did that. The result looks okay to me.
ok, what about the logs?
I don't see any error in any of the logs. Hmm this is weird.
So this is the result from running
kubectl describe pods
Is there anything weird, @David Espejo (he/him) @Ketan (kumare3)?
@Albert Wibowo could you
kubectl describe
one of your worker nodes? I've seen this behavior when there's insufficient CPU cores. Also, are you requesting specific resources in your task?
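A minimal sketch of the node check suggested above, assuming the usual layout of `kubectl describe node` output, where an `Allocatable:` heading is followed by indented `cpu:` and `memory:` lines. The helper name `allocatable` is hypothetical.

```shell
# Hypothetical helper: print the cpu and memory lines from the
# Allocatable section of `kubectl describe node` output (stdin).
allocatable() {
  awk '
    /^Allocatable:/ { on = 1; next }   # enter the Allocatable section
    /^[^ ]/         { on = 0 }         # any other top-level heading ends it
    on && ($1 == "cpu:" || $1 == "memory:") { print $1, $2 }
  '
}

# Possible use against a live cluster:
#   kubectl get nodes                              # find the node name
#   kubectl describe node <node-name> | allocatable
```

If the CPU or memory printed here is smaller than what the task requests, the pod can sit in Pending, which from the Flyte side looks like a task that never finishes.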
ah ok that could be it actually. Let me try it.
Hello @David Espejo (he/him), I don't think that's the case. Could it be that the Flyte demo environment only works with Docker Desktop? I am actually using Docker Engine + Colima instead of Docker Desktop due to licensing. I tried the Flyte demo with Docker Desktop and it worked. But when I used the same settings on Colima, e.g. the exact same number of CPU cores, the same workflow did not work 🤔. Do I have to manually set some config or something?