#ask-the-community

Albert Wibowo

06/12/2023, 9:07 AM
Hello, My flyte task seems to run forever. How do I troubleshoot this? Thanks!

Robert Ambrus

06/12/2023, 12:55 PM
Hi, I had a similar issue. Running kubectl describe pods told me the reason. I had this issue:
Events:
  Type     Reason   Age                  From     Message
  ----     ------   ----                 ----     -------
  Warning  Failed   15m (x14 over 63m)   kubelet  (combined from similar events): Failed to pull image "ghcr.io/flyteorg/flytekit:py3.9-latest": rpc error: code = Unknown desc = failed to pull and unpack image "ghcr.io/flyteorg/flytekit:py3.9-latest": failed to copy: httpReadSeeker: failed open: failed to do request: Get "https://ghcr.io/v2/flyteorg/flytekit/blobs/sha256:bc2f41c411c6383adc7eda6155671cc80599548541558279f5bec4500428f413": EOF
  Normal   BackOff  10m (x294 over 80m)  kubelet  Back-off pulling image "ghcr.io/flyteorg/flytekit:py3.9-latest"
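
Editor's sketch: when the Events table from kubectl describe pods is long, it can help to keep only the Warning rows, like the image pull failure above. The helper name warning_events is hypothetical, and the filter assumes the event type is the first column of each row, as in the output shown here.

```shell
# Hypothetical helper: keep only rows whose first column is "Warning".
# Assumes input shaped like the Events table printed by `kubectl describe pods`.
warning_events() {
  awk '$1 == "Warning"'
}

# Usage (needs access to the cluster; <namespace> is a placeholder):
#   kubectl describe pods -n <namespace> | warning_events
```

Normal rows and the table header are dropped, so a repeated "Failed to pull image" warning stands out immediately.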

Ketan (kumare3)

06/12/2023, 1:23 PM
Hmm, failed to pull. Can you try pulling the image yourself?

Robert Ambrus

06/12/2023, 1:23 PM
I actually did that
and it resolved the issue
docker pull ghcr.io/flyteorg/flytekit:py3.9-latest
I'm not sure you are facing the same issue, but worth checking

Albert Wibowo

06/12/2023, 1:38 PM
I think I have a different reason. It looks like I managed to pull the image successfully.

Ketan (kumare3)

06/12/2023, 1:44 PM
@Albert Wibowo can you tell me more? Something seems wrong. Is the Flyte binary running? I guess your UI is being served. It seems like the engine was disabled for some reason. We might have to look at the logs of the Flyte binary.

Albert Wibowo

06/12/2023, 1:45 PM
Yes the UI is being served. How do I check if Flyte binary is running?

David Espejo (he/him)

06/12/2023, 1:47 PM
Thank you @Robert Ambrus! @Albert Wibowo as Robert said, you'd need to run:
kubectl get po -n flyte
(assuming flyte is the namespace where you deployed Flyte) and check the status; it should be 1/1 Running. Otherwise, check the logs:
kubectl logs <flyte-pod-name> -n flyte
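
Editor's sketch: the status check described above can be narrowed to the pods that need attention. The helper name not_ready_pods is hypothetical; it assumes the default kubectl get po column layout (NAME, READY, STATUS, ...) and treats anything other than a fully ready Running pod as suspect, which also flags pods that finished normally with status Completed.

```shell
# Hypothetical helper: list pods that are not Running or not fully ready.
# Assumes the default `kubectl get po` columns: NAME READY STATUS RESTARTS AGE.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")                       # READY looks like "1/1"
    if ($3 != "Running" || ready[1] != ready[2])
      print $1, $2, $3                          # name, ready count, status
  }'
}

# Usage (needs access to the cluster):
#   kubectl get po -n flyte | not_ready_pods
```

An empty result means every pod in the namespace reports Running with all containers ready; any pod it prints is a candidate for kubectl logs or kubectl describe.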

Albert Wibowo

06/12/2023, 1:49 PM
Yup just did that. The result looks okay to me.

David Espejo (he/him)

06/12/2023, 2:33 PM
ok, what about the logs?

Albert Wibowo

06/12/2023, 2:49 PM
I don't see any error in any of the logs. Hmm this is weird.
So this is the result from running kubectl describe pods. Is there anything weird @David Espejo (he/him) @Ketan (kumare3)?

David Espejo (he/him)

06/13/2023, 11:49 AM
@Albert Wibowo could you run kubectl describe on one of your worker nodes? I've seen this behavior when there are insufficient CPU cores. Also, are you requesting specific resources in your task?
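
Editor's sketch: to check for the CPU shortage mentioned above, the part of kubectl describe node that matters is the resource summary. The helper name allocated_resources is hypothetical, and it assumes the output contains an "Allocated resources:" section followed by an "Events:" section, which may differ across kubectl versions.

```shell
# Hypothetical helper: print only the "Allocated resources" section.
# Assumes that section runs from "Allocated resources:" up to "Events:".
allocated_resources() {
  awk '/^Allocated resources:/ { show = 1 }
       /^Events:/              { show = 0 }
       show'
}

# Usage (needs access to the cluster; <node-name> is a placeholder,
# `kubectl get nodes` lists the available names):
#   kubectl describe node <node-name> | allocated_resources
```

If the CPU requests in that section are already close to 100 percent, a task pod that requests more CPU may stay Pending, which looks like a task that runs forever.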

Albert Wibowo

06/13/2023, 1:28 PM
ah ok that could be it actually. Let me try it.
Hello @David Espejo (he/him), I don't think that's the case. Could it be that the Flyte demo environment only works with Docker Desktop? I am actually using Docker Engine + Colima instead of Docker Desktop due to licensing. I tried the Flyte demo with Docker Desktop and it worked. But when I used the same settings on Colima, e.g. the exact same number of CPU cores, the same workflow did not work 🤔. Do I have to manually set some config?