#ask-the-community

Samuel Bentley

04/14/2023, 12:43 PM
Hi all I've managed to get a local demo cluster working on my old laptop with workflows running etc, but I've followed the exact same steps on my new one and I'm getting the following error whenever I run a workflow. Note, I've tried the cookbook hello_world example and the greeting workflow, but both give the same error. I think I'm doing something stupid, but I just can't figure it out.
containers with unready status: [f0de017c01c7140d8a4c-n0-0]|Back-off pulling image "cr.flyte.org/flyteorg/flytekit:py3.9-1.5.0"
Any ideas?

Ketan (kumare3)

04/14/2023, 1:11 PM
Interesting, can you try docker pull cr.flyte.org/flyteorg/flytekit:py3.9-1.5.0
It could be a network issue, or a missing image?
As you can see, flytekit was silently upgraded
Cc @Samhita Alla can you please quickly try

Samhita Alla

04/14/2023, 1:17 PM
Works for me.

Ketan (kumare3)

04/14/2023, 1:28 PM
@Samuel Bentley seems like a proxy or something else

Samuel Bentley

04/14/2023, 1:30 PM
So docker pull works fine, but in the container logs I see this
2023-04-14 14:27:17 E0414 13:27:17.174653      73 kuberuntime_image.go:51] "Failed to pull image" err="rpc error: code = Unknown desc = failed to pull and unpack image \"ghcr.io/flyteorg/flytekit:py3.9-latest\": failed to resolve reference \"ghcr.io/flyteorg/flytekit:py3.9-latest\": failed to do request: Head \"https://ghcr.io/v2/flyteorg/flytekit/manifests/py3.9-latest\": x509: certificate signed by unknown authority" image="ghcr.io/flyteorg/flytekit:py3.9-latest"
Note, this error is in the Flyte Sandbox container that gives me the initial error ^^^

jeev

04/14/2023, 4:00 PM
i cant repro in a fresh sandbox. @Samuel Bentley: this probably sounds dumb, but can you restart your docker daemon and fire up a new sandbox? i definitely had weird issues in older docker desktop for mac versions, but havent in awhile.
alternatively, you can also try running a basic pod like:
kubectl run -it --rm --image=debian debian sh
see if that results in a similar issue

Samuel Bentley

04/17/2023, 9:22 AM
Restarting didn't work, but kubectl did. I'm going to raise a ticket with Docker. Just so I can give them all the details: is the sandbox pulling the image via Docker-in-Docker, or is it pulling from within Kubernetes?

jeev

04/17/2023, 12:06 PM
it’s running k3s on containerd within the sandbox container.

Ketan (kumare3)

04/17/2023, 1:13 PM
Cc @Chirayu Gupta looks like a similar problem

Samuel Bentley

04/17/2023, 3:41 PM
Thanks, I've raised a ticket with Docker, let's see what they say
Docker support pointed me to a K3s issue whose fix seemed to work for other people hitting this with K3s. Is there anything I can do to my env to get this fixed? https://github.com/k3s-io/k3s/issues/1148

jeev

04/17/2023, 8:29 PM
it’s not a private repo though.

Samuel Bentley

04/20/2023, 10:32 AM
Docker gave up on helping me. I'm going to raise a bug with K3s. Can you guys help me with this question please (in the context of the sandbox)? Cluster Configuration: <!-- Provide some basic information on the cluster configuration. For example, "3 servers, 2 agents". -->

jeev

04/20/2023, 12:39 PM
@Samuel Bentley i’d love to spend some time digging into this with you. Will DM.
we have root caused this issue. the problem was that zscaler was running on the machine and intercepting requests. we had to add the org-specific zscaler root ca to the sandbox to establish trust with ghcr.io. ill open a PR to enable users to inject additional trusted certs into the sandbox later.
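One way to confirm this kind of interception is to look at who issued the certificate the machine actually receives for ghcr.io. The sketch below is only an illustration: `cert_issuer` and `looks_intercepted` are hypothetical helper names, it assumes `openssl` is installed, and matching on the string "zscaler" is an assumption about how the proxy's CA is named.

```shell
# Sketch: print who issued the certificate this machine is served for a host.
# cert_issuer and looks_intercepted are hypothetical helper names.
cert_issuer() {
  host="$1"
  # s_client fetches the served chain; x509 extracts the leaf's issuer line.
  openssl s_client -connect "$host:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -issuer
}

# Succeeds when the issuer string mentions Zscaler (case-insensitive).
# The match string is an assumption; adjust it to your organisation's proxy CA.
looks_intercepted() {
  printf '%s\n' "$1" | grep -qi 'zscaler'
}

# Example (needs network access):
#   issuer=$(cert_issuer ghcr.io)
#   looks_intercepted "$issuer" && echo "TLS is being re-signed: $issuer"
```

If the issuer is the proxy's CA rather than a public one, any environment that does not trust that CA, such as the k3s/containerd runtime inside the sandbox container, will fail with the `x509: certificate signed by unknown authority` error seen earlier in the thread, even though `docker pull` on the host succeeds.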

Ketan (kumare3)

04/20/2023, 2:25 PM
@Samuel Bentley this is what I was saying: something like a proxy / firewall on your machine

jeev

04/20/2023, 7:01 PM
@Samuel Bentley: would you be open to testing this just to be sure?

Samuel Bentley

04/20/2023, 7:02 PM
Yeah you were right. I wonder if it's down to the Docker version for the new Mac chip. Everything was the same on my old Mac (Intel chip) and it worked fine.
Sure, I’ll try it out in the morning

jeev

04/20/2023, 7:03 PM
cool. if its not merged by then, I'll trigger a separate build that you can test with.
try placing the root CA pem in ~/.flyte/sandbox/ca-certificates/ and run:
flytectl demo start --image=ghcr.io/flyteorg/flyte-sandbox-bundled:sha-dd75b80cda29a0be441806c3372c6cd46a35dbd9
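The two steps above can be sketched as a small helper. This is a minimal illustration, not part of flytectl: `install_sandbox_ca` and the example `.pem` path are hypothetical, and the PEM check only looks for the standard header line rather than fully validating the certificate.

```shell
# Sketch: copy an organisation's root CA into the directory named above.
# install_sandbox_ca and its arguments are hypothetical, not a flytectl feature.
install_sandbox_ca() {
  pem_file="$1"
  # Default to the directory from the thread; the second argument overrides it.
  dest_dir="${2:-$HOME/.flyte/sandbox/ca-certificates}"

  # Refuse files without a PEM header (DER files would need converting first).
  if ! grep -q -- '-----BEGIN CERTIFICATE-----' "$pem_file" 2>/dev/null; then
    echo "not a PEM certificate: $pem_file" >&2
    return 1
  fi

  mkdir -p "$dest_dir" && cp "$pem_file" "$dest_dir/"
}

# Example (the .pem path is a placeholder):
#   install_sandbox_ca ~/Downloads/org-root-ca.pem
#   flytectl demo start --image=ghcr.io/flyteorg/flyte-sandbox-bundled:sha-dd75b80cda29a0be441806c3372c6cd46a35dbd9
```

The certificate has to be in place before `flytectl demo start` runs, so an already running sandbox would need to be started again after copying it.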

Samuel Bentley

04/21/2023, 10:44 AM
Yep, that's working! It did take a long time (3 mins to do the greeting wf) though. Don't know if it's related to the change or not

jeev

04/21/2023, 12:48 PM
Ok good to know it’s working. The image is running a nightly version of Flyte.

Samuel Bentley

04/21/2023, 12:49 PM
Thanks for your help again 🙂

jeev

04/21/2023, 12:49 PM
It might be because of an initial image pull. Is it faster if you run it again?

Samuel Bentley

04/21/2023, 12:50 PM
Yep, just took 14secs instead of 3 mins

jeev

04/21/2023, 12:51 PM
Ok. Was likely just pulling down the image.

Ketan (kumare3)

04/21/2023, 2:12 PM
Soon it will be lower after the next release