Hi Flyte team, I am trying to use the image stored...
# flyte-support
b
Hi Flyte team, I am trying to use an image stored in our GitLab container registry. I am using the demo cluster started with 'flytectl demo start'. My config-sandbox.yaml points to the right GitLab container registry:
images:
  # Default image for the project
  #default: registry.gitlab.com/xx/xx/ns-devops/ns-uv-flytekit:1.1
  default: ns-uv-flytekit:1.1
I have added an ImageSpec:
image_spec = ImageSpec(
    name="ns-uv-flytekit:1.1",  # default docker image name.
    registry="registry.gitlab.com/xxx/xxx/ns-devops",
)
I ran my test with pyflyte run --remote workflow.py my_flow and it failed. I also ran the test with the --image argument on the pyflyte command line:
pyflyte run --overwrite-cache --remote --image registry.gitlab.com/xxx/xxxx/ns-devops/ns-uv-flytekit:1.1 imagespectest.py hello_world_workflow
I got the following error when I used --image:
Running Execution on Remote.
Failed to check if the image exists with error:
 400 Client Error for http+docker://localhost/v1.51/distribution/registry.gitlab.com/xxx/xxx/ns-devops/ns-uv-flytekit:1.1:oFKBBBk_A__Q_dLWRnU4_A/json: Bad Request ("invalid reference format")
Flytekit assumes the image registry.gitlab.com/qvyon/gemini/ns-devops/ns-uv-flytekit:1.1:oFKBBBk_A__Q_dLWRnU4_A already exists.
When I ran without the --image argument, with the ImageSpec passed to the task through container_image, I got:
Running Execution on Remote.
Failed to check if the image exists with error:
 400 Client Error for http+docker://localhost/v1.51/distribution/registry.gitlab.com/xxx/xxx/ns-devops/ns-uv-flytekit:1.1:oFKBBBk_A__Q_dLWRnU4_A/json: Bad Request ("invalid reference format")
Flytekit assumes the image registry.gitlab.com/xxx/xxx/ns-devops/ns-uv-flytekit:1.1:oFKBBBk_A__Q_dLWRnU4_A already exists.
In the console I see the following error for the task:
containers with unready status: [aksmw88sq9ktbw6ztx6f-n0-0]|Failed to apply default image tag "registry.gitlab.com/xx/xx/ns-devops/ns-uv-flytekit:1.1:oFKBBBk_A__Q_dLWRnU4_A": couldn't parse image name "registry.gitlab.com/xx/xx/ns-devops/ns-uv-flytekit:1.1:oFKBBBk_A__Q_dLWRnU4_A": invalid reference format
Note that I have already logged in to Docker with docker login registry.gitlab.com. Can you please help me resolve the issue? We are evaluating Flyte currently, and this would definitely help us move forward. I don't know how I can
a
I've seen this error when the local Docker daemon is not running
b
Thanks, David, for your response. I made sure the daemon is running: I restarted it and checked its status and logs as well. Any other suggestions? I am still getting the error.
e
I think the error in the console is due to the invalid image name. Could you try setting the ImageSpec name without a colon (:)?
image_spec = ImageSpec(
    name="ns-uv-flytekit",  # without a colon
    registry="registry.gitlab.com/xxx/xxx/ns-devops",
)
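A minimal sketch of why the colon matters, assuming (as the error messages above suggest) that the final reference is assembled as registry/name:tag with a generated tag. The helper names and the simplified regular expressions are illustrative only and are not part of flytekit:

```python
import re

# Simplified rules loosely based on the Docker/OCI reference grammar (assumption:
# good enough for illustration, not a full validator).
NAME_RE = re.compile(r"^[a-z0-9]+(?:[._-]+[a-z0-9]+)*$")   # no colons allowed
TAG_RE = re.compile(r"^[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}$")


def build_reference(registry: str, name: str, tag: str) -> str:
    """Join the parts the way the failing references in this thread look."""
    return f"{registry}/{name}:{tag}"


def is_valid_name_and_tag(name: str, tag: str) -> bool:
    """Return True when both the image name and the tag match the simplified rules."""
    return bool(NAME_RE.match(name)) and bool(TAG_RE.match(tag))


registry = "registry.gitlab.com/xxx/xxx/ns-devops"
generated_tag = "oFKBBBk_A__Q_dLWRnU4_A"

# A name that already contains ":1.1" ends up with two tag separators.
print(build_reference(registry, "ns-uv-flytekit:1.1", generated_tag))
print(is_valid_name_and_tag("ns-uv-flytekit:1.1", generated_tag))  # False
print(is_valid_name_and_tag("ns-uv-flytekit", generated_tag))      # True
```

With the colon removed from the name, the reference has a single tag separator and parses cleanly.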
b
Thank you, Nary. How can I pass the tag/version then?
e
b
Hello Nary. I removed the colon:
image_spec = ImageSpec(
    name="ns-uv-flytekit",  # default docker image name.
    registry="registry.gitlab.com/xxx/xxx/ns-devops",
)
I am still getting the following error in the VS Code terminal:
Running Execution on Remote.
Failed to check if the image exists with error:
 Error while fetching server API version: ('Connection aborted.', PermissionError(13, 'Permission denied'))
Flytekit assumes the image registry.gitlab.com/xxx/xxx/ns-devops/ns-uv-flytekit:_fmlLE12TEtlftipywMtIA already exists.
From the same terminal, I ran sudo docker pull registry.gitlab.com/xxx/xxx/ns-devops/ns-uv-flytekit and downloaded the image successfully. Am I missing something for permissions or authentication in the config-sandbox.yaml file or the ImageSpec?
e
Could you try adding registry_config, pointing to the path of the Docker config? The default path should be ~/.docker/config.json:
image_spec = ImageSpec(
    name="ns-uv-flytekit",
    registry="registry.gitlab.com/xxx/xxx/ns-devops",
    registry_config="~/.docker/config.json",
)
b
Hi, ImageSpec registry_config did not work. Keep in mind that I am running the demo server (flytectl demo start). Will image secrets apply to this scenario as well? Creating a secret is for Kubernetes.
e
Yes, image secrets should work for the demo cluster, since the Flyte task runs in a k8s pod.
I just tried setting the image pull secrets and it works. Simply follow this for adding imagePullSecrets. Note that the secret should be added to the service account in the namespace flytesnacks-development if you are using the demo cluster.
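As a rough sketch of what such an image pull secret contains, the following builds a Secret manifest of type kubernetes.io/dockerconfigjson as a Python dict. The secret name, username, and password are hypothetical placeholders, and the helper functions are illustrative rather than part of any Flyte or Kubernetes client library:

```python
import base64
import json


def docker_config_json(registry: str, username: str, password: str) -> str:
    """Build the JSON document Kubernetes expects under `.dockerconfigjson`."""
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    config = {"auths": {registry: {"username": username, "password": password, "auth": auth}}}
    return json.dumps(config)


def image_pull_secret_manifest(name: str, namespace: str, registry: str,
                               username: str, password: str) -> dict:
    """Return a Secret manifest; values under `data` must be base64 encoded."""
    payload = docker_config_json(registry, username, password)
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        "data": {".dockerconfigjson": base64.b64encode(payload.encode()).decode()},
    }


manifest = image_pull_secret_manifest(
    name="gitlab-registry",               # hypothetical secret name
    namespace="flytesnacks-development",  # namespace mentioned above for the demo cluster
    registry="registry.gitlab.com",
    username="deploy-user",               # placeholder credentials
    password="deploy-token",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be applied with kubectl, after which the service account in that namespace needs an imagePullSecrets entry that references the secret by name.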
b
Hi Nary, thank you for your help. .docker/config.json worked for authentication, but the task failed because the image push took too long.
[1/1] currentAttempt done. Last Error: USER::Grace period [3m0s] exceeded|containers with unready status: [a7wldslj42wjx4wkrb5g-n0-0]|Back-off pulling image "registry.gitlab.com/xxx/xxx/ns-devops/ns-uv-flytekit:_fmlLE12TEtlftipywMtIA"
1. Are there time limits we can configure so that the task won't bail out because the image pull or push is taking more time?
2. We already have the image ns-uv-flytekit:latest in our GitLab Container Registry, so why did Flyte build an image and push it to our registry? My understanding is that Flyte will use the image I provided in ImageSpec. Also, how can we provide a tag such as ns-uv-flytekit:latest or ns-uv-flytekit:1.1 in ImageSpec, where latest or 1.1 are tags/versions?
OK, I found the answer for the tag or version, and the authentication issues are gone:
image_spec_fixed_version = ImageSpec(
    name="my-fixed-version-image",
    packages=["scikit-learn"],
    tag_format="v1.0.0",  # Or any specific version string
)
I would still like to know the configuration for time limits (point 1 above) and how I can make sure that the tasks run successfully. I even tried adding resource limits to the task decorator and still have the same issues.
a
For #1 you can adjust image-pull-backoff-grace-period to something higher than the default 3m: https://www.union.ai/docs/v1/flyte/deployment/configuration-reference/flytepropeller-config/#k8s-configk8spluginconfig
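A minimal sketch of what such an override might look like, written as a Python dict and printed as JSON (which is also valid YAML). It assumes the setting lives under plugins.k8s in the propeller configuration, as the linked K8sPluginConfig section suggests, and that a Helm-based single-binary install merges extra settings from configuration.inline; both placements should be checked against your deployment. The function names are hypothetical:

```python
import json


def grace_period_override(duration: str = "5m") -> dict:
    """Config fragment raising the image pull back-off grace period.

    Assumption: the key sits under `plugins.k8s`, per the K8sPluginConfig reference.
    """
    return {"plugins": {"k8s": {"image-pull-backoff-grace-period": duration}}}


def as_helm_inline_values(override: dict) -> dict:
    """Wrap the fragment for a chart that merges `configuration.inline` (assumption)."""
    return {"configuration": {"inline": override}}


values = as_helm_inline_values(grace_period_override("5m"))
print(json.dumps(values, indent=2))
```

Durations use Go-style strings, matching the 3m0s shown in the error message above.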
e
For 2, I think it's because you set packages, which installs the extra package into the image and therefore changes the image. When the image changes, Flyte tries to rebuild it and push the new version to the registry you specified.
Note that Flyte needs to push the updated image to the registry so that the pod can pull it when running pyflyte run --remote.
A possible workaround is to use the local registry (localhost:30000 if you are using the demo cluster) and provide the image you want to pull from in base_image. In this case, after the new image with scikit-learn installed is built, it is pushed to localhost:30000 and can be reached by the pod.
image_spec = ImageSpec(
    base_image="ghcr.io/machichima/flytekit-private-image:toozpo4nnklfydlkhs8seg",  # remote image you want to use
    name="flytekit-private-image",
    registry="localhost:30000",  # use local registry instead of remote one
    packages=["scikit-learn"],
)
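To illustrate why adding packages leads to a rebuild and a new tag, here is a toy content-hash tag. It is only a sketch of the general idea (hash the spec, use the hash as the tag) and is not flytekit's actual algorithm; the function name and the spec fields are hypothetical:

```python
import base64
import hashlib
import json


def spec_tag(spec: dict) -> str:
    """Derive a short, deterministic tag from the contents of an image spec."""
    canonical = json.dumps(spec, sort_keys=True).encode()
    digest = hashlib.sha256(canonical).digest()[:16]
    tag = base64.urlsafe_b64encode(digest).decode().rstrip("=")
    # Image tags may not start with "-", so swap it for "_" (a choice made for this sketch).
    return tag.replace("-", "_")


base = {"name": "flytekit-private-image", "registry": "localhost:30000", "packages": []}
with_sklearn = {**base, "packages": ["scikit-learn"]}

print(spec_tag(base))
print(spec_tag(with_sklearn))
print(spec_tag(base) == spec_tag(with_sklearn))  # False: a changed spec gets a new tag
```

Because the tag depends on the spec, an unchanged spec resolves to an image that already exists in the registry, while any change points at a tag that has to be built and pushed first.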
b
Hi Nary, I was successful with the following:
1. Create the image secret
2. Attach it to the service account
3. Tag and push the image to localhost:30000
4. Register the workflow
When I ran the flow, I got the following error:
[3/3] currentAttempt done. Last Error: SYSTEM::Pod failed. No message received from kubernetes.
[a7ddq9snckq4zl4g8lx7-n0-3] terminated with exit code (128). Reason [StartError]. Message: 
failed to create containerd task: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "pyflyte-fast-execute": executable file not found in $PATH: unknown.
In this scenario I am using the demo cluster. I even removed my code and am just running tasks with print statements. Please advise on what I am doing wrong. Here is my script:
from flytekit import ImageSpec, task, workflow

# ghcr.io/flyteorg/flytekit:py3.11-1.10.2

image_spec = ImageSpec(
    name="ns-uv-flytekit-slim",  # default docker image name.
    registry="localhost:30000",
    tag_format="3.11",
)


@task(cache=True, cache_version="1.0", container_image=image_spec)
def clone_imagerepo() -> str:
    return "clone_imagerepo"


@task
def sayhello(greet: str) -> str:
    """
    A simple task that returns a greeting.
    """
    return f"{greet}"


@workflow
def hello_world_workflow() -> str:
    clone_image = clone_imagerepo()
    # Keyword arguments are the safest way to call tasks inside a workflow.
    greeting = sayhello(greet=clone_image)
    return "workflow completed"
and flyte is running properly:
I know what the problem is: 'pyflyte-fast-execute' was missing because my image didn't have it. I updated the Dockerfile, built a new image, pushed it to the local registry, and was able to start the flow. I do have another issue, but I believe it is unrelated to Flyte.
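A quick way to catch this before registering a workflow is to check, inside the image, that the flytekit entrypoints are on PATH. This is a small sketch using only the standard library; the function name is hypothetical, and the entrypoint list assumes the commands that installing flytekit normally provides:

```python
import shutil

# Commands Flyte invokes in the task container (installed along with flytekit).
REQUIRED_ENTRYPOINTS = ["pyflyte-execute", "pyflyte-fast-execute"]


def missing_entrypoints(names=REQUIRED_ENTRYPOINTS):
    """Return the commands from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


missing = missing_entrypoints()
if missing:
    print("Missing from PATH:", ", ".join(missing))
else:
    print("All flytekit entrypoints found.")
```

Running this script with the image's own Python interpreter reports the missing commands, which usually means flytekit is not installed in that image.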
| I think the closest is this guide we used for on prem but it's extensible for "generic" K8s distribution | https://github.com/davidmirror-ops/flyte-the-hard-way/tree/main/docs%2Fon-premises%2Fsingle-node
When you install the Flyte single binary, what will the local repository URL be? localhost:30000 doesn't seem to work.
e
I am not familiar with this part, but I think the tutorial does not deploy the local registry. cc @average-finland-92144, do you know if there are any docs for setting up a local registry when installing the Flyte single binary?
b
How do I adjust image-pull-backoff-grace-period to, e.g., 5 minutes? Note that I have set up a single-binary cluster.
e
If you want to deploy a docker-registry alongside the single binary, you can probably try setting the following in the values: https://github.com/flyteorg/flyte/blob/640ad57cd173f868351badfdab483a9bbf48973f/charts/flyte-sandbox/values.yaml#L2-L10