import typing import pandas as pd from flyte...
# ask-the-community
r
import typing
 
import pandas as pd 
from flytekit import ImageSpec, Resources, task, workflow

pandas_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.9-latest",
    packages=["pandas", "numpy"],
    python_version="3.9",
    apt_packages=["git"],
    env={"Debug": "True"},
    registry="ghcr.io/flyteorg",
)

sklearn_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.9-latest",
    packages=["scikit-learn"],
    registry="ghcr.io/flyteorg",
)

if sklearn_image_spec.is_container():
    from sklearn.linear_model import LogisticRegression


@task(container_image=pandas_image_spec)
def get_pandas_dataframe() -> typing.Tuple[pd.DataFrame, pd.Series]:
    df = pd.read_csv("https://storage.googleapis.com/download.tensorflow.org/data/heart.csv")
    print(df.head())
    return df[["age", "thalach", "trestbps", "chol", "oldpeak"]], df.pop("target")


@task(container_image=sklearn_image_spec, requests=Resources(cpu="1", mem="1Gi"))
def get_model(max_iter: int, multi_class: str) -> typing.Any:
    return LogisticRegression(max_iter=max_iter, multi_class=multi_class)


# Get a basic model to train.
@task(container_image=sklearn_image_spec, requests=Resources(cpu="1", mem="1Gi"))
def train_model(model: typing.Any, feature: pd.DataFrame, target: pd.Series) -> typing.Any:
    model.fit(feature, target)
    return model


# Lastly, let's define a workflow to capture the dependencies between the tasks.
@workflow()
def wf():
    feature, target = get_pandas_dataframe()
    model = get_model(max_iter=3000, multi_class="auto")
    train_model(model=model, feature=feature, target=target)


if __name__ == "__main__":
    wf()
When I register the workflow:
pyflyte register workflows/image_spec.py
Image ghcr.io/flyteorg/flytekit:3ADTed3jxN2hwtMwkTdzmA.. not found. Building...
Run command: envd build --path /tmp/flyte-mtww9jz_/sandbox/local_flytekit/d5987e0040556b03793dfa6877733305  --platform linux/amd64 --output type=image,name=ghcr.io/flyteorg/flytekit:3ADTed3jxN2hwtMwkTdzmA..,push=true 
v0.10.6: Pulling from moby/buildkit
59bf1c3509f3: Pulling fs layer
Does this mean that the image built from the image spec is not based on ghcr.io/flyteorg/flytekit:py3.9-latest?
s
it is. it's just pulling the buildkit image first.
r
pyflyte register workflows/image_spec.py
...
#20 [internal] pip install pandas numpy
#20 0.807 Requirement already satisfied: pandas in /usr/local/lib/python3.9/site-packages (1.5.3)
#20 0.807 Requirement already satisfied: numpy in /usr/local/lib/python3.9/site-packages (1.25.2)
#20 0.817 Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.9/site-packages (from pandas) (2023.3.post1)
#20 0.818 Requirement already satisfied: python-dateutil>=2.8.1 in /usr/local/lib/python3.9/site-packages (from pandas) (2.8.2)
#20 0.821 Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.9/site-packages (from python-dateutil>=2.8.1->pandas) (1.16.0)
#20 1.883 WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
#20 2.272
#20 2.272 [notice] A new release of pip is available: 23.0.1 -> 23.3.1
#20 2.272 [notice] To update, run: pip install --upgrade pip
#20 DONE 2.4s
#21 exporting to image
#21 exporting layers
#21 exporting layers 15.2s done
#21 exporting manifest sha256:e47006a8cfd5d790a089109259d3ccf3d695e85a0fffde9a61b934bc9c7b16c7 0.0s done
#21 exporting config sha256:0cbb93768cd2fe79458ce0062b1ef75a77dcbf89ab80a75fd404410482eb554f 0.0s done
#21 pushing layers
#21 pushing layers 0.5s done
#21 ERROR: failed to authorize: failed to fetch anonymous token: unexpected status from GET request to https://ghcr.io/token?scope=repository%3Aflyteorg%2Fflytekit%3Apull%2Cpush&service=ghcr.io: 403 Forbidden
------
> exporting to image:
------
Failed with Unknown Exception <class 'Exception'> Reason: failed to run command envd build --path /tmp/flyte-y7ebo1dt/sandbox/local_flytekit/438ade51efebce7fa3bce718b2423956  --platform linux/amd64 --output type=image,name=ghcr.io/flyteorg/flytekit:3ADTed3jxN2hwtMwkTdzmA..,push=true with error b'error: failed to fetch anonymous token: unexpected status from GET request to https://ghcr.io/token?scope=repository%3Aflyteorg%2Fflytekit%3Apull%2Cpush&service=ghcr.io: 403 Forbidden\n'
failed to run command envd build --path /tmp/flyte-y7ebo1dt/sandbox/local_flytekit/438ade51efebce7fa3bce718b2423956  --platform linux/amd64 --output type=image,name=ghcr.io/flyteorg/flytekit:3ADTed3jxN2hwtMwkTdzmA..,push=true with error b'error: failed to fetch anonymous token: unexpected status from GET request to https://ghcr.io/token?scope=repository%3Aflyteorg%2Fflytekit%3Apull%2Cpush&service=ghcr.io: 403 Forbidden\n'
@Samhita Alla
token: unexpected status from GET request to https://ghcr.io/token?scope=repository%3Aflyteorg%2Fflytekit%3Apull%2Cpush&service=ghcr.io: 403 Forbidden
s
could you share your imagespec definition?
r
@Samhita Alla It's the same as above. Is this related to docker login ghcr.io?
s
oh sorry. can you modify registry to your github registry?
r
Does that mean I cannot use registry="ghcr.io/flyteorg"?
s
yes. you cannot use it to push.
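For context on why the registry value matters: the image name in the build log above, ghcr.io/flyteorg/flytekit:3ADTed3jxN2hwtMwkTdzmA.., looks like the registry value followed by an image name and a generated tag, so the registry setting decides where the push goes. Below is a minimal sketch of that composition. Note the assumptions: push_target is a hypothetical helper for illustration and not flytekit's implementation, the image name flytekit is read off the log, and ghcr.io/your-github-username and the tag placeholder are stand-ins for values of your own.

```python
def push_target(registry: str, tag: str, name: str = "flytekit") -> str:
    """Compose the image reference a build would push to (illustrative only)."""
    return f"{registry.rstrip('/')}/{name}:{tag}"


# Registry and tag copied from the build log in this thread; the push to this
# namespace was rejected with 403 Forbidden.
print(push_target("ghcr.io/flyteorg", "3ADTed3jxN2hwtMwkTdzmA.."))

# Placeholder namespace and tag: substitute a registry you can push to.
print(push_target("ghcr.io/your-github-username", "<generated-tag>"))
```

In the ImageSpec itself, the corresponding change would be setting registry to a namespace you have push access to.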
r
Oh, let me try again.
@Samhita Alla I got this when I tried the localhost:30000 registry:
...
=> ERROR exporting to image                                                                                                                                                                                         13.2s
 => => exporting layers                                                                                                                                                                                              13.2s
 => => exporting manifest sha256:72967fb714481b0392254090e1be0395b64ba5e8cfb153aecb0e99a151c347a6                                                                                                                     0.0s
 => => exporting config sha256:5b14d4f75d9eff196c3a8dc2fb8580f32556f67e92d92706af373905f6b354f7                                                                                                                       0.0s
 => => pushing layers                                                                                                                                                                                                 0.0s
------
 > exporting to image:
------
error: failed to do request: Head "http://localhost:30000/v2/flytekit/blobs/sha256:5b14d4f75d9eff196c3a8dc2fb8580f32556f67e92d92706af373905f6b354f7": dial tcp 127.0.0.1:30000: connect: connection refused
Here is the content of the link above:
{
  "errors": [
    {
      "code": "BLOB_UNKNOWN",
      "message": "blob unknown to registry",
      "detail": "sha256:5b14d4f75d9eff196c3a8dc2fb8580f32556f67e92d92706af373905f6b354f7"
    }
  ]
}
Here is my image spec:
pandas_image_spec = ImageSpec(
    base_image="ghcr.io/flyteorg/flytekit:py3.8-1.6.2",
    packages=["pandas", "numpy"],
    python_version="3.9",
    apt_packages=["git"],
    env={"Debug": "True"},
    registry="localhost:30000",
)
I'm testing on the sandbox.
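The failure above is a TCP-level "connection refused" on 127.0.0.1:30000, even though the same URL returned a registry response when opened directly. Here is a minimal standard-library sketch for checking whether a port accepts connections. port_is_open is a hypothetical helper name, and the result only reflects the machine the script runs on, which may not be where the image builder runs.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # 30000 is the sandbox registry port used in this thread.
    print("localhost:30000 reachable:", port_is_open("localhost", 30000))
```

If this reports the port as reachable from the host while the build still fails, the refusal is probably coming from wherever the builder resolves localhost, not from the registry being down.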
s
i'm encountering the same issue. @Kevin Su any idea how to resolve this? should we still follow the steps enclosed in https://github.com/flyteorg/flytesnacks/pull/1001/files PR?
r
@Samhita Alla Does it work for you with the steps above? I have tried, but I still cannot push the image to the registry.
The image builds successfully, but envd cannot push it to the registry.
s
@Ryuu looks like you need to run this command:
envd context create --name flyte-sandbox --builder tcp --builder-address localhost:30003 --use
r
@Samhita Alla Wow, it runs now. I wonder about port 30003 and tcp. Can you explain them for me?
s
i think we're just updating the builder address to port 30003; here's more about the args: https://envd.tensorchord.ai/teams/context.html#the-anatomy-of-a-context
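To make the pieces of that command easier to see, here is a small sketch that assembles it from its parts. The flags and values are the ones from the command in this thread; envd_context_args is a hypothetical helper name. The comments reflect one reading of the flags (a builder reached over TCP at the given address, with --use making the new context the active one), and the envd context documentation linked above is the authority on what each flag does.

```python
import shlex


def envd_context_args(name: str, builder_address: str) -> list[str]:
    """Build the argument list for the `envd context create` call in this thread."""
    return [
        "envd", "context", "create",
        "--name", name,                        # label for the new context
        "--builder", "tcp",                    # reach the builder over TCP
        "--builder-address", builder_address,  # where that builder listens
        "--use",                               # switch to this context
    ]


args = envd_context_args("flyte-sandbox", "localhost:30003")
print(shlex.join(args))
# To actually run it: subprocess.run(args, check=True)
```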
r
So the builder is on a different port from the registry, which is on port 30000. Is it a separate component from the registry? That sounds weird. I will dig deeper into this later.