# ask-the-community
t
Hi everyone! Another beginner question here. I'd like to run one task in a workflow with a specific image. I found and followed these docs but ran into an error. I attached the provided example basic_workflow.py, but with the t2 task decorator changed to:
@task(container_image="python:3.7")
When I run it, I get this error:
[f902b0296c1a94ed4ade-n1-0] terminated with exit code (128). Reason [StartError]. Message: 
failed to create containerd task: failed to create shim: OCI runtime create failed: container_linux.go:380: starting container process caused: exec: "pyflyte-fast-execute": executable file not found in $PATH: unknown.
Is it possible to use any arbitrary image as a task there? Or does the image need to follow a specific build process that includes pyflyte-fast-execute? Thank you!
I should also note that I am running this on GCP GKE, and running with the following command:
pyflyte run --image gcr.io/urbn-data-science/flytekit-test-wrapper:latest --remote workflows/basic_workflow_custom.py my_wf --a 10 --b foobar
The gcr.io/urbn-data-science/flytekit-test-wrapper:latest image is a default image I built due to GCP Workload Identity errors discussed here. Not sure if that is conflicting with my desired goal above or not.
k
A custom image also needs flytekit installed.
So you have to create a new Dockerfile, use python:3.7 as your base image, and then install flytekit on it.
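Editor's note: the error above comes from the container runtime failing to find the pyflyte-fast-execute executable in the image. A minimal sketch of a check that could be run inside a candidate image (for example as a build-time smoke test) is shown below. It uses only the Python standard library; the helper name missing_entrypoints is hypothetical, and the list of entrypoints is an assumption based on the error message and on the console scripts flytekit installs.

```python
import shutil

# Executables that a pip-installed flytekit is expected to put on PATH.
# "pyflyte-fast-execute" is the one named in the error above; treat the
# exact list as an assumption and adjust it to your flytekit version.
FLYTE_ENTRYPOINTS = ("pyflyte-fast-execute", "pyflyte-execute")


def missing_entrypoints(names=FLYTE_ENTRYPOINTS):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_entrypoints()
    if missing:
        print("missing from PATH:", ", ".join(missing))
    else:
        print("all Flyte entrypoints found")
```

If the script reports a missing entrypoint, the image most likely does not have flytekit installed in the Python environment that is on PATH.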
t
ah. Thanks! That worked.
@Kevin Su A follow-up question. If I just pip-install flytekit, it's missing other key packages/config as shown in the flytekit Dockerfile:
WORKDIR /root
ENV PYTHONPATH /root

RUN pip install awscli
RUN pip install gsutil

ARG VERSION
ARG DOCKER_IMAGE

# Pod tasks should be exposed in the default image
RUN pip install -U flytekit==$VERSION flytekitplugins-pod==$VERSION

ENV FLYTE_INTERNAL_IMAGE "$DOCKER_IMAGE"
If I wish to use a custom image, do I need to create a new image that runs all of the above as well? Otherwise I was getting GCP permission errors. And if so, does that mean this should be applied to every custom Dockerfile I wish to have?
k
@Tom Szumowski the way most users do it is to build one image for most of their workflows. When you do want to use more than one image, use image_config to auto-substitute.
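Editor's note: "auto-substitute" here refers to image templates of the form {{.image.<name>.fqn}}:{{.image.<name>.version}}, which in flytekit are passed as the container_image argument of @task and filled in from the named images supplied at registration time. The sketch below is a simplified standard-library model of that substitution, written only to show the idea; it is not flytekit's implementation, and the image names, the IMAGE_CONFIG dict, and the helper functions are all hypothetical.

```python
import re

# Hypothetical named images; replace with images you have built and pushed.
IMAGE_CONFIG = {
    "default": "gcr.io/my-project/flyte-default:v1",
    "gpu": "gcr.io/my-project/flyte-gpu:v3",
}

# Matches placeholders such as {{.image.gpu.fqn}} or {{.image.gpu.version}}.
_TEMPLATE = re.compile(r"\{\{\s*\.image\.(\w+)\.(fqn|version)\s*\}\}")


def split_image(ref):
    """Split 'registry/name:tag' into (fqn, version); version is '' if untagged."""
    head, sep, last = ref.rpartition("/")
    name, _, tag = last.partition(":")
    return head + sep + name, tag


def resolve_image(template, image_config):
    """Fill every {{.image.<name>.<field>}} placeholder from image_config."""
    def substitute(match):
        name, field = match.group(1), match.group(2)
        fqn, version = split_image(image_config[name])
        return fqn if field == "fqn" else version

    return _TEMPLATE.sub(substitute, template)


# A task that needs the GPU image would carry this template string.
resolved = resolve_image("{{.image.gpu.fqn}}:{{.image.gpu.version}}", IMAGE_CONFIG)
print(resolved)
```

The point of the pattern is that task code refers to an image by a stable name, so the actual registry path and tag can change per environment or per CI build without editing the workflow file.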
t
@Ketan (kumare3) this is great! Thank you for the resources. Love the documentation and templating to guide best practices around larger scale management 💯.
k
@Tom Szumowski you are welcome and we are sorry if the docs are a little hard to find
please file issues on how we can improve the docs
👍 1
t
Docs have been largely great so far. But will consider that for the future
I think where I got thrown for a loop was that flytekit needs to be packaged in the image. That's largely just due to my misunderstanding of how the pods were being deployed. Once I layered it in, I got a custom GPU image running with reasonable ease. 👍
k
@Tom Szumowski you have to leave Kubeflow and Argo behind - welcome to Flyte
😂 1
t
In our current pipelines (airflow, KFP), we have different images all over. I kind of like the idea of having a common project-wide default image and only customize when absolutely needed. Makes CM easier and consistent. Our pipeline tasks largely need the same packages anyway, the usual: pandas, sklearn, torch, numpy, etc. 🙂
k
yup
and the way we do it is build the images in CI
👍 1
and then iterate on it quickly using pyflyte run / register, what we call fast register
👍 1
e
Hey, new to Flyte, apologies for necroing this thread... Are there any alternatives (new or planned) to this? It seems a little counterproductive to require Python + flytekit on e.g. a ShellTask that doesn't have anything to do with Python. Has anyone tried making a sidecar pattern work? We have a lot of weird tools we package up (bringing in legacy tools from another industry) in containers, and I'm a little worried about this pulling us into dependency hell.
f
@Eli Bixby have you had a look at ContainerTasks? https://docs.flyte.org/projects/cookbook/en/stable/auto/core/containerization/raw_container.html There you don't need any Python or flytekit installed... you just execute a command.
e
Oh neat. Ok it was unclear from the docs that these images didn't need to have flytekit installed!
d
Thanks @Felix Ruess! Container tasks actually work by injecting a sidecar that handles data transfers between the container and the metastore.
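Editor's note: because the sidecar moves the data, the program inside a raw container only has to read its inputs and write its outputs as plain files. The sketch below shows what such a container-side script could look like using only the Python standard library (no flytekit). It assumes the task's command passes the inputs and the output directory as command-line arguments, and that each output is written to a file named after the output variable; the output name area and the function names are hypothetical, loosely following the ellipse-area example in the linked raw container docs.

```python
import math
import sys
from pathlib import Path


def write_outputs(output_dir, **outputs):
    """Write each output value to a file named after the output variable."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, value in outputs.items():
        (out / name).write_text(str(value))


def main(argv):
    # Assumed call shape: script.py <a> <b> <output_dir>
    a, b, output_dir = float(argv[1]), float(argv[2]), argv[3]
    write_outputs(output_dir, area=math.pi * a * b)
    return 0


# Only run as a CLI when the three expected arguments are present.
if __name__ == "__main__" and len(sys.argv) == 4:
    sys.exit(main(sys.argv))
```

The same contract can be met by a shell script or a legacy binary, which is why the image for a container task does not need Python or flytekit.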
e
Sweet yes, that solves my problem. We didn't know that you needed flytekit installed on python containers, and discovered it from this thread, but then assumed the same applied to container tasks.
k
[docs-issue]
[flyte-docs]
n
^^ @Ketan (kumare3) did you create a docs issue for this already?
k
Yes