# ask-the-community
m
Hello! I have a question regarding multi-image workflows. We have task1, which is simply a TensorFlow job, and task2, which is a Spark job. When we run `pyflyte package`, it fails because the local environment does not have all the dependencies (flytekitplugins-spark and tensorflow). I thought that specifying a different `container_image` for each task and importing those modules within the task context would solve this, but apparently it does not. Do we need to install all Python source code dependencies locally when we run the `pyflyte package` command?
k
Yes, you need all dependencies installed locally to compile the workflow. Otherwise, Python will fail to import the modules.
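A quick way to see which dependencies are missing locally before running `pyflyte package` is to ask Python whether it can locate each module. This is only a minimal sketch: the helper name `missing_modules` is hypothetical, and the module names `tensorflow` and `flytekitplugins.spark` are assumed from the packages mentioned in the question.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # ImportError: a parent package of a dotted name is absent.
            # ValueError: the module is loaded but has no usable spec.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed module names for the two tasks discussed in this thread.
    required = ["tensorflow", "flytekitplugins.spark"]
    print("Missing locally:", missing_modules(required))
```

Anything the helper reports would need to be installed in the environment where the workflow is compiled, in line with the answer above.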
By the way, there is a new feature (ImageSpec) in flytekit that makes building images easier: https://flyte--988.org.readthedocs.build/projects/cookbook/en/988/auto/core/containerization/image_sepc_example.html You can also build multi-ImageSpec workflows. Feel free to give it a shot.
m
This looks quite promising, but when will it build the image? Does it support a private Docker image registry and a private pip index? Is it possible to specify package versions?
k
• Before registration. When you run `pyflyte run`, it builds the image first if the image is not found.
• It uses Docker under the hood, so I think it should support a private pip index as well. I haven't tried it, but let me know if you run into any issues.
• Yes, you can specify package versions, like `pandas==1.4.0`.
m
Alright, thank you!
@Kevin Su does this mean that all registrations will be fast registration by default?
k
Yes, `pyflyte register` uses fast registration by default.
m
I've been testing ImageSpec, but when I run the `pyflyte register` command, it builds the image many times. I have set an image_spec on multiple tasks, and I think it's building the image once per task. I see this log multiple times, with the same image name and version. Am I doing something wrong?
Image flytekit:_CB4mqOFLHJIgtJSHe9hyQ.. not found. Building...
Run command: envd build --path /var/folders/l5/b5c4dzc12n3cd4pbyrhkwxd40000gp/T/flyte-3wc4a3dt/sandbox/local_flytekit/833f8fa328235df82b6a50a4958bb4fd
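For illustration, the expectation that one spec shared by several tasks is built only once can be sketched as follows. This is not flytekit's implementation: `SpecSketch`, `spec_tag`, and `build_once` are hypothetical names, and the hashing scheme is an assumption chosen only to show that identical specs map to the same tag.

```python
import base64
import hashlib
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple


@dataclass(frozen=True)
class SpecSketch:
    """Hypothetical stand-in for an image spec; not flytekit's ImageSpec."""
    base_image: Optional[str] = None
    python_version: Optional[str] = None
    packages: Tuple[str, ...] = ()


def spec_tag(spec: SpecSketch) -> str:
    """Derive a deterministic tag from the spec contents (assumed scheme)."""
    payload = repr((spec.base_image, spec.python_version, tuple(sorted(spec.packages))))
    digest = hashlib.sha256(payload.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def build_once(spec: SpecSketch, built: Set[str], build_log: List[str]) -> bool:
    """Record a build only if this tag has not been built yet; return True if it built."""
    tag = spec_tag(spec)
    if tag in built:
        return False
    build_log.append(tag)  # placeholder for the real build step
    built.add(tag)
    return True


if __name__ == "__main__":
    spec = SpecSketch(base_image="tensorflow/tensorflow:2.12.0-gpu", python_version="3.9")
    built: Set[str] = set()
    log: List[str] = []
    # Three tasks sharing the same spec should trigger a single build.
    for _ in range(3):
        build_once(spec, built, log)
    print(len(log))
```

With a scheme like this, repeated "not found. Building..." messages for the same tag would suggest that the existence check, rather than the spec itself, is what differs between tasks.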
k
Which version of flytekit are you using?
Could you share your image spec? It shouldn't rebuild the image.
m
1.6.0b4
and
tensorflow_image_spec=ImageSpec(
    base_image="tensorflow/tensorflow:2.12.0-gpu",
    python_version="3.9"
)
I see that push=true is added only when the registry is specified:
if image_spec.registry:
    command += f" --output type=image,name={image_spec.image_name()},push=true"
I’ll add it and see
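The condition quoted above can be sketched as a standalone function to make the two cases explicit. The function name and its parameters are hypothetical; only the `envd build --path` prefix and the `--output type=image,name=...,push=true` suffix come from the log and snippet in this thread.

```python
from typing import Optional


def envd_build_command(build_path: str, image_name: str, registry: Optional[str]) -> str:
    """Assemble the build command string; the push output is added only with a registry."""
    command = f"envd build --path {build_path}"
    if registry:
        # Mirrors the quoted condition: the push flag appears only when a registry is set.
        command += f" --output type=image,name={image_name},push=true"
    return command


if __name__ == "__main__":
    # Hypothetical path, image name, and registry used purely for illustration.
    print(envd_build_command("/tmp/ctx", "flytekit:tag", None))
    print(envd_build_command("/tmp/ctx", "example.registry/flytekit:tag", "example.registry"))
```

Without a registry the command has no `--output` part at all, which is the behavior the message above is about to test by adding one.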
@Kevin Su the docs page you shared is gone and I can't find it in the latest docs. Could you share it again?