Hello dear Flyte community I am starting with Flyte and coul Flyte #flyte-support

elegant-intern-81155

07/24/2023, 3:17 PM

Hello dear Flyte community! I am starting with Flyte and have been able to experiment a bit with it. There are still some blurry parts for me though. Maybe you will be able to help me 🙂 For me, the steps from development to workflow registration are:
• set up a Python venv with all the necessary dependencies
• create the Python file with the workflow, defining the needed Docker images
• register the workflow, specifying the Docker image if necessary
The problem here is that the work is duplicated between the Docker image (which includes all the dependencies) and the Python venv (which also needs to include all the dependencies). Is there a way to do this work only once? I was thinking of building the Docker image and registering the workflow from inside the Docker container, but I get some errors, even with host networking. Does anyone have a better way to develop workflows? Or maybe you can explain why the venv + container way of developing is the best way to do it, haha 😉 Thank you! 🙂

glamorous-carpet-83516

07/24/2023, 3:28 PM

you could use image spec, a new feature in flytekit, which lets you build a Docker image without a Dockerfile. https://docs.flyte.org/projects/cookbook/en/latest/auto_examples/image_spec/image_spec.html

elegant-intern-81155

07/24/2023, 3:29 PM

But I still need to work from an environment with all the dependencies, right?

glamorous-carpet-83516

07/24/2023, 3:31 PM

yes. I guess you want Flyte to somehow copy the dependencies from the local venv into the container, so you don't need to specify all the dependencies in the Dockerfile or image spec?

elegant-intern-81155

07/24/2023, 3:33 PM

I am just used to workflow engines where you work inside a container and just tell the workflow engine to use that container. In Flyte you kind of have the same environment twice, if I understand correctly how it works

elegant-intern-81155

07/24/2023, 3:34 PM

But I see, I will try this image spec functionality, it might save a bit of time 🙂

elegant-intern-81155

07/24/2023, 4:02 PM

And above all, if you install something from source, in a more complex way than just a pip install or similar, you have to do it both in the Dockerfile and in the development environment you register from, which is a bit time-consuming

thankful-minister-83577

07/24/2023, 4:28 PM

you can register from within the container yes.

thankful-minister-83577

07/24/2023, 4:29 PM

we used to do this at Lyft… but it didn't match our users' workflow too well. image building took too long, and almost everyone had a local version of the dependencies installed.

thankful-minister-83577

07/24/2023, 4:30 PM

and for large workflows with different requirements, users could then split up their workflows so that different tasks ran with different images.

thankful-minister-83577

07/24/2023, 4:30 PM

if you want to do this you just have to figure out the networking. what's the network error?

elegant-intern-81155

07/25/2023, 8:16 AM

Hi @thankful-minister-83577 I'm going to experiment some more with using the Docker container that runs the entire workflow, and I'll tell you if I encounter any difficulties. When you say large workflows can use different images, how would you do it? When registering the workflow, you need to do it in an environment containing all the dependencies. What if one task running with a specific image has a dependency conflict with another task running with a different image?

elegant-intern-81155

07/25/2023, 9:45 AM

When registering the workflow, it seems that pyflyte struggles with symlinks. It says that a file does not exist, but it does exist through a symlink. Have you experienced that? Is there a way to overcome that issue? I might post it outside this thread as it may help others

thankful-minister-83577

07/25/2023, 4:16 PM

i don't think symlink following exists yet. it's something we need to add

thankful-minister-83577

07/25/2023, 4:17 PM

“environment containing all dependencies” yeah exactly. it was just that in our experience users tend to have a giant local env with everything.

thankful-minister-83577

07/25/2023, 4:19 PM

conflict - split the registration flow into two steps/based on two envs? or maybe if you can stand the cognitive dissonance and the registration part works fine, then register with the wrong version for one of the tasks (knowing that at execution time, the correct version will be used for both)

elegant-intern-81155

07/26/2023, 8:54 AM

Thank you for your answers!

elegant-intern-81155

07/26/2023, 8:59 AM

Concerning the use of the Docker image itself to register the workflow, the error I am facing is:

status = StatusCode.UNAVAILABLE

details = "failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:30080: recvmsg:Connection reset by peer"

debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNAVAILABLE: ipv4:127.0.0.1:30080: recvmsg:Connection reset by peer

Starting the Docker container (sharing the host network with the container):

docker run -it --net=host --mount type=bind,source=/home/<user>,target=/home/<user> localhost:30000/<image> bash

And registering the workflow with:

pyflyte --verbose register /home/<user>/Documents/data/Flyte/flytesnacks/venvtensorflow/extractWf/WorkflowExtraction.py --image localhost:30000/<image>:latest

Do you remember how you set up the networking when using the Docker container to register the workflow? Thank you again 🙂
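To narrow down the error above, a small TCP probe run from inside the container can show whether `127.0.0.1:30080` (the address in the error message) is reachable at all. This is a hedged diagnostic sketch using only the Python standard library; the helper name `can_connect` is hypothetical and is not part of flytekit.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Address taken from the error message in the thread.
    print("reachable:", can_connect("127.0.0.1", 30080))
```

If the probe fails, the container cannot reach the port despite `--net=host`. If it succeeds, the reset probably comes from a layer above TCP, for example client settings such as TLS versus insecure mode; that is a guess and not something confirmed in this thread.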
