Pardon me if this sounds like a silly question but I'd like Flyte #flyte-support

freezing-mouse-40688

11/23/2023, 10:11 AM

Pardon me if this sounds like a silly question, but I'd like to understand more about the containerization of tasks. I am curious how dependencies for the virtual environment are handled. For example, say I have 2 tasks: one requires pandas and the other requires matplotlib. If I were manually putting them in 2 containers, I would package only the relevant libraries in each. So does flyte package all the dependencies for every task by reading the imports in the file, or does it somehow magically determine the subset of dependencies each task needs and put only those in the respective container? Also, what about dependencies that can't be handled by pip? What is the base OS layer of these containers? Is there a way to use docker or kubectl commands to inspect our running tasks and understand them better? Thanks 🙏🙂 (if there is an online resource that covers any of the above questions, please feel free to share it instead of typing a long response here)

thankful-minister-83577

11/24/2023, 1:42 AM

you can customize the image that the task runs on.

thankful-minister-83577

11/24/2023, 1:43 AM

the default way initially actually was for people to build images every time. but we quickly realized that people change code a lot more frequently than they change base image requirements.

thankful-minister-83577

11/24/2023, 1:44 AM

so we added the notion of fast register which basically tacks on a layer of your code before running the task.

thankful-minister-83577

11/24/2023, 1:46 AM

and because of that, flytekit itself provides a base image, but you can always customize your own by docker build, and then specifying a --image arg to the register/run command, or by using the new imagespec feature, both of which can be used to compose multi-container-image workflows

freezing-mouse-40688

11/24/2023, 2:57 AM

Thank you @thankful-minister-83577 for your answer. I will have a look at your links. But just to clarify, my question was whether flyte natively makes a different image for each task and, if so, how it handles the dependencies, since 2 tasks may require different libraries (pandas vs matplotlib). Also, can I inspect these images? Can I run them myself in a docker container outside of a flyte workflow, just to experiment and get a feel for them?
