I have a bunch of workflows that `pyflyte pkgs mypackage pac Flyte #flyte-support

abundant-laptop-64153

10/13/2023, 2:31 AM

I have a bunch of workflows that I package with `pyflyte --pkgs mypackage package` and register with `flytectl register files`, and they work just fine, but I ran into one today that always fails when run, with the error `ModuleNotFoundError: No module named 'mypackage'`. After reading through others' issues with package + register I tried adding the `--fast` flag, and voila, it works. My question is, why? Clearly I have some subtle difference between this and my other workflows, but what is `--fast` doing differently? The CLI help docs mention "Note this needs additional configuration, refer to the docs." for `--fast`, but I'm not sure what docs it's referring to. It seems the difference is that it may be including more of the source code than without fast. Should I just always use fast, even for those packages that clearly work without?

broad-train-34581

10/13/2023, 2:52 AM

My team used to run into this. This might be why: if you have a directory like `/src/pipeline/my_package` and you import `my_package`, you would need `/src/pipeline` on your PYTHONPATH in the Dockerfile (usually it is `/root` only). When we fast-register, we noticed it copies the package to root as `/root/my_package`, which is probably why it suddenly worked. So the way to work with both kinds of registration is to set `ENV PYTHONPATH="/root:/src/pipeline"`.
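The PYTHONPATH point above can be reproduced outside Flyte. The following is a minimal sketch, assuming a throwaway directory tree that mimics `src/pipeline/my_package`; the helper `can_import` is hypothetical and only illustrates that a package is importable once its parent directory is on the module search path.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(module_name: str) -> bool:
    """Return True if module_name can be imported with the current sys.path."""
    importlib.invalidate_caches()
    sys.modules.pop(module_name, None)  # forget any earlier import attempt
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False


# Build a temporary tree: <tmp>/src/pipeline/my_package/__init__.py
tmp_root = Path(tempfile.mkdtemp())
package_parent = tmp_root / "src" / "pipeline"
(package_parent / "my_package").mkdir(parents=True)
(package_parent / "my_package" / "__init__.py").write_text("")

# The parent directory is not on sys.path yet, so the import fails.
before = can_import("my_package")

# Adding the parent directory (which is what PYTHONPATH does at startup) fixes it.
sys.path.insert(0, str(package_parent))
after = can_import("my_package")

print(before, after)
```

In a container the same effect comes either from the PYTHONPATH setting or from the package being copied into a directory that is already on the search path.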

abundant-laptop-64153

10/13/2023, 3:11 AM

This could be down the right path, but our structure is:

<root>/
  mypackage/
    myworkflow.py
    __init__.py

where `pyflyte package` is run within `<root>`. The frustrating part is that this is identical in structure to others that currently work, so some other difference must exist. And as with all these kinds of issues, it does not occur with `pyflyte run` or `pyflyte register`, but I believe those are all fast by default?

freezing-airport-6809

10/13/2023, 1:40 PM

Yes it is all fast by default

freezing-airport-6809

10/13/2023, 1:41 PM

Sadly, if we change the CLI we break people. cc @high-accountant-32689 / @thankful-minister-83577, would like functional tests

abundant-laptop-64153

10/13/2023, 1:51 PM

I've been playing around with it to understand `fast`, and I think it's just an understanding issue. The arguments below are added to the running container when run with fast:

--additional-distribution s3://<bucket>/<project>/<domain>/7BPRMBPU3RS26QKXJIKS74XBBI======/fastc06f194bab8765ca864871b6ab6504ad.tar.gz
--dest-dir /root

This must be taking the source code packaged by fast and adding it to the base image. The reason this workflow didn't work and our others did is that we primarily use `envd`, but there are a few simple cases where we don't need any extra requirements. In those cases, we just use task/workflow decorators without any image and rely on Flyte's base image. Because the development flow (`pyflyte run`) is fast, it just works, since the source is added to the container at runtime. The production examples don't use fast, as there must be an assumption that you are pre-baking all the images with the source code (in our case envd is doing this, but it could just be docker build). So `fast` really means "no pre-built container". In these simple cases, pre-building a container with the source seems unnecessary. I think I'll add a mechanism to our pipeline that conditionally packages with `fast` in these scenarios.
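One way such a conditional step might look is sketched below. This is not an official Flyte mechanism: `build_package_command` and its `source_baked_into_image` argument are hypothetical names, and the only CLI pieces used are the ones quoted in this thread (`pyflyte --pkgs <package> package` and `--fast`).

```python
from typing import List


def build_package_command(pkg: str, source_baked_into_image: bool) -> List[str]:
    """Assemble the pyflyte packaging command for one package.

    source_baked_into_image is a hypothetical pipeline setting: True when a
    pre-built image (for example from envd or docker build) already contains
    the source code, False when tasks rely on the default base image.
    """
    cmd = ["pyflyte", "--pkgs", pkg, "package"]
    if not source_baked_into_image:
        # No image carries the source, so ship it at runtime via fast packaging.
        cmd.append("--fast")
    return cmd


print(build_package_command("mypackage", source_baked_into_image=False))
```

A pipeline could pass the returned list to `subprocess.run(cmd, check=True)`; keeping command construction separate from execution makes the decision easy to test without a Flyte installation.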

abundant-laptop-64153

10/13/2023, 1:52 PM

Is there a reason why it couldn't be dynamic? If there are tasks without an overriding `container_image`, it would build with fast (or at least warn that you are doing something wrong without fast).
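A rough sketch of how such a check could be approximated outside flytekit, assuming tasks are declared with a decorator literally named `task`. The function `tasks_without_image` is hypothetical and purely a static heuristic over source text; it does not reflect how flytekit resolves images internally.

```python
import ast
from typing import List


def tasks_without_image(source: str) -> List[str]:
    """Return names of functions decorated with `task` that pass no container_image.

    Heuristic only: matches decorators spelled `task`, `task(...)`,
    `something.task`, or `something.task(...)`.
    """
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for dec in node.decorator_list:
            call = dec if isinstance(dec, ast.Call) else None
            target = call.func if call else dec
            if isinstance(target, ast.Attribute):
                name = target.attr
            else:
                name = getattr(target, "id", None)
            if name != "task":
                continue
            keywords = {kw.arg for kw in call.keywords} if call else set()
            if "container_image" not in keywords:
                missing.append(node.name)
    return missing


example = '''
from flytekit import task

@task
def plain() -> int:
    return 1

@task(container_image="registry.example/custom:latest")
def custom() -> int:
    return 2
'''

print(tasks_without_image(example))
```

A packaging script could warn, or switch to fast packaging, whenever this list is non-empty for a module that will run on the default image.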

thankful-minister-83577

10/14/2023, 2:18 PM

@abundant-laptop-64153 your understanding is spot on. i’m not sure if the production case is necessarily something we assumed though - i think there’s a fair number of users who rely on the ‘fast’ construct in prod.

thankful-minister-83577

10/14/2023, 2:20 PM

no reason it can’t be, wrt the suggestion - except that more magic might be more confusing for users. certainly feasible.
