This seems like a bug with `pyflyte package` - `py...
# ask-the-community
s
This seems like a bug with `pyflyte package` - `pyflyte run` seems to work fine when I add the project root path to `sys.path`.
s
I don't think you need to modify `sys.path`. Can you add an `__init__.py` file to the `wf` directory and run your `pyflyte package` command? Let me know if you're seeing any error. Your PYTHONPATH has to be the root directory where you have tasks and workflows.
s
Yes, there’s an `__init__.py`, but I’m not able to run `pyflyte package` where wf and tasks are. As you can see in the project structure, I had to run it two levels up from where the wf code is because I couldn’t get it to run otherwise. What would the `--pkgs` value be if I run it in the same dir as the wf code? I tried using `.` for the current dir, but that doesn’t work.
s
Yeah, it has to be one level up. What's the error you're seeing?
s
OK so I tried running it where the wf file is and got it to package/register, but execution is failing. Here’s my pyflyte command and directory structure:
pyflyte --pkgs wf,main,src package --source project/wf_22_142
/project
  /wf_22_142
    /src
      helpers.py
    main.py # where I do `from src.helpers import some_fn`
    wf.py # where @workflow is where I do `from main import task_fn`
`pyflyte` is executed from the parent of the `/project` dir.
Execution fails with an error:
ModuleNotFoundError: No module named 'src'
This is strange since I thought the `--source` option should set the given dir as the project root for imports.
It’s also strange because that seems to be the case for `wf.py`, where I do `from main import task_fn` and it seems to work without an issue. Only `main.py` importing from `src` fails.
Any ideas?
Still stuck on this - tried various file structures, but importing doesn’t work as expected
Locally this works so I don’t think it’s python import issue - flyte must be doing something underneath that’s breaking imports from subdirectories - unless I’m doing something wrong with pyflyte
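One likely reason the bare imports work locally: when a file is run directly with `python some_file.py`, Python puts that file's directory at the front of `sys.path`, so sibling modules and subpackages resolve without a prefix. Below is a minimal sketch of that behavior in a throwaway directory whose file names mirror the layout above. The helper `run_entry_script` and the directory name `wf_dir` are made up for illustration, and the sketch says nothing about what pyflyte does internally.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_entry_script() -> str:
    """Build wf_dir/{main.py, src/helpers.py} and run main.py as a plain script."""
    with tempfile.TemporaryDirectory() as tmp:
        wf_dir = Path(tmp) / "wf_dir"  # hypothetical stand-in for the wf folder
        (wf_dir / "src").mkdir(parents=True)
        (wf_dir / "src" / "__init__.py").write_text("")
        (wf_dir / "src" / "helpers.py").write_text(
            "def some_fn():\n    return 'hello from helpers'\n"
        )
        (wf_dir / "main.py").write_text(
            "from src.helpers import some_fn\n"
            "print(some_fn())\n"
        )
        # Run from a different working directory: it is the script's own
        # directory, not the cwd, that makes `src` importable here.
        result = subprocess.run(
            [sys.executable, str(wf_dir / "main.py")],
            cwd=tmp,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()


if __name__ == "__main__":
    print(run_entry_script())
```

If the same `main.py` is instead imported as part of a package rooted one or more levels up, its directory is no longer on `sys.path` automatically, which would be consistent with the `No module named 'src'` error reported here.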
I tried many combinations and structures, but can’t seem to make it work - can you provide me with a working example all the way to a successful execution where `wf.py` imports a task from `task.py`, which in turn imports some functions from a subdirectory `src/helpers.py` (e.g. in `task.py`, `from src.helpers import some_fn`)? @Samhita Alla
I’ve spent two days on this and still can’t get it to work
I can’t seem to find any examples in the docs either
s
@seunggs, sorry that you've been trying to fix this for two days. I got it working and here's the directory structure, code and commands I ran.
project
├── Dockerfile
└── wf
    ├── __init__.py
    ├── main.py
    ├── src
    │   └── helpers.py
    └── wf.py
helpers.py
def sum(a, b):
    return a + b
main.py
from .src.helpers import sum
from flytekit import task


@task
def main_task(a: int, b: int) -> int:
    return sum(a, b)
wf.py
from flytekit import workflow
from .main import main_task


@workflow
def wf(a: int = 10, b: int = 9) -> int:
    return main_task(a=a, b=b)
Dockerfile
FROM ghcr.io/flyteorg/flytekit:py3.9-latest

# Copy the actual code
COPY wf /root/project/wf
Ran these two commands in `project`'s parent directory:
pyflyte --pkgs project package --image ghcr.io/samhita-alla/flyte-dir-structure:0.0.1 -f
flytectl register files --project flytesnacks --domain development --archive flyte-package.tgz --version v1
s
Hi @Samhita Alla thanks a lot for taking the time to create this example - much appreciated! A quick question though - we use the flyte packaging process as part of a CI/CD pipeline for our clients and typically our clients create projects with wf.py/main.py at the top of their project dir. This causes issues with relative imports (`ImportError: attempted relative import with no known parent package`) - is there any way to not use relative imports? i.e. `from main import main_task` and `from src.helpers import sum` rather than `from .main…` and `from .src…`?
We’d like to maintain as much parity as possible from client code in dev as production deployment process and not have to enforce a specific dir structure for client’s project code - if at all possible
Is there any way to make the `/project/wf` dir the package root for Python imports, for example?
Thanks again for your help with this!
(Also, I just tried the above code, but it’s still not working:
Loading packages ['project'] under source root /source
Failed with Unknown Exception <class 'ModuleNotFoundError'> Reason: No module named 'src'
No module named 'src'
where `/source` is your `/root`)
s
Can you share your Dockerfile?
s
Sure (note that the `sidetrek/base-flyte` image this Dockerfile is based on just has flytectl installed and a flyte-config file):
# Build stage
FROM sidetrek/base-flyte:latest as build

WORKDIR /root
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root

# Make sure to use venv
ENV PATH="$VENV/bin:$PATH"

COPY ./project/requirements.txt /root
# Add --no-cache-dir to prevent OOMKilled
RUN pip install --no-cache-dir -r /root/requirements.txt

# Production stage
FROM sidetrek/base-flyte:latest

WORKDIR /root
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root

# Add flytectl to PATH
ENV PATH="/bin/flytectl:$PATH"

# Make sure to use venv
ENV PATH="$VENV/bin:$PATH"

# Copy dependencies from build stage
COPY --from=build /opt/venv /opt/venv

# Copy the actual code (again, user's project code)
COPY . /root

# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag
Please keep in mind that this is running inside a CI/CD pipeline (Tekton pipelines to be more specific)
s
@Eduardo Apolinario (eapolinario), is it possible to package tasks and workflows without using relative imports, when the code is spread across multiple modules?
Could you send me your directory structure?
e
@Samhita Alla, sorry, relative imports should work. Also curious about the dir structure. @seunggs, do you have an `__init__.py` file in `src`?
s
Yes, there’s an `__init__.py` in `/src`.
Also, I’d much prefer to not use relative imports if it’s at all possible - I think it’s strongly preferred to have parity between dev and production, and the user shouldn’t have to adhere to a specific project dir structure to accommodate flyte deployment (or be coerced to use relative imports when, locally, they can run the code without them).
To make it clearer - we have user code like this:
/src
  __init__.py
  helpers.py
main.py # `from src.helpers import some_fn`
wf_x.py # `from main import task1`
In the CI/CD process, we copy this code into `/project/wf` to accommodate flyte packaging/registration:
/project
  /wf
    /src
      __init__.py
      helpers.py
    main.py
    wf_x.py
  __init__.py
  Dockerfile
  requirements.txt
And we’re currently running `pyflyte --pkgs wf_x package --source project/wf` from the parent dir of `/project`, which works fine, except that during execution we get a no module named `src` error.
@Eduardo Apolinario (eapolinario)
s
You're packaging `wf_x` but you need to access `src` while executing your code. Try packaging `wf`.
s
But if you set the `--source` to `project/wf`, this errors out saying there’s no module named `wf` (probably because you’re already in that dir).
Also tried what you suggested by changing the `--source` to `project`, and I get a `No module named 'src'` error.
s
Can you leave out `--source` and run this command in `project`'s parent directory?
pyflyte --pkgs project package --image <your-image> -f
The `__init__.py` file needs to be present in `wf`, not `project`. Ensure the Dockerfile is copying the code in `wf` to `/root/project/wf`.
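Since missing `__init__.py` files come up repeatedly in this thread, a quick check can rule them out before packaging. The sketch below walks a project folder and lists directories that contain `.py` files but no `__init__.py`. The function names `dirs_missing_init` and `build_demo_tree` are made up for illustration, and the demo tree deliberately omits `src/__init__.py`. Plain Python can still import such folders as namespace packages, so this only checks the layout advice given above.

```python
import tempfile
from pathlib import Path
from typing import List


def dirs_missing_init(root: Path) -> List[Path]:
    """List directories at or below `root` that hold .py files but no __init__.py."""
    missing = []
    candidates = [root, *sorted(p for p in root.rglob("*") if p.is_dir())]
    for directory in candidates:
        has_py = any(f.is_file() and f.suffix == ".py" for f in directory.iterdir())
        if has_py and not (directory / "__init__.py").exists():
            # Report paths relative to the parent of `root`, e.g. project/wf/src.
            missing.append(directory.relative_to(root.parent))
    return missing


def build_demo_tree(base: Path) -> Path:
    """Create a hypothetical project/wf layout in which src/ lacks __init__.py."""
    wf = base / "project" / "wf"
    (wf / "src").mkdir(parents=True)
    (wf / "__init__.py").write_text("")
    (wf / "main.py").write_text("")
    (wf / "wf.py").write_text("")
    (wf / "src" / "helpers.py").write_text("")
    return base / "project"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        project = build_demo_tree(Path(tmp))
        for path in dirs_missing_init(project):
            print("missing __init__.py in:", path.as_posix())
```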
s
@Samhita Alla Just tried it and got the same error:
Loading packages ['project'] under source root /workspace/source
No module named 'src'
Failed with Unknown Exception <class 'ModuleNotFoundError'> Reason: No module named 'src'
`/workspace/source` is the root folder inside CI/CD where `/project` is. Confirmed that the structure is this inside the CI/CD:
/project
  /wf
    /src
      __init__.py
      helpers.py
    __init__.py
    main.py
    wf.py
And ran `pyflyte --pkgs project package …` without `--source` from the parent of `/project`. Using relative import - i.e. in `main.py`, `from .src.helpers import some_fn`.
Wonder why relative imports don’t work for me
Seems like exact same setup you sent over yesterday but erroring out
s
It should work! Also, if you're using a relative import, the error should say no module named ".src". Not sure why you're seeing "src".
s
Hmm that’s an interesting point - so it’s not packaging the /src folder for some reason. But I’m setting `--pkgs` to `project` - shouldn’t that package all the subdirectories and files?
Any suggestions for how to debug this? I’d really like to figure this out this week if possible
I’ve also specified `src` directly as a package via the `--pkgs` option, but that doesn’t seem to work either. It only seems to package `.py` files?
Is it necessary to specify files rather than directories for `--pkgs`?
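One generic way to debug an import failure like this, independent of pyflyte, is to ask Python where it would load a top-level module from and which `sys.path` entries it searches. The sketch below uses the standard-library `importlib.util.find_spec`. The helper name `where_is` is made up, and the module names in the loop are simply the ones discussed in this thread. Running it in the same environment and working directory as the failing command (for example inside the container) is an assumption about how it would be most useful.

```python
import importlib.util
import sys
from typing import Optional


def where_is(module_name: str) -> Optional[str]:
    """Return the file a top-level module would load from, or None.

    None means the module cannot be found on sys.path (a namespace
    package without __init__.py also reports None as its origin).
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    for name in ("src", "wf", "project"):
        print(f"{name!r} resolves to: {where_is(name)}")
    print("sys.path entries searched, in order:")
    for entry in sys.path:
        print("  ", entry)
```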
s
No, you can specify a directory. I'm just wondering if the CI/CD is picking up the relative import. Can you try it locally and see if that works?
@Eduardo Apolinario (eapolinario), do you have any suggestions regarding this?
Also, I’d much prefer to not use relative import if it’s at all possible - I think it’s strongly preferred to have a parity between dev and production and the user shouldn’t have to adhere to a specific project dir structure to accommodate flyte deployment (or be coerced to use relative imports when locally, they can run the code without them).
s
@Samhita Alla Testing relative import locally throws this error:
attempted relative import with no known parent package
I’ll try to run it in the similar dir structure as the CI/CD
OK same error locally using `pyflyte package` and `pyflyte run` - `No module named 'src'`. Running from the parent of `/project` in the above dir structure.
So it doesn’t look like it’s a CI/CD setup problem
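For context on the `attempted relative import with no known parent package` error mentioned above: in plain Python it appears when a file containing a relative import is executed as a top-level script, and it goes away when the same file is run as a module inside its package. Below is a minimal sketch in a temporary directory. The layout mirrors the `wf` folder from this thread, while `build_package`, `run` and `demo` are made-up helper names.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_package(base: Path) -> None:
    """Create wf/{__init__.py, main.py, src/{__init__.py, helpers.py}}."""
    wf = base / "wf"
    (wf / "src").mkdir(parents=True)
    (wf / "__init__.py").write_text("")
    (wf / "src" / "__init__.py").write_text("")
    (wf / "src" / "helpers.py").write_text("def some_fn():\n    return 42\n")
    (wf / "main.py").write_text(
        "from .src.helpers import some_fn\nprint(some_fn())\n"
    )


def run(args, cwd):
    """Run the current interpreter with `args` in `cwd`, capturing output."""
    return subprocess.run(
        [sys.executable, *args], cwd=cwd, capture_output=True, text=True
    )


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        build_package(base)
        # Executed as a script: main.py has no parent package.
        as_script = run(["wf/main.py"], cwd=base)
        # Executed as a module: the parent package is `wf`.
        as_module = run(["-m", "wf.main"], cwd=base)
        return as_script, as_module


if __name__ == "__main__":
    script, module = demo()
    print("as script exit code:", script.returncode)
    print(script.stderr.strip().splitlines()[-1])
    print("as module stdout:", module.stdout.strip())
```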
s
Can we connect on a call?
Let me know your preferred time.
s
Hi @Samhita Alla - I can chat tomorrow anytime between 8-11am, 3-5pm, 6-8pm PT. But before we get on a call, I just wanted to recap the problem at hand. Due to the `attempted relative import with no known parent package` error, I wasn’t able to get relative imports working. For various reasons, I don’t think we can use relative imports for our needs. The only way I could get this to work was to do something like `from project.wf.src.helpers import some_fn` when I run my project in the parent dir of `project` (with no `--source` and `__init__.py` in `/project` and in `/wf`). This is an acceptable workaround for now, but I’d much rather be able to set the `/project/wf` dir as the project root for Python import purposes. So I’d love to get on a call if you can help me find a way to do this, but otherwise, I am not sure I should waste any more of your time. Let me know if you think it’s still worth it to get on a call. Either way, thank you so much for your help regarding this problem. It’s much appreciated!!
s
“The only way I could get this to work was to do something like `from project.wf.src.helpers import some_fn` when I run my project in the parent dir of `project` (with no --source and `__init__.py` in `/project` and in `/wf`).”
Good to know that this is working for you.
“This is an acceptable workaround for now, but I’d much rather be able to set `/project/wf` dir as project root for python import purposes.”
You can set `wf` as the project root if all your workflows and dependent tasks and libraries are present in the same folder. If not, this won't work. e.g.
main.py
from wf.src.helpers import sum
from flytekit import task


@task
def main_task(a: int, b: int) -> int:
    return sum(a, b)
wf.py
from flytekit import workflow
from wf.main import main_task


@workflow
def wf(a: int = 10, b: int = 9) -> int:
    return main_task(a=a, b=b)
I'm able to successfully package the code when I run the `pyflyte package` command in the `project` directory. Also, I agree that relative import isn't a feasible import technique. You should either import from the project root or do relative imports. Let me know if this still isn't clear to you and we can hop on a call!
s
I think I get it but I’m just curious - why is it not possible to set `/wf` as project root if there are subdirectories? This forces a user to either have a specific dir structure (i.e. `/project/wf`), which requires a prefix for all imports (i.e. `from project.wf.main…`), or, if you make `/wf` the project root, to not have any subdirectories, which prevents the user from having any modularized code. Either way, flyte loses parity with regular python code.
s
There definitely can exist subdirectories in `wf`.
s
Oh I misunderstood you then - “You can set `wf` as the project root if all your workflows and dependent tasks and libraries are present in the same folder. If not, this won’t work.”
s
You can have any number of folders within `wf` and that should work!
s
You mean all the project files have to be in `wf` - but I tried this and it doesn’t work
s
Oh it should!
s
If you put `/project/wf` as the `--source` so it can be the project root, then what would the `--pkgs` value be?
/project
  /wf
    /src
      __init__.py
      helpers.py
    __init__.py
    main.py
    wf.py
I tried `--pkgs wf` as well as `--pkgs wf, main, src` and neither works
s
You'll need to set the project root as `project`, not `project/wf`, 'cause you need to package the whole `wf` directory
s
In that case, `from main …` doesn’t work
It has to be `from wf.main ...`
s
Yeah!
s
So either way, all imports require prefixes
No way to do `from main …` like you would in a local project
s
That's because when you're packaging your code from within a directory, the paths have to be relative to that directory, or relative to all other modules. That's how pyflyte traverses the directory.
Hey @Yee / @Eduardo Apolinario (eapolinario) / @Kevin Su, is there a way we can enable this? Given this project structure
/project
  /wf
    /src
      __init__.py
      helpers.py
    __init__.py
    main.py
    wf.py
can we package code at the `wf` level by not importing code from other modules relative to `wf`? For example, if we want to import code in `wf.py` from `main.py`, we need to include `from wf.main ...` but not `from main ...` 'cause the latter results in a no module error.
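The trade-off discussed above can be reproduced in plain Python, without pyflyte: which import spelling works depends on which directory is the import root. The sketch below builds the `project/wf` layout in a temporary directory and tries both spellings against both roots in a fresh interpreter. `build_tree`, `can_import` and `demo` are made-up helper names, and setting `PYTHONPATH` is only a stand-in for whatever root pyflyte or the container ends up using.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

HELPERS = "def some_fn():\n    return 1\n"


def build_tree(base: Path, main_import: str) -> Path:
    """Create project/wf/{__init__.py, main.py, src/...}; return the project dir."""
    wf = base / "project" / "wf"
    (wf / "src").mkdir(parents=True)
    (wf / "__init__.py").write_text("")
    (wf / "src" / "__init__.py").write_text("")
    (wf / "src" / "helpers.py").write_text(HELPERS)
    (wf / "main.py").write_text(main_import + "\n")
    return base / "project"


def can_import(module: str, root: Path) -> bool:
    """Import `module` in a fresh interpreter with `root` set as PYTHONPATH."""
    env = dict(os.environ, PYTHONPATH=str(root))
    with tempfile.TemporaryDirectory() as neutral_cwd:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            cwd=neutral_cwd, env=env, capture_output=True, text=True,
        )
    return result.returncode == 0


def demo() -> dict:
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        # main.py uses the bare spelling preferred for local development.
        project = build_tree(Path(tmp), "from src.helpers import some_fn")
        results["bare, root=project/wf"] = can_import("main", project / "wf")
        results["bare, root=project"] = can_import("wf.main", project)
    with tempfile.TemporaryDirectory() as tmp:
        # main.py uses the prefixed spelling suggested in this thread.
        project = build_tree(Path(tmp), "from wf.src.helpers import some_fn")
        results["prefixed, root=project"] = can_import("wf.main", project)
    return results


if __name__ == "__main__":
    for case, ok in demo().items():
        print(f"{case}: {'imports' if ok else 'fails'}")
```

The bare spelling only resolves when `wf` itself is the import root, which matches the behavior described in this thread once packaging happens one level up.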