Yee
03/29/2022, 1:17 AM
:~/dev/my_repo $ tree .
.
├── Dockerfile
└── src
    ├── __init__.py
    ├── __pycache__
    └── parent
        ├── __init__.py
        ├── __pycache__
        └── child
            ├── __init__.py
            ├── __pycache__
            └── hello_world.py
The __module__ attribute is determined by the path the module is imported from, which depends on where the Python process is started:
:~/dev/my_repo $ ipython
In [1]: from src.parent.child.hello_world import my_wf
In [2]: my_wf.__module__
Out[2]: 'src.parent.child.hello_world'
:~/dev/my_repo $ cd src/parent
:~/dev/my_repo/src/parent $ ipython
In [1]: from child.hello_world import my_wf
In [2]: my_wf.__module__
Out[2]: 'child.hello_world'
Note that the __module__ in the second example is shortened. This module name is used as the name of the task, and to load the task at run-time in the container (so your image has to be built so that the same module path is importable there).
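To make the run-time side concrete, here is a rough sketch of what loading by module name could look like. It assumes the loader does something equivalent to importlib.import_module followed by getattr; load_entity is a hypothetical helper for illustration, not flytekit's actual API.

```python
import importlib


def load_entity(module_name: str, attr_name: str):
    """Hypothetical helper: import a module by dotted name and fetch one attribute."""
    # Raises ModuleNotFoundError if module_name is not importable from the
    # current sys.path, e.g. a shortened name like 'child.hello_world' when the
    # process was started from the repo root.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)
```

With the tree above, load_entity("src.parent.child.hello_world", "my_wf") would only succeed when the directory containing src is on sys.path, which is why the registered name and the image layout have to agree.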
We are thinking of changing the behaviour so that, given a task (or launch plan or workflow), we walk up the file system from its source file until we find a folder that doesn't have an __init__.py file, and assume that the Python process was started from there. In the above example, the name in the second case would then be the same as in the first: src.parent.child.hello_world. Obviously this will not work for namespace packages, since they have no __init__.py, but we doubt anyone uses those currently for Flyte workflows. Does this sound unreasonable to anyone?
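For anyone who wants to try the idea locally, here is a minimal sketch of the walk-up described above. It is only an illustration under stated assumptions, not the eventual implementation: infer_module_name is a hypothetical name, and it takes the path to the source file that defines the task.

```python
from pathlib import Path


def infer_module_name(source_file: str) -> str:
    """Hypothetical sketch: build a dotted module name by walking up the
    directory tree for as long as each directory contains an __init__.py."""
    path = Path(source_file).resolve()
    # A package's own __init__.py maps to the package name, not 'pkg.__init__'.
    parts = [] if path.stem == "__init__" else [path.stem]
    directory = path.parent
    while (directory / "__init__.py").is_file():
        parts.append(directory.name)
        if directory.parent == directory:  # reached the filesystem root
            break
        directory = directory.parent
    # The first directory without __init__.py is treated as the start directory.
    return ".".join(reversed(parts))
```

For the tree above, this gives src.parent.child.hello_world for hello_world.py regardless of the current working directory, because the walk stops at my_repo, the first folder without an __init__.py. As noted, a namespace package would cut the walk short at the first folder missing that file.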