03/29/2022, 1:17 AM
General question about Python function task (and workflow) naming and loading conventions: we’re hoping to clean up how this works a bit. Today, names are always determined by the path from where the Python process was started to the location of the .py file. For example, if you have
:~/dev/my_repo $ tree .
├── Dockerfile
└── src
    ├── __pycache__
    └── parent
        ├── __pycache__
        └── child
            ├── __pycache__
then the __module__ attribute is determined by the path:
:~/dev/my_repo $ ipython
In [1]: from src.parent.child.hello_world import my_wf

In [2]: my_wf.__module__
Out[2]: 'src.parent.child.hello_world'

:~/dev/my_repo $ cd src/parent
:~/dev/my_repo/src/parent $ ipython
In [1]: from child.hello_world import my_wf

In [2]: my_wf.__module__
Out[2]: 'child.hello_world'
Note that the __module__ in the second example is shortened. This module name is used as the name of the task, and to load the task at run-time in the container (so your image has to be built correctly, of course). We are thinking of changing the behaviour so that, given a task (or launch plan or workflow), we walk up the file system until we find a folder that doesn’t have an __init__.py file, and basically assume that the Python process was started from there. In the above example, the name in the second case would then be the same as in the first (src.parent.child.hello_world). Obviously this will not work for namespace packages, but we doubt anyone uses those currently for Flyte workflows. Does this sound unreasonable to anyone?
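To make the proposed lookup concrete, here is a rough sketch of the "walk up until there is no __init__.py" idea. It is only an illustration under assumptions: the helper names find_package_root and module_name_from_root are made up for this example and are not flytekit functions, and it assumes regular (non-namespace) packages, as noted above.

```python
from pathlib import Path


def find_package_root(py_file: Path) -> Path:
    """Return the closest ancestor directory of py_file that has no __init__.py.

    Hypothetical helper: this directory is treated as if the Python process
    had been started from it.
    """
    current = py_file.resolve().parent
    while (current / "__init__.py").is_file():
        parent = current.parent
        if parent == current:  # reached the filesystem root, stop walking
            break
        current = parent
    return current


def module_name_from_root(py_file: Path) -> str:
    """Build a dotted module name relative to the detected package root."""
    root = find_package_root(py_file)
    relative = py_file.resolve().relative_to(root).with_suffix("")
    return ".".join(relative.parts)


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py path/to/some_module.py
    for arg in sys.argv[1:]:
        print(module_name_from_root(Path(arg)))
```

With this approach the result depends only on where the __init__.py files are, not on the current working directory, so both ipython sessions above would resolve to the same name, provided src, parent and child each contain an __init__.py and my_repo does not.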
@Miggy and @Ketan (kumare3)
@jeev too maybe when you get a chance
will be settable via switch, and will still default to the current dir in the upcoming release (we’ll default to the go-up-til-no-init-file behaviour at 1.0)
also @Eduardo Apolinario (eapolinario) this is what ketan’s pr is for