Hi, I am starting to do some PoC of Flyte, and get...
# ask-the-community
y
Hi, I am starting a PoC of Flyte and have a basic question. What is the normal way to organize code in Flyte? Is there any documentation on this?
• I followed the documentation on creating a Flyte project to create a project on my laptop.
• I tried to put some code in another Python file and would like to import its modules from `example.py`. If I run `python example.py`, it works correctly. If I run `pyflyte run (--remote) example.py wf`, I get this error message: `Failed with Unknown Exception <class 'ModuleNotFoundError'> Reason: No module named 'mike'`.
• I also tried moving these files to the `my_project` directory and using `setup.py` to build and install the package in the Dockerfile, and I got the same error.
• Plus, I did not see `docker_build.sh` in the directory after executing `pyflyte init my_project`.
• Please check more details in the thread. Thanks for your guidance.
(my_project) ➜  my_project git:(main) ✗ tree
.
├── Dockerfile
├── LICENSE
├── README.md
├── requirements.txt
└── workflows
    ├── __init__.py
    ├── __pycache__
    │   ├── __init__.cpython-311.pyc
    │   └── example.cpython-311.pyc
    ├── example.py
    └── mike
        ├── __pycache__
        │   └── demo.cpython-311.pyc
        └── demo.py
(my_project) ➜  my_project git:(main) ✗ cat workflows/mike/demo.py
def add(a, b):
    return a + b

def minus(a, b):
    return a - b
"""A simple Flyte example."""

import typing
from flytekit import task, workflow

from mike.demo import add, minus

@task
def say_hello(name: str) -> str:
    """A simple Flyte task to say "hello".

    The @task decorator allows Flyte to use this function as a Flyte task, which
    is executed as an isolated, containerized unit of compute.
    """
    print(f"1 + 2 = {add(1, 2)}")
    return f"hello {name}!"


@task
def greeting_length(greeting: str) -> int:
    """A task the counts the length of a greeting."""
    print(f"3 - 1 = {minus(3, 1)}")
    return len(greeting)

@workflow
def wf(name: str = "union") -> typing.Tuple[str, int]:
    """Declare workflow called `wf`.

    The @workflow decorator defines an execution graph that is composed of tasks
    and potentially sub-workflows. In this simple example, the workflow is
    composed of just one task.

    There are a few important things to note about workflows:
    - Workflows are a domain-specific language (DSL) for creating execution
      graphs and therefore only support a subset of Python's behavior.
    - Tasks must be invoked with keyword arguments
    - The output variables of tasks are Promises, which are placeholders for
      values that are yet to be materialized, not the actual values.
    """
    greeting = say_hello(name=name)
    greeting_len = greeting_length(greeting=greeting)
    return greeting, greeting_len


if __name__ == "__main__":
    # Execute the workflow, simply by invoking it like a function and passing in
    # the necessary parameters
    print(f"Running wf() { wf(name='passengers') }")
And I did not change anything in the `Dockerfile`.
(my_project) ➜  my_project git:(main) ✗ python workflows/example.py
1 + 2 = 3
3 - 1 = 2
Running wf() DefaultNamedTupleOutput(o0='hello passengers!', o1=17)
(my_project) ➜  my_project git:(main) ✗ pyflyte run workflows/example.py wf
Failed with Unknown Exception <class 'ModuleNotFoundError'> Reason: No module named 'mike'
No module named 'mike'
v
Hello, I’m also new to Flyte and had similar issues. Does this behavior change depending on which folder is your current directory? Does it still happen when you run from the `workflows` folder?
I also set up a Flyte PoC last month, so I’ll share an alternative way to handle these issues that worked for me. I don’t work with the pyflyte CLI; instead I do everything from Python. There’s a FlyteRemote class that lets you interact with the Flyte remote from Python. I also had some import issues while setting up my PoC, and in the end I settled on using `FlyteRemote.register_script` in my class that wraps around FlyteRemote:
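import os

import __main__
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote, FlyteWorkflow


class FlyteClient:
    def __init__(self, endpoint: str, project: str, **kwargs):
        # Elided in the original message; presumably sets self.endpoint,
        # self.project, self.domain and self._uid, which are used below.
        ...

        self.remote = FlyteRemote(
            config=Config.for_endpoint(endpoint=self.endpoint),
            default_project=self.project,
            default_domain=self.domain,
        )
        ...

    def register_script(self, workflow: FlyteWorkflow) -> FlyteWorkflow:
        # Resolve paths from the script being run, not the current directory.
        main_script_path = os.path.abspath(__main__.__file__)
        source_dir = os.path.dirname(main_script_path)

        # some custom use-case-specific logic
        if os.path.basename(source_dir) == "scripts":
            source_path = os.path.dirname(source_dir)
        else:
            source_path = source_dir

        # Registers the workflow and its tasks, copying source_path to the remote.
        return self.remote.register_script(
            entity=workflow,
            version=self._uid,
            # module_name=workflow.name,
            source_path=source_path,
            copy_all=True,
        )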
In this example, source_path lets you decide which folder of Python code files will be copied into the remote container that executes the task. The files will be available for importing and running during the execution. This specific example lets you run python example.py from anywhere: it relies on the file’s path instead of the directory you are currently cd’ed into in your command-line session, so you get more consistent behavior. This example doesn’t take into account the case when you are running directly from the Python console (where there is no `__main__.__file__`), but it can be adjusted to handle that case if needed.
register_script is cool because if you pass a FlyteWorkflow to it, it will register both the workflow and all its child tasks with the Flyte remote in one operation, instead of doing register_workflow and then register_task multiple times. After the workflow is registered, register_script returns it, and it can then be used with FlyteRemote.execute() to start an execution.
Running CLI commands seems to be the preferred way to interact with the Flyte remote from what I’ve seen; based on the discussions in this channel so far, fewer users use FlyteRemote than pyflyte. I hope someone who is experienced with the pyflyte CLI gives you some guidance on that, but I wanted to let you know that this can also be done from Python, which may be preferable for some use cases. In my specific case, for example, it is much more suitable than CLI commands. So if you prefer working with the Flyte remote programmatically, that’s an option, and I can help you get started with it because I’ve already set it up for myself. Whether you use CLI commands or the FlyteRemote class is up to your preferences and specific use case.
Read more: https://docs.flyte.org/projects/flytekit/en/latest/design/control_plane.html
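One possible way to handle the Python console case mentioned above is sketched here. The helper name `resolve_source_path` and the fallback to the current working directory are assumptions for illustration, not part of the original wrapper or of flytekit; the "scripts" rule is copied from the example.

```python
import os
import sys


def resolve_source_path(main_module=None) -> str:
    """Pick the folder to upload (hypothetical helper, not a flytekit API)."""
    # Default to the real __main__ module; a stand-in can be passed for testing.
    if main_module is None:
        main_module = sys.modules["__main__"]

    main_file = getattr(main_module, "__file__", None)
    if main_file is None:
        # Interactive console: there is no script path, so fall back to the
        # current working directory (an assumption, adjust to your use case).
        return os.getcwd()

    source_dir = os.path.dirname(os.path.abspath(main_file))
    # Same use-case-specific rule as in the wrapper above: scripts live in a
    # "scripts" subfolder, and the code to upload is one level up.
    if os.path.basename(source_dir) == "scripts":
        return os.path.dirname(source_dir)
    return source_dir
```

The returned path could then be passed as source_path to register_script in place of the inline logic.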
y
@Victor Churikov Thanks a lot for the detailed explanation and for offering to help set up the environment the Python way. I would also appreciate some guidance from the channel on the pyflyte CLI side, and I may direct message you if I need to do it the Python way. BTW, do you have any idea how good Flyte is if we run it locally? I mean, can it still orchestrate a fairly complex workflow without using K8s and Docker containers?
v
I haven’t tried Flyte locally yet, but that’s something I plan to do next month, so I will let you know how it goes if it’s still relevant by then. Feel free to message me if you need help with FlyteRemote, but if possible we should keep discussions in this public channel so others can benefit from them later. Meanwhile, while you’re waiting for help with pyflyte, I recommend searching through https://discuss.flyte.org/, which is an archive of this Slack workspace. Slack messages expire after a while, so the archive has more information, and there may be something relevant to your specific issue.
n
hi @Yuan Wang (Mike) re: the error:
No module named 'mike'
You’ll need to provide the fully qualified module name in your import:
from workflows.mike.demo import ...
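A minimal sketch of why the short import fails, assuming the explanation is that `python workflows/example.py` puts the `workflows` folder itself on `sys.path`, while `pyflyte run` imports relative to the project root. It recreates the layout from the thread in a temporary directory and makes only the project root importable:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the project layout from the thread in a throwaway directory.
root = Path(tempfile.mkdtemp())
(root / "workflows" / "mike").mkdir(parents=True)
(root / "workflows" / "__init__.py").write_text("")
(root / "workflows" / "mike" / "demo.py").write_text(
    "def add(a, b):\n    return a + b\n"
)

# Only the project root is on sys.path, not the workflows/ folder.
sys.path.insert(0, str(root))
importlib.invalidate_caches()


def can_import(name: str) -> bool:
    """Return True if the module can be imported, False if it is not found."""
    try:
        importlib.import_module(name)
        return True
    except ModuleNotFoundError:
        return False


short_ok = can_import("mike.demo")           # the import used in example.py
full_ok = can_import("workflows.mike.demo")  # the fully qualified import
print(f"mike.demo importable: {short_ok}")
print(f"workflows.mike.demo importable: {full_ok}")
```

With the project root as the only import base, just the fully qualified name resolves, which matches the fix suggested above.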
About the recommended way of organizing projects, you can see example projects in the flytekit-python-template repo. The `simple-example` directory contains the cookiecutter Flyte project that you get when you do `pyflyte init …`. I just added the `docker_build.sh` script back in there.
We’re currently transitioning all of our examples to use ImageSpec, which is why you didn’t see it before, but I just added it back for now.
Re: "BTW, do you have any idea how good Flyte is if we run it locally? I mean, can it still orchestrate a fairly complex workflow without using K8s and Docker containers?"
You can use the local sandbox for initial testing and debugging, and you can certainly do things like schedule workflows on your local computer… but you won’t get a lot of the benefits of scale and reproducibility doing that.
y
Thanks a lot, Niels. Is there any document describing the architecture or functionalities of the local mode? We are thinking about creating a python library and building a SaaS solution on top of Flyte. That's why both modes are interesting for us.