Hi, - I am testing `Remote Access` and get some b...
# ask-the-community
y
Hi,
• I am testing `Remote Access` and have some basic questions.
• What is the right way to specify the `entity` parameter of `register_workflow` or `register_script`? I passed the workflow function directly and got a `NoneType` error. Do I need to take additional steps to convert a normal workflow to the `WorkflowBase` class?
• In addition, what is the difference between `register_script` and `register_workflow`? I checked the definitions of these two functions. It seems that the main difference is that `register_script` provides an option to copy all files under the source path and creates some `serialization_settings` automatically.
Thanks
(flyte) ➜  workflows git:(main) ✗ cat simple.py
import typing
from flytekit import task, workflow


@task
def say_hello(name: str) -> str:
    return f"hello {name}!"


@task
def greeting_length(greeting: str) -> int:
    return len(greeting)


@workflow
def wf(name: str = "union") -> typing.Tuple[str, int]:
    greeting = say_hello(name=name)
    greeting_len = greeting_length(greeting=greeting)
    return greeting, greeting_len


if __name__ == "__main__":
    # print(f"Running wf() { wf(name='passengers') }")

    from flytekit.remote import FlyteRemote
    from flytekit.configuration import Config

    remote = FlyteRemote(
        config=Config.for_sandbox(),
        default_project="flytesnacks",
        default_domain="development"
    )

    remote_wf = remote.register_workflow(
        entity=wf
    )

    remote.execute(entity=remote_wf)

(flyte) ➜  workflows git:(main) ✗ python simple.py
╭───────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────╮
│ /Users/yuanwang/projects/flyte/test_case/workflows/simple.py:34 in <module>                                                 │
│                                                                                                                             │
│ ❱ 34 │   remote_wf = remote.register_workflow(                                                                              │
│                                                                                                                             │
│ /Users/yuanwang/venvs/flyte/lib/python3.11/site-packages/flytekit/remote/remote.py:743 in register_workflow                 │
│                                                                                                                             │
│ ❱  743 │   │   ident = self._resolve_identifier(ResourceType.WORKFLOW, entity.name, version, se                             │
│                                                                                                                             │
│ /Users/yuanwang/venvs/flyte/lib/python3.11/site-packages/flytekit/remote/remote.py:543 in _resolve_identifier               │
│                                                                                                                             │
│ ❱  543 │   │   │   version=version or ss.version,                                                                           │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'NoneType' object has no attribute 'version'
s
Can you specify the `version` in `register_workflow`? Something like `version="v1"`?
"what is the difference between `register_script` and `register_workflow`?"
`register_script` is more like registering a task/workflow via script mode, meaning you need to specify the source path, module name, and so on. And yes, the copy-all option is useful when your workflow depends on other modules. @Eduardo Apolinario (eapolinario), in what scenarios is `register_script` useful?
y
"Can you specify the version in register_workflow? Something like version=v1?"
• I added `version="v1"`, and it seems to pass the version check. However, I got another error complaining about serialization settings. The program is the same; the only difference is that `version="v1"` is added to the call to `register_workflow`.
• BTW, if `version` is an optional parameter, why is it necessary to add it?
(flyte) ➜  workflows git:(main) ✗ python simple.py
╭───────────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────────╮
│ /Users/yuanwang/projects/flyte/test_case/workflows/simple.py:34 in <module>                                                 │
│                                                                                                                             │
│ ❱ 34 │   remote_wf = remote.register_workflow(                                                                              │
│                                                                                                                             │
│ /Users/yuanwang/venvs/flyte/lib/python3.11/site-packages/flytekit/remote/remote.py:750 in register_workflow                 │
│                                                                                                                             │
│ ❱  750 │   │   ident = self._serialize_and_register(entity, serialization_settings, version, op                             │
│                                                                                                                             │
│ /Users/yuanwang/venvs/flyte/lib/python3.11/site-packages/flytekit/remote/remote.py:687 in _serialize_and_register           │
│                                                                                                                             │
│ ❱  687 │   │   │   │   │   f"No serialization settings set, but workflow contains entities that                             │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'TaskSpec' object has no attribute 'id'
s
`version` needs to be specified as a parameter or in the serialization settings. We need to improve our flyte remote docs and docstrings. Contributions welcome. 🙂 I believe `serialization_settings` is a mandatory argument.
@Kevin Su, would you mind helping Yuan if you have the time? I'll try to come up with an example tomorrow to share with Yuan.
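For readers wondering why an optional `version` still triggered the first error: the traceback shows the expression `version or ss.version`, which only touches `ss` when `version` is falsy. The following is a minimal sketch of that pattern, not flytekit code; `Settings` and `resolve_version` are hypothetical stand-ins introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    """Hypothetical stand-in for serialization settings; it only carries a version."""
    version: Optional[str] = None


def resolve_version(version: Optional[str], ss: Optional[Settings]) -> Optional[str]:
    # Same shape as the expression in the traceback: `version or ss.version`.
    # `ss.version` is evaluated only when `version` is falsy.
    return version or ss.version


# An explicit version short-circuits, so a missing settings object is never touched.
explicit = resolve_version("v1", None)

# Without an explicit version, the settings object supplies it.
from_settings = resolve_version(None, Settings(version="v1"))

# With neither, attribute access on None fails, as in the first traceback.
try:
    resolve_version(None, None)
    error = None
except AttributeError as exc:
    error = str(exc)

print(explicit, from_settings, error)
```

Under this reading, `version` is optional only in the sense that it can come from either place; at least one of the two has to provide it.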
k
Yes, you need to add `SerializationSettings`:
ss = SerializationSettings(image_config, project="flytesnacks", domain="development", version=version)
remote.register_workflow(hello_wf, serialization_settings=ss)
s
Hi @Yuan Wang (Mike), have you got it to work?
y
(flyte) ➜  workflows git:(main) ✗ cat simple.py
import typing
from flytekit import task, workflow


@task
def say_hello(name: str) -> str:
    return f"hello {name}!"


@task
def greeting_length(greeting: str) -> int:
    return len(greeting)


@workflow
def wf(name: str = "union") -> typing.Tuple[str, int]:
    greeting = say_hello(name=name)
    greeting_len = greeting_length(greeting=greeting)
    return greeting, greeting_len


if __name__ == "__main__":
    from flytekit.remote import FlyteRemote
    from flytekit.configuration import Config, SerializationSettings, ImageConfig

    remote = FlyteRemote(
        config=Config.for_sandbox(),
        default_project="flytesnacks",
        default_domain="development"
    )

    ss = SerializationSettings(image_config=ImageConfig.auto_default_image(),
                               project="flytesnacks",
                               domain="development",
                               version="v1")

    remote_wf = remote.register_workflow(
        entity=wf,
        serialization_settings=ss
    )

    remote.execute(entity=remote_wf, inputs={"name": "mike"})
@Samhita Alla
• Thanks for your reply and for coordinating with Kevin to help me out. Really appreciated.
• After adding the `SerializationSettings` object, I can register and execute the workflow.
• However, I got an error when running the first task of the workflow: `ValueError: Empty module name`
• Is it correct to use `ImageConfig.auto_default_image()` as the `image_config`?
• If I want to specify a custom image, how should I do it with `ImageConfig`? It would be good if some examples could be created for the ImageConfig API.
• If I want to use `ImageSpec` to specify the image for task execution, do I still need to pass an `image_config` here?
• Generally speaking, is it easier to use `register_script` than `register_workflow`? From the function definition, it seems that `register_script` takes care of initializing `SerializationSettings` by default.
• I would like to contribute to the documentation. How should I proceed with that?
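As background on the `ValueError: Empty module name` message: Python's own `importlib` raises exactly this text when asked to import an empty string. The thread does not show how the empty name arose inside flytekit, so the link to defining `wf` in the same file that is run with `python simple.py` (where the module is named `__main__`) is an assumption. A small stdlib-only sketch, using a throwaway `simple.py` written to a temporary directory:

```python
import importlib
import runpy
import tempfile
from pathlib import Path

# Throwaway script standing in for simple.py (no flytekit involved).
script = Path(tempfile.mkdtemp()) / "simple.py"
script.write_text("def wf():\n    return 'hello'\n")

# Executing the file the way `python simple.py` does names its module "__main__",
# so functions defined in it do not carry an importable module name.
as_script = runpy.run_path(str(script), run_name="__main__")["wf"].__module__

# Python itself produces this message when given an empty module name.
try:
    importlib.import_module("")
    empty_name_error = None
except ValueError as exc:
    empty_name_error = str(exc)

print(as_script)
print(empty_name_error)
```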
s
1. `ImageConfig.auto_default_image()` looks correct to me.
2. You can specify a custom image by passing `ImageConfig.auto(img_name="...")`.
3. @Kevin Su, does image spec work with flyte remote?
4. You can use either `register_script` or `register_workflow`. IMO, `register_workflow` is much easier to get started with.
5. You can contribute to the https://docs.flyte.org/projects/flytekit/en/latest/design/control_plane.html document, Yuan. You should just be able to fork the repo, modify or add the relevant content, and create a PR. You can tag me by leaving a comment in the PR. 🙂
Can you try importing the workflow in a different module and writing your flyte remote code over there? Something like:
from workflows.example import wf

# flyte remote code
...
y
(flyte) ➜  workflows git:(main) ✗ cat simple.py
import typing
from flytekit import task, workflow


@task
def say_hello(name: str) -> str:
    return f"hello {name}!"


@task
def greeting_length(greeting: str) -> int:
    return len(greeting)


@workflow
def wf(name: str = "union") -> typing.Tuple[str, int]:
    greeting = say_hello(name=name)
    greeting_len = greeting_length(greeting=greeting)
    return greeting, greeting_len
(flyte) ➜  workflows git:(main) ✗ cat flyte_remote.py
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config, SerializationSettings, ImageConfig

from simple import wf


remote = FlyteRemote(
    config=Config.for_sandbox(),
    default_project="flytesnacks",
    default_domain="development"
)

ss = SerializationSettings(
    image_config=ImageConfig.auto_default_image(),
    project="flytesnacks",
    domain="development",
    version="v1")

remote_wf = remote.register_workflow(
    entity=wf,
    serialization_settings=ss
)

remote.execute(
    entity=remote_wf,
    inputs={"name": "mike"}
)
(flyte) ➜  workflows git:(main) ✗ python flyte_remote.py
@Samhita Alla • Got another kind of error
s
Sorry. Can you share with me the directory structure?
y
Yes
(flyte) ➜  workflows git:(main) ✗ tree
.
├── __init__.py
├── flyte_remote.py
├── simple.py
s
Can you move workflows to a different folder?
parent/
   workflows/
       __init__.py
       flyte_remote.py
       simple.py
y
And what should be the next step? Still run `python flyte_remote.py` within the `workflows` folder?
s
Yeah, can you give that a try?
y
I moved it to one level up, and I got the same error
s
Okay. Can you try running the script in the parent directory or one level above that?
y
I got the same error when I ran `python workflows/flyte_remote.py` in the parent directory. Can you reproduce the issue on your laptop?
s
I'm not sure how to resolve this issue. @Kevin Su, can you help please?
The error occurs because we are running the Flyte remote script with Python (not pyflyte), which imports and triggers the workflow. As a result, Flyte is unable to find the parent directory of that workflow. `register_workflow` has to work, but we may need to do something to get it to work. I'll try talking to the team to get it fixed.
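To illustrate how the directory a script is started from changes the module name Python assigns to the workflow, here is a minimal stdlib-only sketch. It builds a throwaway `workflows/simple.py` in a temporary directory; the helper `module_name_seen_from` is hypothetical, and whether flytekit records exactly this name at registration time is an assumption, not something the thread confirms.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def module_name_seen_from(sys_path_entry: Path, import_name: str) -> str:
    """Import `import_name` with `sys_path_entry` first on sys.path; return wf.__module__."""
    sys.path.insert(0, str(sys_path_entry))
    try:
        # Forget earlier imports so each call resolves from scratch.
        for cached in ("simple", "workflows", "workflows.simple"):
            sys.modules.pop(cached, None)
        importlib.invalidate_caches()
        module = importlib.import_module(import_name)
        return module.wf.__module__
    finally:
        sys.path.remove(str(sys_path_entry))


# Throwaway package mirroring the layout discussed in the thread.
root = Path(tempfile.mkdtemp())
pkg = root / "workflows"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "simple.py").write_text("def wf():\n    return 'hello'\n")

# Running a script inside workflows/ puts that folder on sys.path: `from simple import wf`.
from_inside = module_name_seen_from(pkg, "simple")

# Running from the parent folder resolves the package: `from workflows.simple import wf`.
from_parent = module_name_seen_from(root, "workflows.simple")

print(from_inside)
print(from_parent)
```

The same function ends up with two different module names depending on the starting directory, which is one reason the location of the registration script matters.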
y
➜  workflows git:(main) ✗ cat flyte_remote_register_script.py
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config, SerializationSettings, ImageConfig

from simple import wf


remote = FlyteRemote(
    config=Config.for_sandbox(),
    default_project="flytesnacks",
    default_domain="development"
)

# ss = SerializationSettings(
    # image_config=ImageConfig.auto_default_image(),
    # project="flytesnacks",
    # domain="development",
    # version="v1")

remote_wf = remote.register_script(
    entity=wf,
    image_config=ImageConfig.auto_default_image(),
    version="v1",
    source_path="../",
    # module_name="remote_workflow"
)

remote.execute(
    entity=remote_wf,
    inputs={"name": "mike"}
)

➜  workflows git:(main) ✗ python flyte_remote_register_script.py
╭────────────────── Traceback (most recent call last) ───────────────────╮
│ /Users/yuanwang/projects/flyte/test_case/workflows/flyte_remote_regist │
│ er_script.py:19 in <module>                                            │
│                                                                        │
│ ❱ 19 remote_wf = remote.register_script(                               │
│                                                                        │
│ /opt/homebrew/lib/python3.11/site-packages/flytekit/remote/remote.py:8 │
│ 85 in register_script                                                  │
│                                                                        │
│ ❱  885 │   │   │   │   compress_scripts(source_path, str(archive_fname │
│                                                                        │
│ /opt/homebrew/lib/python3.11/site-packages/flytekit/tools/script_mode. │
│ py:48 in compress_scripts                                              │
│                                                                        │
│ ❱  48 │   │   copy_module_to_destination(source_path, destination_path │
│                                                                        │
│ /opt/homebrew/lib/python3.11/site-packages/flytekit/tools/script_mode. │
│ py:64 in copy_module_to_destination                                    │
│                                                                        │
│ ❱  64 │   mod = importlib.import_module(module_name)                   │
│                                                                        │
│ /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/ │
│ Versions/3.11/lib/python3.11/importlib/__init__.py:117 in              │
│ import_module                                                          │
│                                                                        │
│ ❱ 117 │   if name.startswith('.'):                                     │
╰────────────────────────────────────────────────────────────────────────╯
AttributeError: 'NoneType' object has no attribute 'startswith'
If I remove the comment symbol `#` in front of the `module_name` parameter of `register_script`, I get another error.
➜  workflows git:(main) ✗ python flyte_remote_register_script.py
╭────────────────── Traceback (most recent call last) ───────────────────╮
│ /Users/yuanwang/projects/flyte/test_case/workflows/flyte_remote_regist │
│ er_script.py:19 in <module>                                            │
│                                                                        │
│ ❱ 19 remote_wf = remote.register_script(                               │
│                                                                        │
│ /opt/homebrew/lib/python3.11/site-packages/flytekit/remote/remote.py:8 │
│ 85 in register_script                                                  │
│                                                                        │
│ ❱  885 │   │   │   │   compress_scripts(source_path, str(archive_fname │
│                                                                        │
│ /opt/homebrew/lib/python3.11/site-packages/flytekit/tools/script_mode. │
│ py:48 in compress_scripts                                              │
│                                                                        │
│ ❱  48 │   │   copy_module_to_destination(source_path, destination_path │
│                                                                        │
│ /opt/homebrew/lib/python3.11/site-packages/flytekit/tools/script_mode. │
│ py:64 in copy_module_to_destination                                    │
│                                                                        │
│ ❱  64 │   mod = importlib.import_module(module_name)                   │
│                                                                        │
│ /opt/homebrew/Cellar/python@3.11/3.11.4_1/Frameworks/Python.framework/ │
│ Versions/3.11/lib/python3.11/importlib/__init__.py:126 in              │
│ import_module                                                          │
│                                                                        │
│ ❱ 126 │   return _bootstrap._gcd_import(name[level:], package, level)  │
│ in _gcd_import:1204                                                    │
│ in _find_and_load:1176                                                 │
│ in _find_and_load_unlocked:1140                                        │
╰────────────────────────────────────────────────────────────────────────╯
ModuleNotFoundError: No module named 'remote_workflow'
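Both tracebacks end inside `importlib.import_module`, and the two failures can be reproduced with the standard library alone. The sketch below assumes (the thread does not confirm it) that the first call received `None` as the module name and the second received a name that is not importable from the current `sys.path`; `try_import` is a hypothetical helper.

```python
import importlib


def try_import(module_name):
    """Return the exception type name raised by importing `module_name`, or None on success."""
    try:
        importlib.import_module(module_name)
        return None
    except Exception as exc:
        return type(exc).__name__


# None has no .startswith, matching the first traceback.
none_result = try_import(None)

# A name that cannot be found on sys.path, matching the second traceback.
missing_result = try_import("remote_workflow_that_does_not_exist")

# A module that is importable succeeds.
ok_result = try_import("json")

print(none_result, missing_result, ok_result)
```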
k
Sorry, I missed the message. I’ll take a look tomorrow morning
s
@Yuan Wang (Mike), I think the module needs to be `simple` because you have your `wf` in `simple.py`, right?
Also, here's the directory structure of the example I sent, just in case you still get a module not found error: https://github.com/flyteorg/flytesnacks/tree/master/examples/feast_integration/feast_integration.
g
Here’s my code, it’s working fine
root/
   flyte_remote.py
   workflows/
       __init__.py
       simple.py
I added `copy_all=True` in `register_script`:
remote_wf = remote.register_script(entity=wf, copy_all=True, version="...", image_config=ImageConfig.auto_default_image(), source_path=".")
I executed the following command inside root, and it worked:
python flyte_remote.py
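A rough way to picture why this layout works: if everything under `source_path` is archived and unpacked where the task runs, then `workflows.simple` is importable only when the archive root is the parent of `workflows/`. The sketch below is a conceptual illustration with the standard library, not flytekit's implementation; the zip format and the temporary "remote" directory are assumptions made for the demo.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path

# Throwaway project mirroring the layout above (root/ contains workflows/).
root = Path(tempfile.mkdtemp())
(root / "workflows").mkdir()
(root / "workflows" / "__init__.py").write_text("")
(root / "workflows" / "simple.py").write_text("def wf():\n    return 'hello'\n")

# Archive everything under the source path, keeping paths relative to it.
archive = shutil.make_archive(
    str(Path(tempfile.mkdtemp()) / "source"), "zip", root_dir=root
)

# Unpack into a fresh directory standing in for the remote working directory.
remote_dir = Path(tempfile.mkdtemp())
shutil.unpack_archive(archive, remote_dir)

# With the unpacked root on sys.path, the package import resolves.
sys.path.insert(0, str(remote_dir))
try:
    for cached in ("workflows", "workflows.simple"):
        sys.modules.pop(cached, None)
    importlib.invalidate_caches()
    result = importlib.import_module("workflows.simple").wf()
finally:
    sys.path.remove(str(remote_dir))

print(result)
```

Had the archive been rooted inside `workflows/` instead, the unpacked files would contain `simple.py` at the top level and the `workflows.simple` import would not resolve.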
y
Thanks, @Gaurav Kumar. Your solution also works for me.
@Samhita Alla
• So it seems that we have to move the file that contains the `register_script()` call to the parent directory of the workflow files, and specify `copy_all=True` and `source_path="."`.
• I tried a similar approach with `register_workflow`. However, I got the error `ModuleNotFoundError: No module named 'workflows'` in the Flyte console. This seems to be related to `copy_all` and `source_path`.
• So the question seems to be mainly about the file structure. Another issue is which parameters of these registration functions are mandatory and which are optional. It would be great if you could provide a clear description of these two points. Thanks a lot.
➜  test_case git:(main) ✗ tree
.
├── Dockerfile
├── LICENSE
├── README.md
├── docker_build.sh
├── flyte_remote.py
├── flyte_remote_register_script.py
├── requirements.txt
└── workflows
    ├── __init__.py
    ├── simple.py
➜  test_case git:(main) ✗ cat flyte_remote.py
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config, SerializationSettings, ImageConfig

from workflows.simple import wf


remote = FlyteRemote(
    config=Config.for_sandbox(),
    default_project="flytesnacks",
    default_domain="development"
)

ss = SerializationSettings(
    image_config=ImageConfig.auto_default_image(),
    project="flytesnacks",
    domain="development",
    version="v3")

remote_wf = remote.register_workflow(
    entity=wf,
    serialization_settings=ss
)

remote.execute(
    entity=remote_wf,
    inputs={"name": "mike"}
)
➜  test_case git:(main) ✗ cat flyte_remote_register_script.py
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config, ImageConfig

from workflows.simple import wf


remote = FlyteRemote(
    config=Config.for_sandbox(),
    default_project="flytesnacks",
    default_domain="development"
)

remote_wf = remote.register_script(
    entity=wf,
    image_config=ImageConfig.auto_default_image(),
    version="v2",
    copy_all=True,
    source_path=".",
    # module_name="workflows"
)

remote.execute(
    entity=remote_wf,
    inputs={"name": "mike"}
)
➜  test_case git:(main) ✗ cat workflows/simple.py
import typing
from flytekit import task, workflow


@task
def say_hello(name: str) -> str:
    return f"hello {name}!"


@task
def greeting_length(greeting: str) -> int:
    return len(greeting)


@workflow
def wf(name: str = "union") -> typing.Tuple[str, int]:
    greeting = say_hello(name=name)
    greeting_len = greeting_length(greeting=greeting)
    return greeting, greeting_len
s
Good to know that you got `register_script` to work, Yuan. Thanks, Gaurav! We'll work on improving our docs and also look into the `register_workflow` method.