
Ena Škopelja

over 2 years ago
Hi all, building my first more complex flyte workflow and I have a couple of questions on best practices. I have a workflow defined in
project_1
that includes a dynamic workflow. The dynamic workflow starts a launchplan defined in
project_2
(see attached sketch). This is my code structure:
|- project_1 /
|  |- pyproject.toml
|  |- project_1 /
|  |  |- __init__.py
|  |  |- workflows.py
|  |  |- tasks.py
|  |  |- models.py
|  |  |- ...
|- project_2 /
|  |- pyproject.toml
|  |- project_2 /
|  |  |- __init__.py
|  |  |- workflows.py
|  |  |- tasks.py
|  |  |- models.py
|  |  |- ...
|- Dockerfile
|- docker_build.sh
What I'm doing right now:
• add project_2 as a dependency of project_1
• build a Docker image with project_2 installed as a dependency of project_1
• package + register
• run
This approach requires rebuilding the Docker image for every execution, because fast register does not realise that the launch plan defined in project_2 should have a different version than the rest of the workflow, and it fails while trying to fetch the workflow. Another option would be to do something like:
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote
remote = FlyteRemote(config=Config.auto())
launchplan = remote.fetch_launch_plan(...)
but that would require my pod to be able to reach flyteadmin, because the call happens inside a dynamic workflow. Is there anything I can do to avoid rebuilding the Docker image for every run? Am I doing something wrong? Any advice is much appreciated.
Another issue is that the way I run things now makes the UI fail to expand the dynamic workflow, and it doesn't provide any link to the newly spawned workflows, so I can't reliably track the intermediate states or see why something failed, if it did (see screenshot, the purple task is dynamic). It also leads to all sorts of mixed status reports (see other screenshot).

illarion Disabled

over 3 years ago
Hi all! Could you please help me: can I use PythonPickledFile between raw containers? Right now I am trying to use this data type with the method described here https://docs.flyte.org/projects/cookbook/en/latest/auto/core/containerization/raw_container.html and getting
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
when trying to read the file with pickle.load(file), inside:
with open(f'{input_dir}/dump_file') as file:
        raw_data = pickle.load(file)
It is saved without errors by:
with open(f'{output_dir}/dump_file', 'wb') as handle:
        pickle.dump(raw_data, handle)
in the previous stage's raw container:
stage1 = ContainerTask(
    name="stage1",
    output_data_dir="/var/outputs",
    outputs=kwtypes(generation=PythonPickledFile),
    image="registry.project.com/flyte/stage1:0.0.1",
    command=[
        "python",
        "stage1.py",
        "/var/outputs",
    ],
)

stage2 = ContainerTask(
    name="stage2",
    input_data_dir="/var/inputs",
    output_data_dir="/var/outputs",
    inputs=kwtypes(dump_file=PythonPickledFile),
    outputs=kwtypes(stage2_out=str),
    image="registry.project.com/flyte/stage2:0.0.1",
    command=[
        "python",
        "stage2.py",
        "/var/inputs",
        "/var/outputs",
    ],
)

@workflow
def my_wf() -> str:
    generation = stage1()
    return stage2(dump_file=generation)
Can I pass PythonPickledFile like that?
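A note on the UnicodeDecodeError quoted above: byte 0x80 at position 0 is how a binary pickle stream usually starts, so the error is consistent with the reading side opening the file in text mode (open(...) without 'rb') while the writing side uses 'wb'. The sketch below reproduces this with plain Python and no Flyte involved. The helper names write_dump, read_dump and read_dump_text_mode are hypothetical, and the file name dump_file is taken from the snippets above. It only covers the pickle round trip; whether the raw containers pick up the file under that name is a separate question it does not answer.

```python
import pickle
import tempfile
from pathlib import Path


def write_dump(output_dir, obj):
    """Write obj as a pickle, in binary mode, as the writing stage does."""
    path = Path(output_dir) / "dump_file"
    with open(path, "wb") as handle:
        pickle.dump(obj, handle)
    return path


def read_dump(input_dir):
    """Read the pickle back in binary mode ('rb')."""
    with open(Path(input_dir) / "dump_file", "rb") as file:
        return pickle.load(file)


def read_dump_text_mode(input_dir):
    """Same read, but in text mode, mirroring the failing snippet."""
    with open(Path(input_dir) / "dump_file", encoding="utf-8") as file:
        return pickle.load(file)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_dump(tmp, {"generation": [1, 2, 3]})
        print(read_dump(tmp))  # binary mode round-trips the object
        try:
            read_dump_text_mode(tmp)
        except UnicodeDecodeError as err:
            print("text mode fails:", err)
```

If this matches what stage2.py does, changing its open call to use 'rb' would be the first thing to try.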