# announcements
i
Hi all! Could you please point me to a hello-world workflow example on Flyte where at least two tasks are executed in their own containers? I would like a single Python workflow file at flyte/workflow/example.py that uses a task option like this:
@task(container_image="registry.name.com/project/image:tag")
I am trying to build this example myself on my local Flyte sandbox, but I am getting this error message:
ModuleNotFoundError: No module named 'flyte'
My workflow from example.py:
@task(container_image="registry.name.com/project/image:tag")
def some_data_generation() -> PythonPickledFile:
    with open(BASE_FILE_PATH) as file:
        some_descriptors = json.load(file)

    some_set = generate_some_set(some_descriptors[0])
    with open(PICKLE_PATH, 'wb') as handle:
        pickle.dump(some_set, handle)
        return PICKLE_PATH

@task(container_image="registry.name.com/project/image2:tag")
def load_pickle_dump(dump_file_path: PythonPickledFile) -> set:
    with open(dump_file_path, 'rb') as handle:
        return pickle.load(handle)

@workflow
def my_wf() -> set:
    dump_file_path = some_data_generation()
    return load_pickle_dump(dump_file_path=dump_file_path)

if __name__ == "__main__":
    a = my_wf()
    print(f"Running my_wf() {a}")
What I have narrowed down so far:
1. The workflow works fine without containers (without @task(container_image=...)).
2. The images work fine if the workflow contains only one task and its Docker image includes the Flyte workflow folder with this file (example.py).
3. The problem appears when I build the Docker image for the first task without the Flyte workflow files inside (but with the initial data files, to skip downloading).
4. I am sure I could drop the second task from this test case entirely and the problem would still appear.
Root cause: my Docker image does not contain the Flyte workflow (example.py), but it seems this code is required inside the container for the task to run. I do not understand how to split example.py between two tasks if it is actually supposed to run outside the tasks (it is a workflow, and according to the example it should contain the tasks). Error:
[1/1] currentAttempt done. Last Error: USER::Pod failed. No message received from kubernetes.
[j3nl8bafcr-n0-0] terminated with exit code (1). Reason [Error]. Message: 
thon3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/opt/venv/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/venv/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/flytekit/bin/entrypoint.py", line 460, in execute_task_cmd
    _execute_task(
  File "/opt/venv/lib/python3.8/site-packages/flytekit/exceptions/scopes.py", line 160, in system_entry_point
    return wrapped(*args, **kwargs)
  File "/opt/venv/lib/python3.8/site-packages/flytekit/bin/entrypoint.py", line 327, in _execute_task
    _task_def = resolver_obj.load_task(loader_args=resolver_args)
  File "/opt/venv/lib/python3.8/site-packages/flytekit/core/python_auto_container.py", line 189, in load_task
    task_module = importlib.import_module(task_module)
  File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 973, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'flyte'
As far as I can see, this is caused by the Docker container not containing the Flyte workflow. But if I place the workflow in the image for the first task, how should it be split correctly between the two tasks? Shouldn't the code for the first task be left out of the second task's image/container? Am I on the right track at all? Could you please share a simple, short Python workflow example with two tasks in separate containers?
k
Hi @illarion Disabled, first of all, welcome to the Flyte community. Let me read through this and answer the question in a little bit.
๐Ÿ‘ 1
i
k
Sorry busy with kids will take time to reply. Saturday mornings. Unless anyone else can @Kevin Su ?
k
Looking
@illarion Disabled your Dockerfile should be located at the same level as the flyte directory. For example:
myflyteapp
├── Dockerfile
├── docker_build_and_tag.sh
├── flyte
│   ├── __init__.py
│   └── workflows
│       ├── __init__.py
│       └── example.py
└── requirements.txt
i
@Kevin Su But how would it look with two tasks in different Docker containers?
Root example.py:
@workflow
def my_wf():
    a = task1()   # in container1, according to the args in the task decorator
    b = task2(a)  # in container2, according to the args in the task decorator
What should the content of example.py look like in container1 and container2?
k
how it should be correctly splitted between two tasks? Should not include code for first task inside the second task image/container?
Most of the time, tasks use the same image, which means your image contains the code for both the first and the second task. Even though the image is the same, those tasks run in different containers (or pods in k8s). In each container, Flyte executes only one function (your task). You don't need two different Dockerfiles, one for the first task and another for the second. The tasks and the workflow can all be built from a single Dockerfile.
😮 1
Or do you want those tasks to run in different containers and with different images? By default, the tasks run in different containers but from the same image.
You can also use a different image for each task, but your Dockerfiles should still be located at the same level as your module.
myflyteapp
├── Dockerfile1
├── Dockerfile2
├── docker_build_and_tag.sh
├── flyte
│   ├── __init__.py
│   └── workflows
│       ├── __init__.py
│       └── example.py
└── requirements.txt
👍 1
i
@Kevin Su Thank you for the answer! Yes, I thought Flyte would let me prepare different Docker images for different tasks, because in my case they will be very different. Some of them will probably need a different OS altogether, some should contain TensorFlow, and some are expected to be tiny and just execute a simple bash script.
Thank you very much!
k
Yeah, in that case it's better to use different images. You can try the method above; let me know if you have any problems.
👍 1
k
You can build multiple images and pass them in the @task container_image field
Let me share docs
๐Ÿ‘ 1
I was going to write an example too
I will be able to write an example only later today or tomorrow
In the package command you can pass the multiple images
Or use config
There is a better syntax coming in 2 weeks
๐Ÿ‘ 1
i
Having an example in the docs would be perfect! In my opinion, splitting tasks across different containers is the natural second step after a hello world.
Personally, I do not clearly understand how my tasks will be isolated and executed separately inside each container if the complete workflow is passed to each one as is. To my non-professional eye it looks like the same workflow would be executed twice, once in each container. I guess I am wrong about this.
k
In each container, we execute only a task, not the whole workflow. Taking the code example above, instead of running the command
python example.py
in the container, we run
pyflyte-execute --task-module flyte.workflow.load_pickle_dump
Therefore, each container executes only a single task.
๐Ÿ‘ 1
i
Now it is clear to me! Thank you very much!
👍 1
k
@illarion Disabled maybe I misunderstood you. You do not need 2 containers. You can use the same container for all tasks, as @Kevin Su said. You need 2 containers / 2 Dockerfiles only if you have different requirements per task. For example, one is GPU and another is non-GPU, or one is used for Spark or Horovod based distributed training. Even in these cases Flyte allows you to use the same container
So if you look at all Examples with more than one task
They use the same Container
Just follow the idea and let Flyte figure out how to run your functions
i
@Ketan (kumare3) For this demonstration I simplified the code, so here it looks like the same container is enough. But in practice my tasks are quite different, and it would be better to have different containers for them. I would rather split requirements.txt per task, since it will differ for each one.
k
@illarion Disabled got it, but you also do not need too many containers for small differences
Much better to reuse the containers
That optimizes startup time too