Hey i have a question regarding the max_parallelism option for running a workflow remotely. Even tou...

faint-activity-87590

05/30/2023, 4:29 PM

Hey, I have a question regarding the max_parallelism option for running a workflow remotely. Even though I configured max_parallelism, it always seems to spin up 20 pods at a time. Does anyone know what's going wrong here?

average-finland-92144

05/30/2023, 6:09 PM

@faint-activity-87590 is this a map task?

faint-activity-87590

05/30/2023, 6:11 PM

Hey David 🙂 No, it's a regular task within a dynamic subworkflow.

average-finland-92144

05/30/2023, 6:14 PM

And are you setting max_parallelism in the launch plan?

faint-activity-87590

05/30/2023, 6:16 PM

LaunchPlan.get_or_create(
    name="soft-energy-launchplan",
    workflow=wf,
    max_parallelism=10,
    notifications=[
        Slack(
            phases=[
                WorkflowExecutionPhase.SUCCEEDED,
                WorkflowExecutionPhase.FAILED,
                WorkflowExecutionPhase.ABORTED,
                WorkflowExecutionPhase.TIMED_OUT,
            ],
            recipients_email=["does-not-matter-what-you-put-in-here"],
        ),
    ],
)

Yes, like this, and it made no difference. For some reason, registering this launch plan and executing it doesn't even show me the parallelism value in the UI, just 0, but that's another problem 😛

tall-lock-23197

05/31/2023, 5:30 AM

I think the max_parallelism value isn't being respected for tasks in a dynamic workflow. It should, however, work correctly for tasks in a simple workflow. @high-accountant-32689, is this something we need to support?

faint-activity-87590

05/31/2023, 6:17 AM

Ah okay, this changes a lot! So for tasks in a dynamic workflow, is flyteadmin's configuration for max_parallelism being respected? It would be really nice to control this like in a normal workflow.

faint-activity-87590

05/31/2023, 9:02 AM

Okay, interesting outcome of an experiment I just did. This is my simple workflow:

from flytekit import workflow, task, dynamic, LaunchPlan, Resources, FixedRate
import time
from datetime import timedelta


@task(requests=Resources(cpu="500m", mem="500Mi"))
def say_hello(input: str):
    time.sleep(60)
    print(f"Hello {input}")


@workflow
def wf():
    counter = 50
    for i in range(counter):
        say_hello(input="World")


LaunchPlan.get_or_create(
    name="p-launchplan",
    workflow=wf,
    max_parallelism=2,
    schedule=FixedRate(duration=timedelta(minutes=3)),
)

Executing this and observing the pods in Kubernetes showed that Flyte launches only 2 pods at a time but does not wait until those pods are done. So the count basically increments by 2 over time and still ends up with 50 running pods at some point. I really thought max_parallelism reflects the amount of RUNNING pods at a time. Is this intended, or am I missing something?
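
To make the difference between the two readings concrete, here is a small toy simulation. It is only a sketch of the behavior described above and not Flyte's actual scheduling code; the function peak_running and its parameters are hypothetical names made up for illustration. One mode launches up to `cap` new tasks per evaluation round regardless of what is already running (what the experiment seemed to show), the other only tops up to `cap` running tasks (what I expected).

```python
def peak_running(total, cap, duration, per_round_only):
    """Toy model: return the peak number of simultaneously running tasks.

    total          -- number of tasks to launch (50 in the experiment)
    cap            -- the parallelism limit (2 in the experiment)
    duration       -- how many rounds each task keeps running
    per_round_only -- True: launch up to `cap` NEW tasks every round,
                      ignoring tasks that are still running.
                      False: never exceed `cap` running tasks.
    """
    finish_times = []  # round at which each launched task finishes
    launched = 0
    peak = 0
    tick = 0
    while launched < total:
        # Tasks whose finish round is still in the future are running.
        running = sum(1 for f in finish_times if f > tick)
        if per_round_only:
            budget = cap
        else:
            budget = max(0, cap - running)
        to_launch = min(budget, total - launched)
        finish_times.extend([tick + duration] * to_launch)
        launched += to_launch
        running += to_launch
        peak = max(peak, running)
        tick += 1
    return peak


# Observed behavior: 2 new pods per round, nothing finishes in time.
observed = peak_running(total=50, cap=2, duration=60, per_round_only=True)
# Expected behavior: at most 2 pods running at any moment.
expected = peak_running(total=50, cap=2, duration=60, per_round_only=False)
print(observed, expected)
```

With tasks that run much longer than one round, the first mode ends up with all 50 tasks running at once, while the second never goes above 2.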

tall-lock-23197

06/02/2023, 5:17 PM

"I really thought max_parallelism reflects the amount of RUNNING pods at a time."

Yeah, I think it should. How's that workflow working for you? You cannot use a loop in a Flyte workflow.

millions-night-34157

07/05/2023, 3:19 PM

@faint-activity-87590 Which command are you using to register this launch plan? In my case, the flytectl register files command is trying to execute the launch plan from the local environment itself.

faint-activity-87590

07/05/2023, 3:28 PM

I think I always used pyflyte register.

🙏 1
