My subworkflow inside a flyte conditional receives...
# ask-the-community
v
My subworkflow inside a flyte conditional receives inputs from these 5 tasks, but it doesn’t appear to depend on them in the graph UI. I think that’s because the “conditional” itself doesn’t depend on them, but the workflow it calls does. Is it possible to configure dependencies for a conditional, or assign this dependency explicitly? This is mostly a visual issue, but I’m also getting an error locally which I suspect may be related:
AttributeError: 'FlyteBranchNode' object has no attribute 'interface'
(it does not abort my run on flyte remote, so it is tolerable)
s
Would you mind sharing the code snippet? Also, are you seeing that error when you run the pyflyte run command locally?
v
I run using flytekit’s register_script() and execute(); I’ll share the relevant code pieces.
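For reference, a minimal sketch of that registration/execution flow (the project, domain, module path, and workflow name below are placeholders, not the actual pipeline):
Copy code
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote

from pipelines.main import my_workflow  # hypothetical module and @workflow object

# Placeholder project/domain values.
remote = FlyteRemote(
    config=Config.auto(),
    default_project="my-project",
    default_domain="development",
)

# Register the workflow directly from source, without pyflyte run.
registered_wf = remote.register_script(
    my_workflow,
    source_path=".",
    module_name="pipelines.main",  # hypothetical module path
    version="v1",
)

# Launch it; remote.wait() blocks until the whole execution finishes.
execution = remote.execute(registered_wf, inputs={})
execution = remote.wait(execution)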
s
An issue has already been created to track this error: https://github.com/flyteorg/flyte/issues/3628
v
I am aware of this issue with wait, but the reason it has to wait in the first place is because the branch node is waiting for the other tasks to finish, so it can use their outputs, no?
Copy code
from flytekit import conditional, workflow

@workflow
def smplx_proccess_pipeline(
    ...
    ):

    run_fit = should_run_fit(...)  # returns a boolean

    (
        conditional("should_run_fit")
        .if_(run_fit.is_true())
        .then(shape_fit_flow(
            ... inputs ...
        ))
        .else_()
        .then(dummy_task())
    )
where shape_fit_flow is a @workflow that has tasks that use these inputs. I suspect that if the outputs were ready by the time the conditional ran, this may not have happened, and the view on the UI would be more logically correct (because the subworkflow does depend on these outputs, but this is not reflected visually).
Another image to visually explain it: if I could explicitly define dependencies for the conditional, it would probably wait until the inputs were ready, these sections here would not overlap, and there would probably be no issue with the branch node wait. Or maybe there would still be issues; in that case I would put the dependencies on the task that decides the conditional (should_run_fit). Then the wait would be on the task and not on a conditional, and it should also work. But I don’t want to configure these files as inputs for that task, because it takes a moment to download them and the task doesn’t need them. I’d like to configure it to run after those tasks without relying on inputs/outputs. Either way, I still need to configure a dependency here.
s
I am aware of this issue with wait, but the reason it has to wait in the first place is because the branch node is waiting for the other tasks to finish, so it can use their outputs, no?
FlyteRemote’s wait will wait for the whole execution to finish.
I suspect that if the outputs were ready by the time the conditional happened this may not have happened
That should be the case. Which flytekit version have you installed?
v
My flytekit version is 1.6.2. I had to add dynamic parallelism later on in this subworkflow, so I changed it from @workflow to @dynamic and the error went away.
Update 2023/06/20: the error did not go away; it was only because of caching that I did not hit it. I updated my cache version and the outputs were not ready soon enough for the branch node, causing this error again.
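For context, a minimal sketch of the two pieces involved here (task names, return types, and the cache_version value are illustrative, not the actual pipeline):
Copy code
from typing import List

from flytekit import dynamic, task

# Cached task: bumping cache_version invalidates previously cached outputs,
# so downstream nodes wait on a fresh run instead of getting a cache hit.
@task(cache=True, cache_version="2.0")
def heavy_step(x: int) -> int:
    return x * 2

# Dynamic workflow: the inner DAG is compiled at runtime, which allows
# fanning out over values only known during execution.
@dynamic
def fan_out(n: int) -> List[int]:
    return [heavy_step(x=i) for i in range(n)]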
s
Oh interesting.
v
Solved using explicit dependencies with create_node:
Copy code
    motion_capture_task = create_node(git_clone, repo_url=mo_cap_repo, branch=mo_cap_git_tag).with_overrides(name="Clone CGMocap")
    core_algo_task = create_node(git_clone, repo_url=core_algo_repo, branch=core_algo_git_tag).with_overrides(name="Clone CoreAlgo")
    evaluators_task = create_node(git_clone, repo_url=evaluators_repo, branch=evaluators_git_tag).with_overrides(name="Clone Evaluators")
    fitting_task = create_node(git_clone, repo_url=ts_data_repo, branch=fitting_git_tag).with_overrides(name="Clone Fitting")
    common_algo_task = create_node(git_clone, repo_url=common_algo_repo, branch=common_algo_tag).with_overrides(name="Clone CommonAlgo")

    motion_capture_repo = motion_capture_task.o0
    core_algo_repo = core_algo_task.o0
    evaluators_repo = evaluators_task.o0
    fitting_repo = fitting_task.o0
    common_algo_repo = common_algo_task.o0

    run_fit = create_node(should_run_fit,
        takes_path=parsed_session_info.expression_paths_concatenated,
        source_bucket_name=parsed_session_info.scan_bucket,
        data_version=fitting_git_tag,
        bucket_name=output_bucket_name,
        force_fit=force_run_fit,
    )

    motion_capture_task >> run_fit
    core_algo_task >> run_fit
    evaluators_task >> run_fit
    fitting_task >> run_fit
    common_algo_task >> run_fit


    run_fit_conditional = (
        conditional("should_run_fit")
        .if_(run_fit.o0.is_true())
        .then(shape_fit_flow(
            fitting_repo=fitting_repo,
            core_algo_repo=core_algo_repo,
            common_algo_repo=common_algo_repo,
            motion_capture_repo=motion_capture_repo,
            smplx_version=smplx_version,
            debug_fit=debug_fit,
            shape_input_path=parsed_session_info.shape_input_path,
            scan_bucket=parsed_session_info.scan_bucket,
            is_ipad=parsed_session_info.is_ipad,
            gender=parsed_session_info.gender,
            user_id=parsed_session_info.user_id,
            session_id=parsed_session_info.session_id,
            prefix_for_algo_bucket=parsed_session_info.prefix_for_algo_bucket,
            fitting_git_tag=fitting_git_tag,
            output_bucket_name=output_bucket_name,
            expression_paths=parsed_session_info.expression_paths,
        ))
        .else_()
        .then(skip())
    )
Hope it’s not too hacky. By setting the clone tasks to run before the task that decides the conditional’s boolean, using >>, the outputs are guaranteed to be ready in time, so the error does not happen and the graph view is more accurate to my expectations. Thanks!
s
Nice. You should be able to chain tasks without using create_node: https://docs.flyte.org/projects/cookbook/en/latest/auto/core/control_flow/chain_entities.html
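For reference, a minimal sketch of that chaining pattern from the linked doc (task names and bodies are illustrative): inside a @workflow, >> between two promises orders the nodes without wiring any inputs to outputs.
Copy code
from flytekit import task, workflow

@task
def prepare() -> int:
    return 1

@task
def decide() -> bool:
    return True

@workflow
def chained_wf() -> bool:
    prep = prepare()
    decision = decide()
    # Run prepare() before decide() without passing any data between them.
    prep >> decision
    return decision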
v
This makes everything so much simpler:
Copy code
    run_fit = should_run_fit(
        takes_path=parsed_session_info.expression_paths_concatenated,
        source_bucket_name=parsed_session_info.scan_bucket,
        data_version=fitting_git_tag,
        bucket_name=output_bucket_name,
        force_fit=force_run_fit,
    )

    (motion_capture_repo := git_clone(repo_url=mo_cap_repo, branch=mo_cap_git_tag).with_overrides(name="Clone CGMocap")) >> run_fit
    (core_algo_repo := git_clone(repo_url=core_algo_repo, branch=core_algo_git_tag).with_overrides(name="Clone CoreAlgo")) >> run_fit
    (evaluators_repo := git_clone(repo_url=evaluators_repo, branch=evaluators_git_tag).with_overrides(name="Clone Evaluators")) >> run_fit
    (fitting_repo := git_clone(repo_url=ts_data_repo, branch=fitting_git_tag).with_overrides(name="Clone Fitting")) >> run_fit
    (common_algo_repo := git_clone(repo_url=common_algo_repo, branch=common_algo_tag).with_overrides(name="Clone CommonAlgo")) >> run_fit

    conditional("should_run_fit").if_(run_fit.is_true()
    ).then(shape_fit_flow(
        fitting_repo=fitting_repo,
        core_algo_repo=core_algo_repo,
        common_algo_repo=common_algo_repo,
        motion_capture_repo=motion_capture_repo,
        smplx_version=smplx_version,
        debug_fit=debug_fit,
        shape_input_path=parsed_session_info.shape_input_path,
        scan_bucket=parsed_session_info.scan_bucket,
        is_ipad=parsed_session_info.is_ipad,
        gender=parsed_session_info.gender,
        user_id=parsed_session_info.user_id,
        session_id=parsed_session_info.session_id,
        prefix_for_algo_bucket=parsed_session_info.prefix_for_algo_bucket,
        fitting_git_tag=fitting_git_tag,
        output_bucket_name=output_bucket_name,
        expression_paths=parsed_session_info.expression_paths,
    )).else_().then(skip())
Works well. Thanks for the tip!