Flyte #flyte-support

mammoth-parrot-74806

01/16/2025, 2:55 PM

Hi everyone! 👋 I've been testing Flyte for a few weeks using the sandbox locally before deploying it to production on EKS. I am working on a template that facilitates the creation of workflows and their registration in Flyte (I attach the directory structure in the image to give the full context). The file in which the tasks and workflows are instantiated is `pipelines/flyte.py`. The workflows and tasks depend on custom Python modules (like `steps`) and also read other files, such as the configuration template in `config/project_conf.yml`. I was able to register my workflow and get it to work properly by running `pyflyte register steps pipelines` from the `src` directory. I would like to do this programmatically using `FlyteRemote`. I have been able to register my workflow using `register_workflow(entity, version, project, domain)`, but it then reports problems finding files (like `config/project_conf.yml` itself). I think it may be related to the root it uses by default when registering this way, but I'm not sure how to proceed. The `register_workflow` function is called when running the `main.py` file. Do you know where I am going wrong and how I could fix it? Let me know if further explanation or details are needed to fully understand the issue, and thanks a lot in advance 😄

average-finland-92144

01/16/2025, 3:06 PM

Hey Hugo, welcome! You can set the root for `register_script` by including `source_path`: https://docs.flyte.org/en/latest/api/flytekit/generated/flytekit.remote.remote.FlyteRemote.html#flytekit.remote.remote.FlyteRemote.register_script From the docs, it seems that `copy_all` is deprecated, but you should be able to use something like `fast_package_options= {copy_style=copy}` (reference) to make it go recursively through the contents of the folder.

mammoth-parrot-74806

01/16/2025, 4:25 PM

Awesome David, thanks a lot for the quick response! It worked using this configuration, in case it helps anyone in the same situation:

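from flytekit.configuration import Config
from flytekit.remote import FlyteRemote
from flytekit.tools.fast_registration import FastPackageOptions

remote = FlyteRemote(config=Config.for_sandbox(), default_project='flytesnacks', default_domain='development')

# Package the contents of source_path so that files such as config/project_conf.yml are included
registered_script = remote.register_script(
    entity=inference_workflow,
    version="1.0.0",
    source_path=".",
    fast_package_options=FastPackageOptions([], copy_style="copy", show_files="show_files")
)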

😄
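
A possible refinement, sketched under assumptions: a relative `source_path="."` is interpreted against the directory the script is launched from, so the packaged root changes if `main.py` is run from elsewhere. The helper below locates the source root by walking upward until it finds a known file. The name `find_source_root` and the `marker` default are hypothetical, chosen to match the layout described in this thread; they are not part of flytekit.

```python
from pathlib import Path


def find_source_root(start: Path, marker: str = "config/project_conf.yml") -> Path:
    """Return the closest directory at or above `start` that contains `marker`.

    `marker` is a path relative to the root being searched for; the default
    is an assumption based on the project layout described in the thread.
    """
    start = start.resolve()
    for candidate in [start, *start.parents]:
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"no directory at or above {start} contains {marker}")


# Hypothetical usage inside main.py:
#   source_root = find_source_root(Path(__file__).parent)
#   remote.register_script(..., source_path=str(source_root), ...)
```

The same root can also be used to open `config/project_conf.yml` with an absolute path, which avoids depending on the current working directory.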

average-finland-92144

01/16/2025, 4:43 PM

awesome, thanks for sharing!

mammoth-parrot-74806

01/17/2025, 10:07 AM

Hello again! 👋 Following along the same lines, I am now trying to register a Launch Plan using the Flyte `remote` module. For this, I am using this code:

from flytekit import CronSchedule, LaunchPlan
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote

from src.pipelines.flyte import inference_workflow

remote = FlyteRemote(config=Config.for_sandbox(), default_project='flytesnacks', default_domain='development')

cron_lp_every_five_minutes = LaunchPlan.get_or_create(
    name="scheduled_lp",
    workflow=inference_workflow,
    schedule=CronSchedule(schedule="*/5 * * * *"),  # every 5 minutes
)

registered_launchplan = remote.register_launch_plan(
    entity=cron_lp_every_five_minutes,
    version="4.0.0",
)

As far as I have seen in the documentation, it seems to be correct, but I get an error when registering (even when including the `name` field):

Users/me/python3.11/site-packages/flytekit/core/tracker.py:337 in _task_module_from_callable
AttributeError: 'LaunchPlan' object has no attribute '__name__'

I have tried several configurations but I can't find the problem. Any suggestions that I can try? Thanks again in advance!

average-finland-92144

01/17/2025, 7:39 PM

@mammoth-parrot-74806 you seem to be hitting this issue: https://github.com/flyteorg/flyte/issues/6062

mammoth-parrot-74806

01/20/2025, 9:30 AM

Exactly @average-finland-92144, the case is the same as in the last message of the issue: a workflow previously registered with the earlier code (`register_script`), to which I want to assign a new LaunchPlan in order to schedule it when I need it. It seems that by default `register_launch_plan` tries to re-register my workflow, but that is not the goal; the goal is to modify an existing one. Despite using the same `name` and `version` as the already deployed workflow, the error persists. Is there a solution or alternative for scheduling existing workflows, or is it an ongoing issue?

mammoth-parrot-74806

01/20/2025, 10:08 AM

Okay, I just saw that the PR was merged last week, and updating `flytekit` was enough. I managed to register the launch plan this way:

cron_lp_every_five_minutes = LaunchPlan.get_or_create(
    name="scheduled_lp",
    workflow=inference_workflow,
    schedule=CronSchedule(schedule="*/5 * * * *"),  # every 5 minutes
    default_inputs={},
)
remote.register_launch_plan(entity=cron_lp_every_five_minutes, version="4.0.0")

The key thing was to use the same version for the `LaunchPlan` as the one used when registering the `Workflow` with the `register_script` function. Just in case it is useful for anyone else 🙌 Thanks again for your help David!
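
One way to keep the two versions from drifting apart, as a minimal sketch and not something confirmed in this thread: derive a single version string once and pass it to both registration calls. The helper `content_version` is a hypothetical name; hashing the source files is one possible choice, and a plain shared constant would serve equally well.

```python
import hashlib
from pathlib import Path


def content_version(source_root: Path, pattern: str = "**/*.py") -> str:
    """Build a short, deterministic version string from the matching files.

    Both the relative file names and the file contents feed the hash, so the
    value changes whenever a matching file is edited, added, or renamed.
    """
    digest = hashlib.sha256()
    for path in sorted(source_root.glob(pattern)):
        if path.is_file():
            digest.update(path.relative_to(source_root).as_posix().encode())
            digest.update(path.read_bytes())
    return digest.hexdigest()[:12]


# Hypothetical usage: compute the version once, then reuse it.
#   VERSION = content_version(Path("src"))
#   remote.register_script(entity=inference_workflow, version=VERSION, ...)
#   remote.register_launch_plan(entity=cron_lp_every_five_minutes, version=VERSION)
```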

mammoth-parrot-74806

01/22/2025, 12:10 PM

Hi again! After testing Flyte's behaviour locally, I'm now set on deploying it to my EKS cluster in AWS. I'm following this guide to deploy in the Single Cluster (simple mode), but I have a few questions about the docs:

1. We already have an existing EKS cluster, so I assume I can skip the first steps that create the roles for the cluster and pods.
2. I created an S3 bucket for metadata purposes; let's call it `flyte-metadata-bucket`.
3. A new policy must be created to allow access to S3, as indicated here. Should the allowed S3 bucket be `flyte-metadata-bucket`, or a new one that must be created to store the outputs of my workflows?
4. Regarding the roles: for both `flyte-system-role` and `flyte-workers-role`, apart from the `sts:AssumeRoleWithWebIdentity` permissions, should the policy created in 3 also be attached, or, as the bucket is related to metadata, is it only needed in `flyte-system-role`?
5. Finally, regarding the Helm chart: is it the one in charge of creating the aforementioned Service Accounts for both `flyte-system-role` and `flyte-workers-role`, or should I create them myself? It is not clear to me, as the documentation says that Helm will take care of it, but the commands mentioned there also create the Service Account 🤔

I am creating every resource with Terraform. I saw the tf template for the deployment, but it only exists for the `flyte-core` deployment, not for the `flyte-binary` one, and a few things like naming and required resources differ between the guide and the template's code (even comparing it with the code under the `flyte-binary` tf template in this branch). Thanks a lot in advance; step by step I am getting closer to having my first ML pipeline running in production! 😄
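
For reference on point 3, here is a rough sketch of what a policy scoped to a single bucket can look like. It does not settle which bucket Flyte should use, and the list of actions is an assumption rather than a copy of the deployment guide. The function name `s3_bucket_policy` is hypothetical.

```python
import json


def s3_bucket_policy(bucket: str) -> dict:
    """Build an IAM policy document granting access to one S3 bucket.

    Listing applies to the bucket ARN itself, while object reads, writes,
    and deletes apply to the keys underneath it. The chosen actions are an
    assumption for illustration, not the guide's exact list.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


# Render the document as JSON, e.g. to paste into an IAM policy resource.
policy_json = json.dumps(s3_bucket_policy("flyte-metadata-bucket"), indent=2)
```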
