mammoth-parrot-74806
01/16/2025, 2:55 PM
pipelines/flyte.py.
The workflows and tasks depend on custom Python modules (like steps) and also read other files, such as the configuration template in config/project_conf.yml.
I was able to register my workflow and get it working by running this from the src directory: pyflyte register steps pipelines.
I would like to do this programmatically using FlyteRemote. I have been able to register my workflow using register_workflow(entity, version, project, domain), but it then fails to find files (such as config/project_conf.yml itself). I think it may be related to the root it uses by default when registering this way, but I'm not sure how to proceed. The register_workflow function is called when running the main.py file. Do you know what I am doing wrong and how I could fix it? Let me know if further explanation or details are needed to fully understand the issue, and thanks a lot in advance 😄
average-finland-92144
01/16/2025, 3:06 PM
.register_script root by including `source_path`
https://docs.flyte.org/en/latest/api/flytekit/generated/flytekit.remote.remote.FlyteRemote.html#flytekit.remote.remote.FlyteRemote.register_script
From the docs, it seems `copy_all` is deprecated, but you should be able to use something like `fast_package_options={copy_style=copy}` (reference) to make it go recursively through the contents of the folder
mammoth-parrot-74806
01/16/2025, 4:25 PM
remote = FlyteRemote(config=Config.for_sandbox(), default_project='flytesnacks', default_domain='development')
registered_script = remote.register_script(
    entity=inference_workflow,
    version="1.0.0",
    source_path=".",
    fast_package_options=FastPackageOptions([], copy_style="copy", show_files="show_files")
)
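Since `source_path="."` is resolved against whatever directory the script happens to be launched from, one option is to compute the root explicitly. The following is only a sketch under that assumption: the helper name `find_project_root` and the choice of `config/project_conf.yml` as the marker file are hypothetical and are not part of flytekit.

```python
from pathlib import Path


def find_project_root(start, marker="config/project_conf.yml"):
    """Walk upward from `start` until a directory containing `marker` is found.

    Hypothetical helper: `marker` is assumed to be a file that only exists
    relative to the project root.
    """
    start = Path(start).resolve()
    # Begin at the file's own directory (or at `start` itself if it is a directory).
    first = start if start.is_dir() else start.parent
    # Check that directory and then each of its ancestors in turn.
    for candidate in (first, *first.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"no directory above {start} contains {marker}")
```

The result could then be passed as `source_path=str(find_project_root(__file__))`, so that registration no longer depends on the current working directory.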
😄
average-finland-92144
01/16/2025, 4:43 PM
mammoth-parrot-74806
01/17/2025, 10:07 AM
remote module. For this, I am using this code:
from flytekit import CronSchedule, LaunchPlan  # CronSchedule is used below
from src.pipelines.flyte import inference_workflow
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config

remote = FlyteRemote(config=Config.for_sandbox(), default_project='flytesnacks', default_domain='development')
cron_lp_every_five_minutes = LaunchPlan.get_or_create(
    name="scheduled_lp",
    workflow=inference_workflow,
    schedule=CronSchedule(schedule="*/5 * * * *"),  # every 5 minutes
)
registered_launchplan = remote.register_launch_plan(
    entity=cron_lp_every_five_minutes,
    version="4.0.0"
)
As far as I have seen in the documentation, it seems to be correct, but I get an error when registering (even including the name field):
Users/me/python3.11/site-packages/flytekit/core/tracker.py:337 in _task_module_from_callable
AttributeError: 'LaunchPlan' object has no attribute '__name__'
I have tried several configurations but I can't find the problem. Any suggestions that I can try?
Thanks again in advance!
average-finland-92144
01/17/2025, 7:39 PM
mammoth-parrot-74806
01/20/2025, 9:30 AM
register_script) to which I want to assign a new LaunchPlan in order to schedule it when I need it.
It seems that by default register_launch_plan tries to re-register my workflow, which is not the goal; I only want to modify an existing one. Despite using the same name and version as the already deployed workflow, the error persists.
Is there a solution or alternative for scheduling existing workflows, or is it an ongoing issue?
mammoth-parrot-74806
01/20/2025, 10:08 AM
flytekit. I've managed to register the launch plan this way:
cron_lp_every_five_minutes = LaunchPlan.get_or_create(
    name="scheduled_lp",
    workflow=inference_workflow,
    schedule=CronSchedule(schedule="*/5 * * * *"),  # every 5 minutes
    default_inputs={}
)
remote.register_launch_plan(entity=cron_lp_every_five_minutes, version="4.0.0")
The key thing was to use the same version for the LaunchPlan as the one used when registering the Workflow with the register_script function. Just in case it is useful for anyone else 🙌
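One way to keep the two versions from drifting apart is to pass a single version string to both calls. This is a minimal sketch, assuming `remote` is a FlyteRemote (or anything exposing `register_script` and `register_launch_plan` with the keyword arguments used in this thread); the helper name `register_with_shared_version` is hypothetical.

```python
def register_with_shared_version(remote, workflow, launch_plan, version, source_path="."):
    """Register `workflow` and then `launch_plan` under one version string.

    Hypothetical helper: it only forwards the keyword arguments already used
    in this thread, so `remote` can be any object that accepts them.
    """
    # Register the workflow (and the code it depends on) first.
    remote.register_script(entity=workflow, version=version, source_path=source_path)
    # Reuse the exact same version so the launch plan matches the workflow just registered.
    return remote.register_launch_plan(entity=launch_plan, version=version)
```

For example, `register_with_shared_version(remote, inference_workflow, cron_lp_every_five_minutes, "4.0.0")` would replace the two separate calls.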
Thanks again for your help, David!
mammoth-parrot-74806
01/22/2025, 12:10 PM
flyte-metadata-bucket.
3. A new policy must be created to be able to access S3, as indicated here. Should the allowed S3 bucket be the flyte-metadata-bucket, or a new one that must be created to store outputs from my workflows?
4. Regarding the roles: for both the flyte-system-role and the flyte-workers-role, apart from the sts:AssumeRoleWithWebIdentity permissions, should the policy created in 3. also be attached to both, or, since the bucket is related to metadata, is it only needed on the flyte-system-role?
5. Finally, regarding the Helm chart: is it in charge of creating the aforementioned Service Accounts for both the flyte-system-role and the flyte-workers-role, or should I create them myself? It is not clear to me, as the documentation says that Helm will take care of it, but the commands mentioned there also create the Service Account 🤔
I am creating every resource using Terraform. I saw the tf template for the deployment, but it only exists for the flyte-core deployment, not for the flyte-binary one, and a few things like naming and needed resources differ between the guide and the template's code (even comparing it with the code under the flyte-binary tf template in this branch).
Thanks a lot in advance, step by step I am getting closer to having my first ML pipeline running in production! 😄