Erik Dao

over 2 years ago
Hello, I have a problem using FlyteRemote to register my workflow: Flyte complains that there are no nodes in my workflow. My codebase is in a Jupyter Notebook. The structure of my code is as follows:
my_package
|-- __init__.py
|-- data.py
|-- workflow.py
my_notebook.ipynb
My workflow.py is basically like this:
import pandas as pd
from flytekit import task, workflow

from data import generate_data, normalize_data


@task
def load_data() -> pd.DataFrame:
    return generate_data()

@task
def preprocess_data(data: pd.DataFrame) -> pd.DataFrame:
    return normalize_data(data)

@workflow
def simple_workflow():
    data = load_data()
    preprocess_data(data)
In my notebook, I first add the path to my local package to my system path, then create a FlyteRemote instance and try to register the workflow:
import os
import sys
sys.path.append(os.getcwd())
sys.path.append(os.path.join(os.getcwd(), "my_package"))

from flytekit.remote import FlyteRemote
from flytekit.configuration import Config, PlatformConfig, ImageConfig, SerializationSettings
from flytekit.configuration import DataConfig, S3Config

remote = FlyteRemote(
    config=Config(
        platform=PlatformConfig(
            endpoint=f"dns:///{os.environ['FLYTE_ENDPOINT']}",
            insecure=True,
            insecure_skip_verify=True,
        ),
        data_config=DataConfig(s3=S3Config(
            endpoint=os.environ['AWS_S3_ENDPOINT'],
            enable_debug=True,
            access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
            secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY']
        ))
    ),
    default_project="my_project",
    default_domain="development",
    data_upload_location=os.environ['FLYTE_S3_BUCKET'],
)

from my_package.workflow import simple_workflow

flyte_workflow = remote.register_script(
    simple_workflow,
    image_config=ImageConfig.auto_default_image(),
    version="v1",
    module_name="my_package",
    source_path="./"
)

remote.execute(flyte_workflow, inputs={})
The error I've been facing is:
_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.INTERNAL
        details = "failed to compile workflow for [resource_type:WORKFLOW 
project:"7b33c23287ae4d2481160448171b307c" domain:"development" 
name:"my_package.worklow.simple_workflow" version:"v1" ] with err failed to compile workflow with 
err Collected Errors: 1
        Error 0: Code: NoNodesFound, Node Id: resource_type:WORKFLOW project:"7b33c23287ae4d2481160448171b307c" 
domain:"development" name:""my_package.worklow.simple_workflow" version:"v1" , Description: Can't 
find any nodes in workflow [resource_type:WORKFLOW project:"7b33c23287ae4d2481160448171b307c" domain:"development" 
name:""my_package.worklow.simple_workflow" version:"v1" ].
"
        debug_error_string = "UNKNOWN:Error received from peer  {grpc_message:"failed to compile workflow for 
[resource_type:WORKFLOW project:\"7b33c23287ae4d2481160448171b307c\" domain:\"development\" 
name:\""my_package.worklow.simple_workflow\" version:\"v1\" ] with err failed to compile workflow 
with err Collected Errors: 1\n\tError 0: Code: NoNodesFound, Node Id: resource_type:WORKFLOW 
project:\"7b33c23287ae4d2481160448171b307c\" domain:\"development\" 
name:\""my_package.worklow.simple_workflow\" version:\"v1\" , Description: Can\'t find any nodes in 
workflow [resource_type:WORKFLOW project:\"7b33c23287ae4d2481160448171b307c\" domain:\"development\" 
name:\"my_package.worklow.simple_workflow\" version:\"v1\" ].\n", grpc_status:13, 
created_time:"2023-06-08T10:45:49.81363907+00:00"}"
Any idea what causes this problem and how to resolve it? Flyte seems to require a proper Python module structure, which might not be the case in a Jupyter notebook. Thanks.
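To make the module-structure question concrete, here is a minimal sketch, using only the standard library, of how Python resolves the modules when both the project root and my_package itself are on sys.path, as in the notebook above. The temporary directory and the placeholder file contents are assumptions made for illustration. The sketch only shows how the names resolve; it does not by itself establish the cause of the NoNodesFound error.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the layout from the post in a temporary directory.
# The file contents are placeholders, not the real project code.
root = Path(tempfile.mkdtemp())
pkg = root / "my_package"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "data.py").write_text("def generate_data():\n    return [1, 2, 3]\n")
(pkg / "workflow.py").write_text("from data import generate_data\n")

# Mirror the notebook: expose both the project root and the package directory.
# The notebook uses sys.path.append; insert(0, ...) is used here only so that
# the temporary files take precedence over anything else that is installed.
sys.path.insert(0, str(root))
sys.path.insert(0, str(pkg))

workflow = importlib.import_module("my_package.workflow")
top_level_data = sys.modules["data"]  # loaded by "from data import ..."
packaged_data = importlib.import_module("my_package.data")

print(workflow.__name__)                  # my_package.workflow
print(workflow.generate_data.__module__)  # data
print(top_level_data is packaged_data)    # False
```

The same data.py file ends up loaded as two distinct modules, "data" and "my_package.data", and the functions imported into workflow.py are attributed to the top-level name "data" rather than to the package.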
Tommy Nam

over 2 years ago
Hello community, does anybody have any insight into modifying the gRPC requests that are made in the backend after a successfully authenticated pyflyte run command with an external auth server? I can see that this is part of the flyteIdl repository, but I have not worked with Go or gRPC in the past, though I am open to trying nonetheless. We are receiving 403 Forbidden errors because the flyte-binary pod/deployment is unable to send the audience parameter. I am assuming that this happens between FlyteAdmin and FlytePropeller, though I could be wrong. Essentially, the flow is like this:
• The localhost/client machine sends a pyflyte run --remote command to the gRPC backend (AWS ALB with SSL/TLS)
• The auth request succeeds for pyflyte/flytekit, using Auth0 as the external auth server
• The web console registers and then displays the workflow with UNKNOWN status
• None of the Pods requested in the pyflyte command are scheduled
• Inspecting the flyte-binary deployment with kubectl shows that a missing audience parameter is needed
• Tons of requests fail with 403; the retry logic never seems to stop, which necessitates killing the deployment
• Auth0 logs show that all the requests fail due to the audience
The relevant GitHub issue, with further logs/details, is here: https://github.com/flyteorg/flyte/issues/3662
Any assistance in the matter would be greatly appreciated. We believe that this is the final step in getting Auth0 working with a flyte-binary deployment and would be more than glad to provide supporting documentation/code from forks if need be.
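For reference, this is a rough sketch of the kind of token request body that includes an audience, as an OAuth2 client-credentials request to Auth0 is generally expected to. The function name, client ID, secret, and audience URL are all hypothetical placeholders, and this is not how flyte-binary builds its requests internally; it only illustrates the parameter that appears to be missing.

```python
from urllib.parse import parse_qs, urlencode


def build_token_request_body(client_id: str, client_secret: str, audience: str) -> str:
    """Form-encode a client-credentials token request that carries `audience`.

    Hypothetical helper for illustration only; nothing is sent over the network.
    """
    return urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "audience": audience,  # the parameter reported as missing in the logs
        }
    )


# Placeholder values; substitute the ones configured for the Auth0 API.
body = build_token_request_body(
    "example-client-id", "example-secret", "https://flyte.example.com"
)
print(parse_qs(body)["audience"])  # ['https://flyte.example.com']
```

Comparing a body like this with what the failing requests actually contain (for example, in the Auth0 logs) may help narrow down which component drops the audience.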