Hi everyone. I'm new to Flyte, so please bear with me.
I started out with this tutorial:
https://docs.flyte.org/en/v1.0.0/getting_started/index.html
All fine so far: I can run the example with the --remote flag.
Next, I would like to run the example on a remote Kubernetes cluster hosted on AWS. I went through all the steps here:
https://docs.flyte.org/en/v1.0.0/deployment/aws/manual.html
They all seem to have succeeded:
kubectl -n flyte get ingress
NAME              CLASS    HOSTS   ADDRESS                                                     PORTS   AGE
flyte-core        <none>   *       k8s-flyte-123456789.eu-west-1.elb.amazonaws.com             80      17h
flyte-core-grpc   <none>   *       k8s-flyte-123456789-721009295.eu-west-1.elb.amazonaws.com   80      17h
I updated the config.yaml file to point to the ingress endpoints:
cat ~/.flyte/config.yaml
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: dns:///k8s-flyte-123456789.eu-west-1.elb.amazonaws.com
  insecureSkipVerify: true
  authType: Pkce
  insecure: true
logger:
  show-source: true
  level: 6
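As a sanity check on the endpoint value, here is a minimal Python sketch that strips the dns:/// scheme from the configured endpoint and reports which of the two ingress addresses listed above it matches. The helper names (endpoint_host, matching_ingress) and the INGRESSES dictionary are hypothetical and only mirror the values pasted in this post; nothing here calls Flyte or Kubernetes.

```python
# Hypothetical helpers for illustration; the values are copied from the
# `kubectl get ingress` output and the config.yaml shown in this post.
INGRESSES = {
    "flyte-core": "k8s-flyte-123456789.eu-west-1.elb.amazonaws.com",
    "flyte-core-grpc": "k8s-flyte-123456789-721009295.eu-west-1.elb.amazonaws.com",
}


def endpoint_host(endpoint: str) -> str:
    """Return the bare host name from an endpoint such as 'dns:///host:port'."""
    prefix = "dns:///"
    host = endpoint[len(prefix):] if endpoint.startswith(prefix) else endpoint
    # Drop an optional ':port' suffix.
    return host.split(":", 1)[0]


def matching_ingress(endpoint: str, ingresses: dict) -> list:
    """List the ingress names whose address equals the endpoint host."""
    host = endpoint_host(endpoint)
    return [name for name, address in ingresses.items() if address == host]


configured = "dns:///k8s-flyte-123456789.eu-west-1.elb.amazonaws.com"
print(matching_ingress(configured, INGRESSES))
```

With the values in this post, the endpoint matches the flyte-core ingress rather than flyte-core-grpc. Whether that is related to the error further down is only a guess, but it may be worth ruling out.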
I also updated my environment variables to point to the config.yaml:
echo $FLYTECTL_CONFIG
/home/my-username/.flyte/config.yaml
echo $KUBECONFIG
/home/my-username/.kube/config:/home/gajus/.flyte/k3s/k3s.yaml
Here's the problem. When I run the following command, I get the error shown below:
FLYTE_SDK_LOGGING_LEVEL=20 pyflyte run --remote example.py wf --n 500 --mean 42 --sigma 2
{"asctime": "2023-09-15 08:07:54,296", "name": "flytekit", "levelname": "INFO", "message": "Using flytectl/YAML config /home/my-username/.flyte/config.yaml"}
{"asctime": "2023-09-15 08:07:54,298", "name": "flytekit", "levelname": "INFO", "message": "Setting protocol to file"}
{"asctime": "2023-09-15 08:07:54,303", "name": "flytekit", "levelname": "INFO", "message": "Using flytectl/YAML config /home/my-username/.flyte/config.yaml"}
{"asctime": "2023-09-15 08:07:54,304", "name": "flytekit", "levelname": "INFO", "message": "Using flytectl/YAML config /home/my-username/.flyte/config.yaml"}
{"asctime": "2023-09-15 08:07:54,630", "name": "flytekit", "levelname": "INFO", "message": "We won't register PyTorchCheckpointTransformer, PyTorchTensorTransformer, and PyTorchModuleTransformer because torch is not installed."}
{"asctime": "2023-09-15 08:07:54,641", "name": "flytekit", "levelname": "INFO", "message": "Setting protocol to file"}
{"asctime": "2023-09-15 08:07:54,641", "name": "flytekit", "levelname": "INFO", "message": "Setting protocol to file"}
{"asctime": "2023-09-15 08:07:54,642", "name": "flytekit", "levelname": "INFO", "message": "Setting protocol to file"}
{"asctime": "2023-09-15 08:07:54,642", "name": "flytekit", "levelname": "INFO", "message": "Setting protocol to file"}
{"asctime": "2023-09-15 08:07:54,643", "name": "flytekit", "levelname": "INFO", "message": "We won't register bigquery handler for structured dataset because we can't find the packages google-cloud-bigquery-storage and google-cloud-bigquery"}
{"asctime": "2023-09-15 08:07:54,684", "name": "flytekit", "levelname": "INFO", "message": "Using flytectl/YAML config /home/my-username/.flyte/config.yaml"}
{"asctime": "2023-09-15 08:07:54,685", "name": "flytekit.cli", "levelname": "INFO", "message": "Creating remote with config Config(platform=PlatformConfig(endpoint='k8s-flyte-123456789.eu-west-1.elb.amazonaws.com', insecure=True, insecure_skip_verify=True, console_endpoint=None, command=None, client_id=None, client_credentials_secret=None, scopes=[], auth_mode='Pkce'), secrets=SecretsConfig(env_prefix='_FSEC_', default_dir='/etc/secrets', file_prefix=''), stats=StatsConfig(host='localhost', port=8125, disabled=False, disabled_tags=False), data_config=DataConfig(s3=S3Config(enable_debug=False, endpoint=None, retries=3, backoff=datetime.timedelta(seconds=5), access_key_id=None, secret_access_key=None), gcs=GCSConfig(gsutil_parallelism=False)), local_sandbox_path='/tmp/flyteysnmzvj4')"}
{"asctime": "2023-09-15 08:07:54,884", "name": "flytekit.cli", "levelname": "INFO", "message": "Flyte Client configured -> k8s-flyte-123456789.eu-west-1.elb.amazonaws.com in insecure mode."}
{"asctime": "2023-09-15 08:07:54,886", "name": "flytekit.cli", "levelname": "ERROR", "message": "Non-auth RPC error <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server\"\n\tdebug_error_string = \"UNKNOWN:Failed to pick subchannel {created_time:\"2023-09-15T08:07:54.8865895+02:00\", children:[UNKNOWN:failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server {grpc_status:14, created_time:\"2023-09-15T08:07:54.8865848+02:00\"}]}\"\n>, sleeping 200ms and retrying"}
{"asctime": "2023-09-15 08:07:55,087", "name": "flytekit.cli", "levelname": "ERROR", "message": "Non-auth RPC error <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNAVAILABLE\n\tdetails = \"failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server\"\n\tdebug_error_string = \"UNKNOWN:Failed to pick subchannel {created_time:\"2023-09-15T08:07:55.0874841+02:00\", children:[UNKNOWN:failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server {grpc_status:14, created_time:\"2023-09-15T08:07:55.0874779+02:00\"}]}\"\n>, sleeping 400ms and retrying"}
Traceback (most recent call last):
  File "/home/my-username/miniconda3/envs/flyte/bin/pyflyte", line 8, in <module>
    sys.exit(main())
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/flytekit/clis/sdk_in_container/run.py", line 529, in run
    remote_entity = remote.register_script(
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/flytekit/remote/remote.py", line 671, in register_script
    upload_location, md5_bytes = fast_register_single_script(
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/flytekit/tools/script_mode.py", line 111, in fast_register_single_script
    upload_location = create_upload_location_fn(content_md5=md5)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/flytekit/clients/friendly.py", line 998, in get_upload_signed_url
    return super(SynchronousFlyteClient, self).create_upload_location(
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/flytekit/clients/raw.py", line 41, in handler
    return fn(*args, **kwargs)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/flytekit/clients/raw.py", line 856, in create_upload_location
    return self._dataproxy_stub.CreateUploadLocation(create_upload_location_request, metadata=self._metadata)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/my-username/miniconda3/envs/flyte/lib/python3.10/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.UNAVAILABLE
    details = "failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server"
    debug_error_string = "UNKNOWN:Failed to pick subchannel {created_time:"2023-09-15T08:07:55.488686825+02:00", children:[UNKNOWN:failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server {created_time:"2023-09-15T08:07:55.4886694+02:00", grpc_status:14}]}"
This seems to have something to do with gRPC not liking my self-signed certificate. Does anyone have any ideas on how to fix this? Any help would be greatly appreciated. (P.S. I am running pyflyte on WSL, Ubuntu 20.04.)
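In case it helps with diagnosis: the "Trying to connect an http1.x server" message suggests the address answered with HTTP/1.x rather than HTTP/2, which gRPC requires. Below is a rough standard-library sketch for checking that from the client side. The function names (classify_reply, probe) are hypothetical, the check is only a heuristic based on the first bytes of the reply, and it assumes a plaintext connection on port 80 as in the ingress output above.

```python
import socket

# HTTP/2 client connection preface, sent before any frames (RFC 9113, section 3.4).
H2_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
# Frame type code of a SETTINGS frame, which an HTTP/2 server sends first.
SETTINGS_FRAME_TYPE = 0x04


def classify_reply(data: bytes) -> str:
    """Guess the protocol from the first bytes a server sends back (heuristic)."""
    if data.startswith(b"HTTP/1."):
        return "http1"
    # An HTTP/2 frame header is 9 bytes: 3 length, 1 type, 1 flags, 4 stream id.
    if len(data) >= 9 and data[3] == SETTINGS_FRAME_TYPE:
        return "http2"
    return "unknown"


def probe(host: str, port: int = 80, timeout: float = 5.0) -> str:
    """Send the cleartext HTTP/2 preface plus an empty SETTINGS frame and classify the reply."""
    empty_settings = b"\x00\x00\x00\x04\x00\x00\x00\x00\x00"
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(H2_PREFACE + empty_settings)
        return classify_reply(sock.recv(64))
```

Calling probe() once with each of the two ingress addresses would show whether either of them speaks HTTP/2 on port 80. A result of "http1" for the configured endpoint would be consistent with the error above, though it would not by itself explain the cause.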