Having trouble running `pyflyte run --remote core/...
# flyte-deployment
a
Having trouble running `pyflyte run --remote core/flyte_basics/hello_world.py my_wf` after setting up a cluster on gcloud. I think it has something to do with my SSL setup being incorrect?
k
What's the error?
We need to improve error messages for gRPC.
a
Dang, I lost the original error while trying to debug this.
I think the core thing I’m confused about is this:
The tutorial talks about how to set up a Google-managed SSL cert, but it seems like we can’t actually use that without GKE ingress, and the rest of the tutorial uses nginx-ingress?
When I try the nginx/cert-manager way described in the tutorial, that doesn't work:
helm install cert-manager --namespace flyte --version v0.12.0 jetstack/cert-manager
Error: failed to install CRD crds/certificaterequests.yaml: unable to recognize "": no matches for kind "CustomResourceDefinition" in version "apiextensions.k8s.io/v1beta1"
Ok, got past that issue (happy to share what I did), but getting this error now
details = "failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadata": {
      "method": "google.iam.credentials.v1.IAMCredentials.SignBlob",
      "service": "iamcredentials.googleapis.com"
    },
    "reason": "ACCESS_TOKEN_SCOPE_INSUFFICIENT"
  }
]"
k
I think the service account for FlyteAdmin does not have the right IAM permissions
a
I added the permission to FlyteAdmin as instructed, actually
In fact I added it to all Flyte roles to see if that helps
k
This is only admin
Seems right
Did you use Workload Identity to attach it?
a
In case it is helpful, here’s what I did to fix the earlier SSL problem: https://docs.google.com/document/d/1skJWmt3hJoIuPQr_RfR-gB9wlatVSIcSD5VlBylJqd8/edit?usp=sharing
k
Cc @Smriti Satyan, can we doc this?
a
If anyone has any advice on fixing these IAM issues, I’d really appreciate it! I’ve tried a lot of different things
k
Ohh sorry, cc @jeev, do you know?
j
hmm not sure. @Armaan Goel: where was the `ACCESS_TOKEN_SCOPE_INSUFFICIENT` message? locally?
flyte shouldn't be using signed urls for gcp.
do you have this config for admin, propeller, and data catalog?
storage:
  type: stow
  stow:
    kind: google
    config:
      json: ""
      # replace with the GCP project ID
      project_id:
      scopes: https://www.googleapis.com/auth/devstorage.read_write
k
@jeev in fact we do, this is for pyflyte run
j
would that make a difference @Ketan (kumare3)? going based off of this error:
details = "failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.
it shouldn't have to create signed urls, right?
i'm curious where this error message was observed. was it in the flyte control plane or locally when running the `pyflyte run` command?
a
I think it was in the control plane; it looked like a remote error that was just relayed over gRPC
Let me find the exact error
armaangoel78@cloudshell:~/Gauntlet/flytesnacks/cookbook (gauntletai)$ pyflyte run --remote core/flyte_basics/hello_world.py my_wf
{"asctime": "2022-09-19 19:16:13,698", "name": "flytekit.cli", "levelname": "ERROR", "message": "Non-auth RPC error <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNKNOWN\n\tdetails = \"failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.\nDetails:\n[\n  {\n    \"@type\": \"type.googleapis.com/google.rpc.ErrorInfo\",\n    \"domain\": \"googleapis.com\",\n    \"metadata\": {\n      \"method\": \"google.iam.credentials.v1.IAMCredentials.SignBlob\",\n      \"service\": \"iamcredentials.googleapis.com\"\n    },\n    \"reason\": \"ACCESS_TOKEN_SCOPE_INSUFFICIENT\"\n  }\n]\"\n\tdebug_error_string = \"{\"created\":\"@1663614973.698578904\",\"description\":\"Error received from peer ipv4:35.199.161.247:443\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":966,\"grpc_message\":\"failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.\\nDetails:\\n[\\n  {\\n    \"@type\": \"type.googleapis.com/google.rpc.ErrorInfo\",\\n    \"domain\": \"googleapis.com\",\\n    \"metadata\": {\\n      \"method\": \"google.iam.credentials.v1.IAMCredentials.SignBlob\",\\n      \"service\": \"iamcredentials.googleapis.com\"\\n    },\\n    \"reason\": \"ACCESS_TOKEN_SCOPE_INSUFFICIENT\"\\n  }\\n]\",\"grpc_status\":2}\"\n>, sleeping 200ms and retrying"}
{"asctime": "2022-09-19 19:16:13,917", "name": "flytekit.cli", "levelname": "ERROR", "message": "Non-auth RPC error <_InactiveRpcError of RPC that terminated with:\n\tstatus = StatusCode.UNKNOWN\n\tdetails = \"failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.\nDetails:\n[\n  {\n    \"@type\": \"type.googleapis.com/google.rpc.ErrorInfo\",\n    \"domain\": \"googleapis.com\",\n    \"metadata\": {\n      \"method\": \"google.iam.credentials.v1.IAMCredentials.SignBlob\",\n      \"service\": \"iamcredentials.googleapis.com\"\n    },\n    \"reason\": \"ACCESS_TOKEN_SCOPE_INSUFFICIENT\"\n  }\n]\"\n\tdebug_error_string = \"{\"created\":\"@1663614973.917603494\",\"description\":\"Error received from peer ipv4:35.199.161.247:443\",\"file\":\"src/core/lib/surface/call.cc\",\"file_line\":966,\"grpc_message\":\"failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.\\nDetails:\\n[\\n  {\\n    \"@type\": \"type.googleapis.com/google.rpc.ErrorInfo\",\\n    \"domain\": \"googleapis.com\",\\n    \"metadata\": {\\n      \"method\": \"google.iam.credentials.v1.IAMCredentials.SignBlob\",\\n      \"service\": \"iamcredentials.googleapis.com\"\\n    },\\n    \"reason\": \"ACCESS_TOKEN_SCOPE_INSUFFICIENT\"\\n  }\\n]\",\"grpc_status\":2}\"\n>, sleeping 400ms and retrying"}
Traceback (most recent call last):
  File "/home/armaangoel78/.local/bin/pyflyte", line 8, in <module>
    sys.exit(main())
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/flytekit/clis/sdk_in_container/run.py", line 539, in _run
    remote_entity = remote.register_script(
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/flytekit/remote/remote.py", line 596, in register_script
    upload_location, md5_bytes = fast_register_single_script(
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/flytekit/tools/script_mode.py", line 113, in fast_register_single_script
    upload_location = create_upload_location_fn(content_md5=md5)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/flytekit/clients/friendly.py", line 998, in get_upload_signed_url
    return super(SynchronousFlyteClient, self).create_upload_location(
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/flytekit/clients/raw.py", line 41, in handler
    return fn(*args, **kwargs)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/flytekit/clients/raw.py", line 854, in create_upload_location
    return self._dataproxy_stub.CreateUploadLocation(create_upload_location_request, metadata=self._metadata)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/grpc/_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/home/armaangoel78/.local/lib/python3.9/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNKNOWN
        details = "failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.ErrorInfo",
    "domain": "googleapis.com",
    "metadata": {
      "method": "google.iam.credentials.v1.IAMCredentials.SignBlob",
      "service": "iamcredentials.googleapis.com"
    },
    "reason": "ACCESS_TOKEN_SCOPE_INSUFFICIENT"
  }
]"
        debug_error_string = "{"created":"@1663614974.333626746","description":"Error received from peer ipv4:35.199.161.247:443","file":"src/core/lib/surface/call.cc","file_line":966,"grpc_message":"failed to create a signed url. Error: unable to sign bytes: googleapi: Error 403: Request had insufficient authentication scopes.\nDetails:\n[\n  {\n    "@type": "type.googleapis.com/google.rpc.ErrorInfo",\n    "domain": "googleapis.com",\n    "metadata": {\n      "method": "google.iam.credentials.v1.IAMCredentials.SignBlob",\n      "service": "iamcredentials.googleapis.com"\n    },\n    "reason": "ACCESS_TOKEN_SCOPE_INSUFFICIENT"\n  }\n]","grpc_status":2}"
>
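For reference, the structured part of this error can be pulled out of the details string programmatically, which makes the actual cause (`ACCESS_TOKEN_SCOPE_INSUFFICIENT` on `SignBlob`) easier to spot than in the raw log. The following is a minimal Python sketch and not part of flytekit: `extract_error_info` is a hypothetical helper, and it assumes the details text keeps the `Details:` marker followed by a JSON list, as in the output above.

```python
import json


def extract_error_info(details: str) -> list:
    """Return the JSON objects that follow the 'Details:' marker, if any.

    Hypothetical helper: assumes the error text has the shape shown in the
    traceback above (free text, then 'Details:', then a JSON list).
    """
    marker = "Details:"
    idx = details.find(marker)
    if idx == -1:
        return []
    payload = details[idx + len(marker):].strip()
    try:
        parsed = json.loads(payload)
    except json.JSONDecodeError:
        # The payload was truncated or is not JSON; report nothing.
        return []
    if isinstance(parsed, dict):
        parsed = [parsed]
    return [item for item in parsed if isinstance(item, dict)]


# Sample text copied from the error in this thread.
sample = (
    "failed to create a signed url. Error: unable to sign bytes: googleapi: "
    "Error 403: Request had insufficient authentication scopes.\n"
    "Details:\n"
    "[\n"
    "  {\n"
    '    "@type": "type.googleapis.com/google.rpc.ErrorInfo",\n'
    '    "domain": "googleapis.com",\n'
    '    "metadata": {\n'
    '      "method": "google.iam.credentials.v1.IAMCredentials.SignBlob",\n'
    '      "service": "iamcredentials.googleapis.com"\n'
    "    },\n"
    '    "reason": "ACCESS_TOKEN_SCOPE_INSUFFICIENT"\n'
    "  }\n"
    "]"
)

for info in extract_error_info(sample):
    # Print only the fields that identify what failed and why.
    print(info.get("reason"), info.get("metadata", {}).get("method"))
```

With a live call, the same string is typically available from the caught exception's `details()` method, since the errors grpc raises from a call also implement `grpc.Call`.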
Also @jeev how did you get that view for the admin, propeller, etc. config?
I just see a list of permissions on the console at least
j
it should be in the respective kubernetes configmaps
you can go to Kubernetes Engine > Secrets and ConfigMaps and browse there
a
Oh, I'll take a look
j
or just do `kubectl get configmap -n flyte <name of configmap> -o yaml`
this looks like the likely source:
File "/home/armaangoel78/.local/lib/python3.9/site-packages/flytekit/clients/raw.py", line 854, in create_upload_location
    return self._dataproxy_stub.CreateUploadLocation(create_upload_location_request, metadata=self._metadata)
k
@jeev the error is returned from FlyteAdmin.
FlyteAdmin tries to sign the URL,
so my guess is that for some reason the permissions for GCP / GCS are inadequate.
I just do not know how - cc @Haytham Abuelfutuh
j
you probably just need additional permissions to sign the urls: https://cloud.google.com/storage/docs/access-control/signing-urls-manually#prereqs
see the starred note
TIL about pyflyte run 😛
actually the above permissions look like they already encompass that
@Armaan Goel: these look like custom roles in GCP. are these roles attached to the serviceaccount bound to the flyte components?
h
@Armaan Goel Sorry you are having trouble with this! Signed URLs on GCS definitely work, so let’s figure out where the configs went wrong (and then update the docs to tell people exactly what they need 😄). To cover our bases, let’s try canned examples first:
flytectl config init --host <flyte url>
flytectl register examples -p flytesnacks -d development
This should register all examples we have in flytesnacks… then go to the UI and run any of the workflows… let’s confirm this scenario works first…
a
Hey @jeev
yeah, I think they are attached to the GCP roles using Workload Identity
Hmm, different error with that, Haytham:
 ----------------------------------------------------------------------------------------------- -------- -------------------------------------------------------
| NAME                                                                                          | STATUS | ADDITIONAL INFO                                       |
 ----------------------------------------------------------------------------------------------- -------- -------------------------------------------------------
| /tmp/register1646303769/snacks-cookbook-case_studies-bioinformatics-blast/0__bash.blastx_1.pb | Failed | Error registering file due to rpc error: code =       |
|                                                                                               |        | Unavailable desc = name resolver error: produced zero |
|                                                                                               |        | addresses                                             |
 ----------------------------------------------------------------------------------------------- -------- -------------------------------------------------------
1 rows
Error: example 0xc0007b0ab0 failed to register rpc error: code = Unavailable desc = name resolver error: produced zero addresses
j
i'm starting to doubt workload identity or just plain permissions assignment more
a
hmm
So sorry to add some more confusion to the mix, but I’m starting from scratch on a brand new project
But I’m wondering, does this line
gcloud iam service-accounts add-iam-policy-binding --role "roles/iam.workloadIdentityUser" --member "serviceAccount:${PROJECT_ID}.svc.id.goog[flyte/flyteadmin]" gsa-flyteadmin@${PROJECT_ID}.iam.gserviceaccount.com
need to be run after the cluster is created?
It errors, and I don't think it can be run because we haven't created a cluster yet
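The `--member` value in that command follows the Workload Identity pattern `serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE/KSA_NAME]`. As far as the GKE documentation describes it, the `PROJECT_ID.svc.id.goog` pool only exists once Workload Identity has been enabled on at least one cluster in the project, which would be consistent with the binding failing before any cluster exists. As a hedged illustration only, the Python sketch below assembles the member string and the full argument list so the pieces can be checked before running anything. The helper names and the `my-project` value are hypothetical; the namespace and service account names are taken from the command above. It prints the command and does not call gcloud.

```python
import shlex


def workload_identity_member(project_id, namespace, ksa_name):
    """Build the member string for a Kubernetes service account."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def binding_command(project_id, namespace, ksa_name, gsa_name):
    """Return the gcloud argument list for the binding shown in the thread."""
    gsa_email = f"{gsa_name}@{project_id}.iam.gserviceaccount.com"
    return [
        "gcloud", "iam", "service-accounts", "add-iam-policy-binding",
        "--role", "roles/iam.workloadIdentityUser",
        "--member", workload_identity_member(project_id, namespace, ksa_name),
        gsa_email,
    ]


# "my-project" is a placeholder project ID, not a value from this thread.
argv = binding_command("my-project", "flyte", "flyteadmin", "gsa-flyteadmin")

# shlex.join quotes the member so the square brackets are safe in a shell.
print(shlex.join(argv))
```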
j
wait if you dont have a cluster, where is flyte running?
a
sorry, yeah, I'm confusing things by restarting
all of the stuff I showed earlier is running on a Flyte cluster,
I just started a new project, where I am following the tutorial from scratch, so I haven't gotten to the cluster creation step yet, and I have this separate issue
WOOOOOHOOOO I got it!
I’ll document what I did in case people run into the same thing in the future
but basically yes you do need to turn on workload identity for the cluster, but you also need to turn on “GKE Metadata Server” for the individual node pools (https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#option_2_node_pool_modification)
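The two settings described here map, as far as the linked GKE page describes them, to one cluster-level flag and one node-pool-level flag. The Python sketch below only assembles and prints the corresponding gcloud invocations so they can be reviewed first. The function name and the project, cluster, node pool, and region values are placeholders, and the flags should be checked against the current GKE documentation before use.

```python
import shlex


def workload_identity_commands(project_id, cluster, node_pool, region):
    """Return the two gcloud argument lists: cluster first, then node pool."""
    enable_on_cluster = [
        "gcloud", "container", "clusters", "update", cluster,
        f"--region={region}",
        # Cluster-level switch: turns on Workload Identity for the cluster.
        f"--workload-pool={project_id}.svc.id.goog",
    ]
    enable_on_node_pool = [
        "gcloud", "container", "node-pools", "update", node_pool,
        f"--cluster={cluster}",
        f"--region={region}",
        # Node-pool-level switch: runs the GKE Metadata Server on the nodes.
        "--workload-metadata=GKE_METADATA",
    ]
    return [enable_on_cluster, enable_on_node_pool]


# All four values below are hypothetical placeholders.
commands = workload_identity_commands(
    "my-project", "flyte-cluster", "default-pool", "us-central1"
)
for argv in commands:
    print(shlex.join(argv))
```

The node pool command would need to be repeated for each existing node pool that runs Flyte pods, since the setting is per pool.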
Really appreciate everyone lending a hand to take a look here!
j
glad to hear it worked out @Armaan Goel!