#ask-the-community
h

HIMANSHU JOSHI

02/07/2023, 1:28 PM
Hey guys, I set up Flyte on GCP. I'm able to access the UI, but on running pyflyte run --remote sample_workflow.py main --x 10 --y 20 I'm getting this error:
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:172.25.0.46:443: Ssl handshake failed: SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED"
        debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:172.25.0.46:443: Ssl handshake failed: SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED {grpc_status:14, created_time:"2023-02-07T18:47:06.951145+05:30"}"
Does anybody have an idea what the problem is here? P.S. I have TLS disabled on the ingress.
b

Bernhard Stadlbauer

02/07/2023, 1:39 PM
Hi @HIMANSHU JOSHI! Do you have admin.insecure: True set in your ~/.flyte/config.yaml?
h

HIMANSHU JOSHI

02/07/2023, 1:41 PM
no it's set to false
admin:
  endpoint: "dns:///"
#  authType: Pkce
  insecure: false
  insecureSkipVerify: true
console:
  endpoint: ""
logger:
  show-source: true
  level: 0
If I set it to true, I'm not even able to use flytectl get projects.
@Bernhard Stadlbauer any suggestions on this?
b

Bernhard Stadlbauer

02/07/2023, 1:55 PM
Not really, sorry!
h

HIMANSHU JOSHI

02/07/2023, 2:27 PM
@here Can anybody else help with this?
d

David Espejo (he/him)

02/07/2023, 3:10 PM
@HIMANSHU JOSHI what IdP are you using for authentication?
Looking at previous messages in this channel, some users have experienced this error and adding the redirect URI to the IdP configuration has been helpful.
h

HIMANSHU JOSHI

02/07/2023, 4:09 PM
We've deployed Flyte in GKE with the gsa-flyteadmin service account. Apart from that, we're not using any IdP.
k

Ketan (kumare3)

02/07/2023, 4:16 PM
This is unauthenticated.
Cc @Eduardo Apolinario (eapolinario) / @Sujith Samuel, this seems to be a similar problem.
gRPC is having some SSL trouble with certs signed by your own org, but only in Python; flytectl works.
@HIMANSHU JOSHI the problem is that we are unable to reproduce this.
Can you run pip show grpcio?
h

HIMANSHU JOSHI

02/07/2023, 4:30 PM
Name: grpcio
Version: 1.51.1
Summary: HTTP/2-based RPC framework
Home-page: https://grpc.io
Author: The gRPC Authors
Author-email: grpc-io@googlegroups.com
License: Apache License 2.0
Location: "<package-path>"
Requires: 
Required-by: flytekit, grpcio-status
@Ketan (kumare3) ^ output of pip show grpcio
e

Eduardo Apolinario (eapolinario)

02/07/2023, 4:45 PM
@HIMANSHU JOSHI, just to confirm, those are self-signed certs you're using, right? If not, how did you generate the certs?
s

Sujith Samuel

02/07/2023, 4:49 PM
Yes. This is the same issue.
The Python gRPC client does not accept a self-signed cert.
h

HIMANSHU JOSHI

02/07/2023, 5:04 PM
What do we mean by a self-signed cert? The cert used to access the cluster?
a

Ankit Goyal

02/07/2023, 5:17 PM
When you access the console from the browser, do you get "Your connection is not private"?
s

Sujith Samuel

02/07/2023, 5:17 PM
By self-signed I meant a cert signed by my org's CA, not by Mozilla, DigiCert, and the like.
h

HIMANSHU JOSHI

02/07/2023, 5:18 PM
That's only the case if I use https://<dns>/console, but I use http://
a

Ankit Goyal

02/07/2023, 5:19 PM
I see. When you access it using https, can you click on "Not secure" and then on "Certificate is not valid"? It should show you the CA.
h

HIMANSHU JOSHI

02/07/2023, 5:19 PM
yaa it says certificate is not valid
a

Ankit Goyal

02/07/2023, 5:20 PM
If you click on Certificate is not valid, it will show you the CA:
h

HIMANSHU JOSHI

02/07/2023, 5:20 PM
yaa it shows something
a

Ankit Goyal

02/07/2023, 5:21 PM
What does Issued By -> Common Name (CN) say?
h

HIMANSHU JOSHI

02/07/2023, 5:21 PM
k8s ingress controller cert
a

Ankit Goyal

02/07/2023, 5:22 PM
ok, that seems like a self signed cert
Do you have the option to download the ca-bundle ?
h

HIMANSHU JOSHI

02/07/2023, 5:23 PM
Nope
a

Ankit Goyal

02/07/2023, 5:23 PM
What ingress controller are you using? Are you able to put a custom cert there?
h

HIMANSHU JOSHI

02/07/2023, 5:24 PM
nginx ingress
Btw, I was able to download a .cer file.
a

Ankit Goyal

02/07/2023, 5:26 PM
ok, in your config file, you can try pointing to it:
admin:
  
  caCertFilePath: /path/to/ca-bundle.crt
h

HIMANSHU JOSHI

02/07/2023, 5:27 PM
But flytectl is working with insecureSkipVerify: true.
It's just the pyflyte CLI that's causing the issue.
Do they not use the same config?
a

Ankit Goyal

02/07/2023, 5:27 PM
They do, but with the bundle you can enable verification.
We had the same issues with pyflyte and were able to fix them by using the proper certs.
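To check whether a downloaded CA bundle really lets a client verify the ingress certificate, a standard-library probe can help before touching the Flyte config. This is a minimal sketch: the function names are made up for illustration, and the Python gRPC client ships its own TLS stack, so a successful handshake here only suggests (it does not prove) that pyflyte will accept the same bundle.

```python
import socket
import ssl
from typing import Optional


def build_verifying_context(ca_bundle_path: Optional[str] = None) -> ssl.SSLContext:
    """Create a TLS client context that verifies the server certificate.

    With ca_bundle_path=None the system trust store is used; otherwise only
    the CAs in the given PEM bundle are trusted.
    """
    return ssl.create_default_context(cafile=ca_bundle_path)


def handshake_ok(host: str, port: int, context: ssl.SSLContext,
                 timeout: float = 5.0) -> bool:
    """Return True if a verified TLS handshake with host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            # server_hostname enables both SNI and hostname verification.
            with context.wrap_socket(raw, server_hostname=host):
                return True
    except OSError:
        # ssl.SSLError (including verification failures) is an OSError subclass.
        return False


context = build_verifying_context()  # pass the bundle path here to trust it
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling handshake_ok with the ingress host name, port 443, and a context built from the bundle should return True only when both the CA chain and the host name check pass.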
h

HIMANSHU JOSHI

02/07/2023, 5:30 PM
Now I get a "certificate is valid for ingress.local, not <dns>" error.
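That message means the CA check passed but the name on the certificate does not cover the host name being dialed. The sketch below illustrates the idea with a deliberately simplified matcher (exact match or a single leading wildcard label); real TLS libraries apply stricter rules, and the host name flyte.example.internal is a placeholder, not a value from this thread.

```python
from typing import Iterable


def name_matches(pattern: str, hostname: str) -> bool:
    """Simplified certificate-name match: exact, or one leading '*.' wildcard."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        # The wildcard stands for exactly one left-most label.
        head, _, tail = hostname.partition(".")
        return bool(head) and tail == pattern[2:]
    return pattern == hostname


def certificate_covers(cert_names: Iterable[str], hostname: str) -> bool:
    """Return True if any name on the certificate matches the dialed host."""
    return any(name_matches(name, hostname) for name in cert_names)


# The default ingress certificate in the thread is only valid for ingress.local.
print(certificate_covers(["ingress.local"], "flyte.example.internal"))  # False
print(certificate_covers(["ingress.local"], "ingress.local"))           # True
```

In other words, trusting the CA is not enough: the ingress would also need a certificate whose names include the DNS name used in the admin endpoint.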
a

Ankit Goyal

02/07/2023, 5:31 PM
i am curious, if you have a non-ssl endpoint available, why is SSL happening?
h

HIMANSHU JOSHI

02/07/2023, 5:33 PM
The DNS name is only accessible over a VPN. Is it because of that?
Also, one more question: when I explicitly set insecureSkipVerify: true, why doesn't pyflyte take that into consideration?
a

Ankit Goyal

02/07/2023, 5:33 PM
Are you using the non-SSL port in the Flyte configs?
Python gRPC doesn't behave well with skipping certificate verification.
I would try this: use the non-SSL endpoint with admin.insecure: True
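The combinations discussed so far can be summarised as a small checker over the admin section of the config. This is only a sketch that restates what participants reported in this thread; review_admin_config is a hypothetical helper, not part of flytekit or flytectl, and the notes are not official behaviour.

```python
from typing import Dict, List


def review_admin_config(admin: Dict[str, object]) -> List[str]:
    """Return notes about a Flyte `admin` config section, based on this thread."""
    notes: List[str] = []
    insecure = bool(admin.get("insecure", False))
    skip_verify = bool(admin.get("insecureSkipVerify", False))
    ca_path = admin.get("caCertFilePath")

    if insecure:
        # Plaintext gRPC: the endpoint and port must accept gRPC without TLS.
        notes.append("insecure=true: endpoint must serve gRPC without TLS")
    elif ca_path:
        notes.append("caCertFilePath set: the server cert is verified against this bundle")
    elif skip_verify:
        # Reported above: flytectl connects, pyflyte fails the TLS handshake.
        notes.append("insecureSkipVerify=true without a CA bundle: "
                     "pyflyte may still fail certificate verification")
    return notes


# Values copied from the config posted earlier in the thread.
config_from_thread = {"endpoint": "dns:///", "insecure": False, "insecureSkipVerify": True}
for note in review_admin_config(config_from_thread):
    print(note)
```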
h

HIMANSHU JOSHI

02/08/2023, 12:18 PM
Hey, can we package the whole module using pyflyte package? I tried it and it throws a "submodule not found" error. P.S. We still haven't resolved the SSL issue, so we can't use pyflyte register.
s

Samhita Alla

02/08/2023, 12:19 PM
pyflyte package should work. Could you share the exact error?
h

HIMANSHU JOSHI

02/08/2023, 12:20 PM
ModuleNotFoundError: No module named 'utils'
utils is a submodule in the module
s

Samhita Alla

02/08/2023, 12:21 PM
Does the package you're sending to pyflyte package contain utils?
h

HIMANSHU JOSHI

02/08/2023, 12:22 PM
pyflyte --pkgs dynamic_workflow package --image <image-name> -f -o dynamic_workflow.tgz
command for reference
Yes, the package I'm sending to pyflyte package contains utils.
@Samhita Alla am I using the wrong command, or is there another way to package the whole module?
s

Samhita Alla

02/08/2023, 12:26 PM
Can you do a relative import?
from .utils ...
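Why the relative import helps can be reproduced without Flyte. The sketch below builds a throwaway package in a temporary directory; the package and module names (dynamic_workflow_demo, demo_utils) are invented for the demonstration, with demo_utils standing in for the utils submodule from the thread.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout: a package with a helper submodule, as in the thread.
root = Path(tempfile.mkdtemp())
pkg = root / "dynamic_workflow_demo"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "demo_utils.py").write_text("def double(x):\n    return 2 * x\n")
# A relative import is resolved against the package itself.
(pkg / "main.py").write_text("from .demo_utils import double\n\nRESULT = double(21)\n")
# An absolute import only works if the package directory is on sys.path.
(pkg / "broken.py").write_text("import demo_utils\n")

# Only the parent of the package goes on sys.path, as when importing by package name.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

main = importlib.import_module("dynamic_workflow_demo.main")
print(main.RESULT)  # 42

broken_import_failed = False
try:
    importlib.import_module("dynamic_workflow_demo.broken")
except ModuleNotFoundError as exc:
    broken_import_failed = True
    print(exc)
```

The absolute import fails because only the package's parent directory is importable, which is presumably the situation when the code is loaded by package name rather than run from inside the package directory.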
h

HIMANSHU JOSHI

02/08/2023, 12:50 PM
The workflow did run after switching to a relative import, but propeller is throwing this error:
Error when trying to reconcile workflow
The dynamic task was supposed to create 5 different tasks, since I gave it 5 inputs. It did create them, but after 20 minutes they are still stuck in queued, propeller keeps throwing the error above, and only one pod was created for the dynamic task where I expected 5.
b

Bernhard Stadlbauer

02/08/2023, 12:52 PM
Hi @HIMANSHU JOSHI! Sorry, this is slightly unrelated, but do you also see rpc error: code = InvalidArgument desc = missing project somewhere in your propeller logs?
h

HIMANSHU JOSHI

02/08/2023, 12:52 PM
yess
[Feb 8, 2023 6:20:04 PM GMT+5]
{"json":{"exec_id":"ahlrb9l95q2h4gg8fhgf","node":"n0/dn0","ns":"development","res_ver":"55914809","routine":"worker-14","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"error","msg":"handling parent node failed with error: InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]","ts":"2023-02-08T12:50:04Z"}
[Feb 8, 2023 6:20:04 PM GMT+5]
{"json":{"exec_id":"ahlrb9l95q2h4gg8fhgf","node":"n0/dn0","ns":"development","res_ver":"55914809","routine":"worker-14","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"error","msg":"failed Execute for node. Error: InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]","ts":"2023-02-08T12:50:04Z"}
[Feb 8, 2023 6:20:04 PM GMT+5]
{"json":{"exec_id":"ahlrb9l95q2h4gg8fhgf","node":"n0","ns":"development","res_ver":"55914809","routine":"worker-14","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"error","msg":"handling dynamic subnodes failed with error: InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]","ts":"2023-02-08T12:50:04Z"}
[Feb 8, 2023 6:20:04 PM GMT+5]
{"json":{"exec_id":"ahlrb9l95q2h4gg8fhgf","node":"n0","ns":"development","res_ver":"55914809","routine":"worker-14","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"error","msg":"failed Execute for node. Error: InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]","ts":"2023-02-08T12:50:04Z"}
[Feb 8, 2023 6:20:04 PM GMT+5]
{"json":{"exec_id":"ahlrb9l95q2h4gg8fhgf","ns":"development","res_ver":"55914809","routine":"worker-14","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"warning","msg":"Error in handling running workflow [InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]]","ts":"2023-02-08T12:50:04Z"}
[Feb 8, 2023 6:20:04 PM GMT+5]
{"json":{"exec_id":"ahlrb9l95q2h4gg8fhgf","ns":"development","res_ver":"55914809","routine":"worker-14","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"error","msg":"Error when trying to reconcile workflow. Error [InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]]. Error Type[*errors.EventError]","ts":"2023-02-08T12:50:04Z"}
[Feb 8, 2023 6:20:04 PM GMT+5]
E0208 12:50:04.456196 1 workers.go:102] error syncing 'development/ahlrb9l95q2h4gg8fhgf': InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]
error logs ^
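Propeller logs like the ones above are mostly one JSON object per line, so repeated errors can be collapsed with a few lines of standard-library Python. This is a sketch for reading such logs, not a Flyte tool: summarize_errors is a hypothetical helper and the sample messages are shortened versions of the lines above.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_errors(lines: Iterable[str]) -> Counter:
    """Count distinct `msg` values among JSON log lines with level 'error'."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip timestamp lines and non-JSON (klog-style) lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if record.get("level") == "error":
            counts[record.get("msg", "")] += 1
    return counts


# Shortened sample in the same shape as the propeller output above.
sample = [
    "[Feb 8, 2023 6:20:04 PM GMT+5]",
    '{"json":{"node":"n0/dn0"},"level":"error","msg":"failed Execute for node: missing project"}',
    '{"json":{"node":"n0"},"level":"error","msg":"failed Execute for node: missing project"}',
    '{"json":{},"level":"warning","msg":"Error in handling running workflow"}',
    "E0208 12:50:04.456196 1 workers.go:102] error syncing",
]
for message, count in summarize_errors(sample).most_common():
    print(count, message)
```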
b

Bernhard Stadlbauer

02/08/2023, 12:53 PM
Ok, I feel like this might be related to this thread. I am currently trying to debug this in propeller
cc @Dan Rammer (hamersaw)
h

HIMANSHU JOSHI

02/08/2023, 12:59 PM
So, in short, a dynamic workflow always faces this issue? I tried it on the sandbox and it worked there,
it only fails on the remote cluster ^
b

Bernhard Stadlbauer

02/08/2023, 1:09 PM
I’ve just tried and seems like the error is in flytekit somewhere. Downgrading to 1.2.7 worked for me
I think the offending PR is this one. cc @Eduardo Apolinario (eapolinario)
h

HIMANSHU JOSHI

02/08/2023, 3:00 PM
Hey @Bernhard Stadlbauer, does map_task also face the same issue?
k

Ketan (kumare3)

02/08/2023, 3:01 PM
I think this is specific to dynamic. I am surprised this made it through, as it is indeed a bug. It should not need a project.
Project/domain should not matter for dynamic tasks
h

HIMANSHU JOSHI

02/08/2023, 3:01 PM
I meant: I know dynamic has this problem, but since map_task also spawns multiple pods, will it face the same problem?
k

Ketan (kumare3)

02/08/2023, 3:01 PM
We will follow up, please downgrade for now
Cc @Eduardo Apolinario (eapolinario)
h

HIMANSHU JOSHI

02/08/2023, 4:18 PM
{"json":{"exec_id":"as6j9bshk8qx5rfc6x7f","node":"n0","ns":"development","res_ver":"56079374","routine":"worker-0","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"error","msg":"failed Execute for node. Error: InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]","ts":"2023-02-08T16:14:31Z"}
{"json":{"exec_id":"as6j9bshk8qx5rfc6x7f","ns":"development","res_ver":"56079374","routine":"worker-0","wf":"flyteexamples:development:dynamic_workflow.main.dynamic_workflow"},"level":"warning","msg":"Error in handling running workflow [InvalidArgument: Invalid fields for event message, caused by [rpc error: code = InvalidArgument desc = missing project]]","ts":"2023-02-08T16:14:31Z"}
@Ketan (kumare3) @Bernhard Stadlbauer Still the same error, even after downgrading flytekit to 1.2.7. Do I also need to downgrade the flytepropeller / <all the deployments> Docker images to the same version?
e

Eduardo Apolinario (eapolinario)

02/08/2023, 4:23 PM
@HIMANSHU JOSHI, can you double-check you're running flytekit 1.2.7 again? This should not happen if you're using flytekit<1.3.0
h

HIMANSHU JOSHI

02/08/2023, 4:25 PM
yup it's 1.2.7
e

Eduardo Apolinario (eapolinario)

02/08/2023, 4:26 PM
ok, and how are you registering your workflows+tasks?
h

HIMANSHU JOSHI

02/08/2023, 4:26 PM
pyflyte --pkgs dynamic_workflow.main package --image <image-name> -f -o dynamic_workflow.tgz

flytectl register files --project flyteexamples --domain development --archive dynamic_workflow.tgz --version v2
^ @Eduardo Apolinario (eapolinario)
e

Eduardo Apolinario (eapolinario)

02/08/2023, 4:31 PM
sorry, I meant the version of flytekit installed in the image.
h

HIMANSHU JOSHI

02/08/2023, 4:33 PM
Oh, checking that, one sec.
Oh, my bad, I didn't change the flytekit version in the image. After changing that to 1.2.7 too, I no longer see the error. Thanks for the help!
e

Eduardo Apolinario (eapolinario)

02/08/2023, 11:42 PM
@HIMANSHU JOSHI, @Bernhard Stadlbauer, we just released flytekit 1.3.2 which should have a fix for this. We'll be following up on https://github.com/flyteorg/flyte/issues/3324 for the long term fix.
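Since the fix hinged on the flytekit version inside the task image rather than on the laptop, a quick check that can be run inside the container may save a round trip. The sketch assumes the affected range is 1.3.0 up to, but not including, 1.3.2, which is inferred from the messages above and not from release notes; the helper names are hypothetical.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' release string (no pre-release suffixes)."""
    return tuple(int(part) for part in version.split(".")[:3])


def possibly_affected(version: str) -> bool:
    """True for versions in the range inferred from the thread: [1.3.0, 1.3.2)."""
    return (1, 3, 0) <= parse_version(version) < (1, 3, 2)


def installed_flytekit_version() -> Optional[str]:
    """Return the flytekit version in the current environment, if installed."""
    try:
        return metadata.version("flytekit")
    except metadata.PackageNotFoundError:
        return None


found = installed_flytekit_version()
if found is None:
    print("flytekit is not installed in this environment")
else:
    print(found, "possibly affected" if possibly_affected(found) else "outside the inferred range")
```

Running the same snippet with the image's Python interpreter (for example through docker run) reports the version the tasks actually execute with.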
k

Ketan (kumare3)

02/09/2023, 1:13 AM
I would also say that we are extremely sorry to have released this. It was a complete miss and a break in our process.
h

HIMANSHU JOSHI

02/09/2023, 5:55 AM
Hey, isn't there any way of making an insecure grpc connection through pyflyte?
e

Eduardo Apolinario (eapolinario)

02/09/2023, 6:57 AM
@HIMANSHU JOSHI, you should be able to set insecure: true in the config file. What error are you seeing?
h

HIMANSHU JOSHI

02/09/2023, 1:27 PM
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server"
        debug_error_string = "UNKNOWN:Failed to pick subchannel {created_time:"2023-02-09T18:57:01.276539+05:30", children:[UNKNOWN:failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server {grpc_status:14, created_time:"2023-02-09T18:57:01.27653+05:30"}]}"
Sorry, I just saw the message. This is the error I'm getting for pyflyte if I set insecure: true. For flytectl I get:
rpc error: code = Unavailable desc = connection closed before server preface received
The flytectl error goes away if I use insecure: false and insecureSkipVerify: true, but then I get openssl handshake failed for pyflyte. P.S. TLS is disabled. Hope I was able to explain the problem 😅
@Eduardo Apolinario (eapolinario) Any suggestion?
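The three error strings seen in this thread point at different layers, which a small lookup makes easier to keep apart. The interpretations in the sketch are common readings of these messages, stated as assumptions rather than confirmed diagnoses for this cluster; classify_grpc_failure is a hypothetical helper.

```python
def classify_grpc_failure(details: str) -> str:
    """Map a gRPC error detail string from this thread to a likely cause."""
    text = details.lower()
    if "certificate_verify_failed" in text:
        # TLS was negotiated, but the client does not trust the server cert.
        return "tls: server certificate not trusted by the client"
    if "http1.x server" in text:
        # Assumption: a plaintext gRPC client reached an endpoint answering HTTP/1.x.
        return "plaintext: endpoint did not speak HTTP/2 on this port"
    if "connection closed before server preface received" in text:
        # Assumption: the server closed the connection before the HTTP/2 handshake.
        return "plaintext: server closed before the HTTP/2 preface (TLS mismatch?)"
    return "unknown"


errors_from_thread = [
    "Ssl handshake failed: SSL_ERROR_SSL: CERTIFICATE_VERIFY_FAILED",
    "failed to connect to all addresses; last error: INTERNAL: Trying to connect an http1.x server",
    "rpc error: code = Unavailable desc = connection closed before server preface received",
]
for details in errors_from_thread:
    print(classify_grpc_failure(details))
```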
k

Ketan (kumare3)

02/09/2023, 6:27 PM
@HIMANSHU JOSHI this is what we were discussing. It seems you have a self-signed cert. The cert does not follow the RFC, and the Python gRPC client does not allow this.
@Eduardo Apolinario (eapolinario) did you have other suggestions