Dear all, I ran into an inconsistency between fly...
# ask-the-community
l
Dear all, I ran into an inconsistency between flytectl and FlyteRemote which has been reported before. When trying to create a FlyteRemote with this config
admin:
  endpoint: dns:///A.B.C.D:PPPPP
  insecure: false
  authType: Pkce
  insecureSkipVerify: true
flytectl works just fine. But when I try to fetch an execution on a FlyteRemote with this code
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote

project = "myproject"
domain = "development"
execution = "a5cphsfgc57nt6nxbknt"
flyte_config_file = "flyte.config.yaml"
remote = FlyteRemote(config=Config.auto(config_file=flyte_config_file))
flyte_workflow_execution = remote.fetch_execution(project=project, domain=domain, name=execution)
I get the following error
Traceback (most recent call last):
  File "debug_remote.py", line 12, in <module>
    flyte_workflow_execution = remote.fetch_execution(project=project, domain=domain, name=execution)
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/flytekit/remote/remote.py", line 353, in fetch_execution
    self.client.get_execution(
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/flytekit/clients/friendly.py", line 582, in get_execution
    super(SynchronousFlyteClient, self).get_execution(
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/flytekit/clients/raw.py", line 43, in handler
    return fn(*args, **kwargs)
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/flytekit/clients/raw.py", line 651, in get_execution
    return self._stub.GetExecution(get_object_request, metadata=self._metadata)
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_channel.py", line 1030, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_channel.py", line 910, in _end_unary_response_blocking
    raise _InactiveRpcError(state)  # pytype: disable=not-instantiable
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: ipv4:A.B.C.D:PPPPP: Peer name A.B.C.D is not in peer certificate"
        debug_error_string = "UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: ipv4:A.B.C.D:PPPPP: Peer name A.B.C.D is not in peer certificate {created_time:"2023-05-18T14:40:11.018710903+01:00", grpc_status:14}"
>
The cluster is running the flyte-binary Helm chart in version 1.3.0 and I tried flytekit 1.2.11, 1.3.0, and 1.6.1, all resulting in the same error message.
d
Is `A.B.C.D` added as a Subject Alt Name in your cert?
l
Thanks for the hint, David. I haven't edited the cert. I thought that was not needed, since I skip certificate verification with `insecureSkipVerify: true`. But I will try adding the SAN anyway and report back whether it works.
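For anyone debugging the same "Peer name ... is not in peer certificate" error: below is a minimal sketch, using only the Python standard library, of how one might list the Subject Alt Names a server presents and check whether the name in the endpoint is among them. The helper names `fetch_peer_cert` and `name_in_san` and the CA path are hypothetical. The sketch assumes the CA file can verify the server's chain, and it only does exact string matching (no wildcard handling, no IPv6 normalization).

```python
import ipaddress
import socket
import ssl


def fetch_peer_cert(host: str, port: int, ca_file: str) -> dict:
    """Return the server certificate as the dict produced by getpeercert().

    The chain is still verified against ca_file; only the hostname check is
    disabled, so a SAN mismatch does not abort the handshake.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    ctx.check_hostname = False
    with socket.create_connection((host, port), timeout=10) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()


def name_in_san(cert: dict, name: str) -> bool:
    """Exact-match check of name against the matching kind of SAN entry.

    IP literals are compared with "IP Address" entries, everything else
    with "DNS" entries.
    """
    try:
        ipaddress.ip_address(name)
        wanted_kind = "IP Address"
    except ValueError:
        wanted_kind = "DNS"
    return any(
        kind == wanted_kind and value.lower() == name.lower()
        for kind, value in cert.get("subjectAltName", ())
    )


# Usage (hypothetical CA path; needs network access to the endpoint):
# cert = fetch_peer_cert("my.domain.lan", 30657, "/path/to/rootCA.pem")
# print(cert.get("subjectAltName"))
# print(name_in_san(cert, "my.domain.lan"))
```

An endpoint given as a bare IP address only matches if the certificate carries that address as an IP-type SAN entry, which is why a DNS entry alone would not be enough for the first config above.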
So, I configured TLS for my Flyte endpoint, which works fine in the browser and with flytectl. However, `flytekit.FlyteRemote` (the debug script from my first post) still fails, now with this error:
{"asctime": "2023-05-23 09:38:41,298", "name": "flytekit", "levelname": "WARNING", "message": "FlyteSchema is deprecated, use Structured Dataset instead."}
WARNING:root:KeyRing not available, tokens will not be cached. Error: No recommended backend was available. Install a recommended 3rd party backend package; or, install the keyrings.alt package if you want to use the non-recommended backends. See https://pypi.org/project/keyring for details.
E0523 09:38:42.232413831    4238 ssl_transport_security.cc:1495] Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.
Traceback (most recent call last):
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_interceptor.py", line 241, in continuation
    response, call = self._thunk(new_method).with_call(
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_interceptor.py", line 266, in with_call
    return self._with_call(request,
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_interceptor.py", line 257, in _with_call
    return call.result(), call
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_channel.py", line 343, in result
    raise self
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_interceptor.py", line 241, in continuation
    response, call = self._thunk(new_method).with_call(
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_channel.py", line 957, in with_call
    return _end_unary_response_blocking(state, call, True, None)
  File "/opt/micromamba/envs/OHLI/lib/python3.8/site-packages/grpc/_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses; last error: UNKNOWN: Ssl handshake failed"
        debug_error_string = "UNKNOWN:Failed to pick subchannel {created_time:"2023-05-23T09:38:42.234530152+01:00", children:[UNKNOWN:failed to connect to all addresses; last error: UNKNOWN: Ssl handshake failed {created_time:"2023-05-23T09:38:42.234525804+01:00", grpc_status:14}]}"
>
I installed the CA inside the same container in which I execute the Flyte script that throws this error. And, as mentioned, flytectl works without problems from within the same container. The Flyte config now looks like this:
admin:
  endpoint: dns:///my.domain.lan:30657
  insecure: false
  authType: Pkce
Do you have any ideas? Would be greatly appreciated. 🙂
I fixed the problem by providing my root CA file as admin.caCertFilePath in the flyte config. However, there is also a bug in flytekit.clients.auth_helper (https://github.com/flyteorg/flytekit/blob/master/flytekit/clients/auth_helper.py#L179). Currently, it is
credentials = grpc.ssl_channel_credentials(load_cert(cfg.ca_cert_file_path))
However, `load_cert` returns an `OpenSSL.crypto.X509` object, but `grpc.ssl_channel_credentials` expects a bytes string. So, I had to modify the call as follows (to encode the X509 object as bytes):
credentials = grpc.ssl_channel_credentials(crypto.dump_certificate(crypto.FILETYPE_PEM, load_cert(cfg.ca_cert_file_path)))
Can we fix this in main?
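A minimal sketch of the type mismatch described above, assuming the CA file is PEM-encoded: `grpc.ssl_channel_credentials` takes the root certificates as PEM bytes, so reading the file in binary mode already yields a value it accepts, without going through an X509 object. The helper names `read_pem_bytes` and `make_channel_credentials` are hypothetical, and this is not the flytekit implementation.

```python
from pathlib import Path

PEM_MARKER = b"-----BEGIN CERTIFICATE-----"


def read_pem_bytes(ca_cert_file_path: str) -> bytes:
    """Read a PEM CA file as raw bytes, the form gRPC expects for root certs."""
    data = Path(ca_cert_file_path).read_bytes()
    if PEM_MARKER not in data:
        raise ValueError(f"{ca_cert_file_path} does not look like a PEM certificate")
    return data


def make_channel_credentials(ca_cert_file_path: str):
    """Build gRPC channel credentials that trust the given CA file."""
    # Imported lazily so read_pem_bytes stays usable without grpcio installed.
    import grpc

    return grpc.ssl_channel_credentials(
        root_certificates=read_pem_bytes(ca_cert_file_path)
    )


# Usage (hypothetical path):
# credentials = make_channel_credentials("/path/to/rootCA.pem")
```

Reading the bytes directly also keeps a multi-certificate bundle intact, whereas dumping a single parsed X509 object would emit only that one certificate.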
d
Thanks @Lukas Bommes for the issue, the additional info and persevering through this. An upcoming SIG auth should allow us to define goals around improving the auth experience in Flyte
l
That's good news. As of now, I just monkeypatched the `load_cert` function. It works for me, but it would be nice to have this permanently fixed.
e
"I fixed the problem by providing my root CA file as admin.caCertFilePath in the flyte config."
hey @Lukas Bommes where did you get the root CA file from? I’ve tried with the output of
kubectl get cm kube-root-ca.crt -n flyte -o jsonpath="{['data']['ca\.crt']}"
but I still get
Peer name <HOST> is not in peer certificate
Also
"I configured TLS for my Flyte endpoint which works all fine in the browser and with flytectl"
What configuration did you use? Something like:
tls:
  - hosts:
    - www.example.com
    secretName: example-tls
I’ve:
• enabled `common.ingress.tls` in values-gcp.yml
• fetched the related cert with `kubectl get secret flyte-flyte-tls -n flyte -o jsonpath="{['data']['tls\.crt']}" | base64 --decode > tls.crt`
• added the path of the cert in `admin.caCertFilePath` in my local flyte config file
Still, running `pyflyte run --remote` leads to a new error like the one you had:
Handshake failed with fatal error SSL_ERROR_SSL: error:1000007d:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED.
d
@Enrico Rotundo what controller are you using for Ingress? Also, I don't think that K8s secret is the cert you need here. It should be a cert that (a) includes in its Principal or SAN names the `host` name that points to your Ingress, and (b) is signed by a trusted authority. In that case, you should be able to get the certificate chain (root + intermediates + issued cert) and instruct `flyteadmin` to use it by specifying it in the `config.yaml`. Otherwise, it'd be better for now to use `insecure: true` and disable `insecureSkipVerify`.
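To check whether a file such as `tls.crt` holds a full chain or only the issued certificate, one option is to count the PEM blocks in it. This is a minimal sketch with the standard library. `split_pem_bundle` is a hypothetical helper that only splits the text; it does not verify signatures or the order of the chain.

```python
import re
from typing import List

# One PEM certificate block, matched non-greedily so adjacent blocks stay separate.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_bundle(pem_text: str) -> List[str]:
    """Return each certificate block found in a PEM bundle, in file order."""
    return PEM_BLOCK.findall(pem_text)


# Usage (file name taken from the kubectl command earlier in the thread):
# with open("tls.crt") as f:
#     certs = split_pem_bundle(f.read())
# print(len(certs))  # a single block suggests the file lacks the CA chain
```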
e
I’ve just realized this https://github.com/flyteorg/flyte/issues/3730#issuecomment-1586190847 so I’ll try again with the latest instructions 🤞