# ask-the-community
k
Hi everyone. I am trying to deploy Flyte to EKS. I am exposing my ingress through Cloudflare via externaldns. When I try to connect via pyflyte, I get this error:
pyflyte run --remote basics/hello_world.py hello_world_wf                                                                                             
Running Execution on Remote.
Failed with Exception Code: SYSTEM:Unknown
RPC Failed, with Status: StatusCode.PERMISSION_DENIED
	details: Received http2 header with status: 403
	Debug string UNKNOWN:Error received from peer  {grpc_message:"Received http2 header with status: 403", grpc_status:7, created_time:"2024-01-29T16:44:06.737655+04:00"}
What could be the reason for this? Possibly Cloudflare SSL termination? I am using a self-signed certificate for testing purposes.
I have just checked that the connection through port-forwarding works without issues.
OK, this is definitely a Cloudflare-level issue, since I can access Flyte directly through the ELB hostname by supplying the correct HTTP Host header.
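The direct-to-ELB check described above can be reproduced with a small script. This is a minimal sketch, not something taken from the thread: both hostnames are placeholders, the plain-HTTP scheme is an assumption, and `/console` is used only because the ingress config in this thread redirects there.

```python
import urllib.request


def build_direct_probe(elb_hostname: str, public_hostname: str,
                       path: str = "/console", scheme: str = "http") -> urllib.request.Request:
    """Build a request aimed at the ELB itself while presenting the public Host header.

    This bypasses any proxy (such as Cloudflare) that sits in front of the ELB.
    """
    url = f"{scheme}://{elb_hostname}{path}"
    return urllib.request.Request(url, headers={"Host": public_hostname})


if __name__ == "__main__":
    # Hypothetical hostnames; substitute your own ELB hostname and public domain.
    req = build_direct_probe("my-elb.example.amazonaws.com", "flyte.example.com")
    print(req.full_url, "Host:", req.get_header("Host"))
    # To actually send it: urllib.request.urlopen(req, timeout=5)
```

If this request succeeds while the same path through the public domain returns 403, the proxy layer in front of the ELB is the likely culprit.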
d
Hi @Kirill Dubovikov and welcome to the Flyte community! In the past, some Flyte users have also received 403s from Cloudflare and have reported that enabling gRPC networking in the Cloudflare account has fixed the problem (thread) Does that help?
k
Hi @David Espejo (he/him). Thanks for the answer. This was indeed a Cloudflare issue. SSL proxying was enabled on the Cloudflare side, and that caused a chain of redirects because SSL was enabled on Flyte’s ELB as well.
For now, I have decided to deploy the dashboard using an internal ELB behind a VPN; it works better this way.
Although now there is another issue. I am able to create a project in my cluster like this:
flytectl create project --id "poc" --name "poc"
However, I can’t run a workflow:
pyflyte -v run --remote -p poc -d development ./workflows/poc.py wf
Running Execution on Remote.
E0130 17:02:15.506599000 7953469440 ssl_transport_security.cc:1511]    Handshake failed with fatal error SSL_ERROR_SSL: error:1000012e:SSL routines:OPENSSL_internal:KEY_USAGE_BIT_INCORRECT.
E0130 17:02:15.586829000 7953469440 ssl_transport_security.cc:1511]    Handshake failed with fatal error SSL_ERROR_SSL: error:1000012e:SSL routines:OPENSSL_internal:KEY_USAGE_BIT_INCORRECT.
E0130 17:02:15.667737000 7953469440 ssl_transport_security.cc:1511]    Handshake failed with fatal error SSL_ERROR_SSL: error:1000012e:SSL routines:OPENSSL_internal:KEY_USAGE_BIT_INCORRECT.
My ELB config is the following:
ingressClassName: alb
  commonAnnotations:
    alb.ingress.kubernetes.io/certificate-arn: 'XXXX'
    alb.ingress.kubernetes.io/group.name: flyte
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}, {"HTTPS":443}]'
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/ssl-redirect: '443'
    alb.ingress.kubernetes.io/target-type: ip
  httpAnnotations:
    alb.ingress.kubernetes.io/actions.app-root: '{"Type": "redirect", "RedirectConfig": {"Path": "/console", "StatusCode": "HTTP_302"}}'
  grpcAnnotations:
    alb.ingress.kubernetes.io/backend-protocol-version: GRPC
  # host: none #replace with your fully-qualified domain name
openssl s_client -connect
is working successfully for me. Verification fails, but that’s expected since I am using self-signed certificate
I am also trying an insecure (non-TLS) internal ELB deployment with this config. I can access the console, but gRPC API calls from the Flyte client fail:
ingressClassName: alb
  commonAnnotations:
    alb.ingress.kubernetes.io/group.name: flyte
    alb.ingress.kubernetes.io/scheme: internal
    alb.ingress.kubernetes.io/target-type: ip
  httpAnnotations:
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 8080}]'
    alb.ingress.kubernetes.io/actions.app-root: '{"Type": "redirect", "RedirectConfig": {"Path": "/console", "StatusCode": "HTTP_302"}}'
  grpcAnnotations:
    alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 8089}]'
    alb.ingress.kubernetes.io/backend-protocol-version: GRPC
d
Can you share the content of your
$HOME/.flyte/config.yaml
file? when using a self-signed cert you should set
insecure: true
k
Yes, @David Espejo (he/him), that’s what I am using.
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: dns:///internal-k8s-flyte-XXX-XXX.ap-XXX-1.elb.amazonaws.com
  authType: Pkce
  insecure: true
  insecureSkipVerify: true
logger:
  show-source: true
  level: 6
d
set
insecureSkipVerify: false
, it doesn't do much when insecure is true
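The relationship between the two flags can be written down as a tiny decision function. This is a simplified model of the intent described in this message, not flytekit's or flytectl's actual code; the function name and the mode labels are made up for illustration, and real client behavior may differ.

```python
def channel_mode(insecure: bool, insecure_skip_verify: bool) -> str:
    """Return the kind of connection the two config flags are meant to select.

    Assumption: `insecure: true` means no TLS at all, so certificate
    verification settings have nothing to act on in that case.
    """
    if insecure:
        return "plaintext"        # no TLS, insecureSkipVerify is irrelevant
    if insecure_skip_verify:
        return "tls-unverified"   # TLS, but the certificate is not checked
    return "tls-verified"         # TLS with normal certificate checks


if __name__ == "__main__":
    for flags in [(True, True), (True, False), (False, True), (False, False)]:
        print(flags, "->", channel_mode(*flags))
```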
k
A parallel question: I have re-enabled SSL, but since I am using an internal ELB it’s more or less impossible for me to create a valid certificate for it. Despite insecureSkipVerify: true, I am getting this error when running
pyflyte run
RPC Failed, with Status: StatusCode.UNAVAILABLE
	details: failed to connect to all addresses; last error: UNKNOWN: ipv4:11.11.11.11:443: Peer name internal-k8s-flyte-268cef442b-2123384988.ap-south-1.elb.amazonaws.com is not in peer certificate
I’m stuck in a loop:
• To deploy Flyte on an internal AWS ELB, you need an HTTPS endpoint
• To have an HTTPS listener, you need a certificate
• I can’t create a valid, verified certificate since it’s an internal ELB without a fixed domain name
• flytectl works with invalid certificates without issue
• pyflyte reports certificate validation errors despite verification being disabled
I guess I can try setting some env variable to disable verification on GRPC client side
d
internal ELB without fixed domain name
I'm no AWS expert, but do you mean the hostname of the ELB instance eventually changes even if it's not redeployed? You don't need an A record in a managed zone to work with Flyte; you can use the ELB hostname and probably trick your OS with a local DNS entry pointing to your ELB hostname.
k
you can use the ELB hostname and probably trick your OS with a local DNS entry pointing to your ELB hostname
Yes, but that would make setup tricky for other team members
d
oh, agree
so, let me recap, who's terminating the SSL connection?
k
so, let me recap, who’s terminating the SSL connection?
Currently no one, since I have opted for an internal VPC deployment instead of exposing Flyte via public DNS.
The setup is like this
[Client] -> [VPN] -> {Internal AWS ELB} -> [VPC] -> [K8S] -> {Flyte binary}
When trying to use a self-signed cert I get:
details: failed to connect to all addresses; last error: UNKNOWN: ipv4:XX.XX.XX.XX:443: Peer name internal-k8s-flyte-XXXX-XXXX.ap-south-1.elb.amazonaws.com is not in peer certificate
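The "Peer name ... is not in peer certificate" message means the hostname the client dialed did not match any DNS name in the certificate. The sketch below is a simplified illustration of that check, assuming only exact names and a leftmost-label wildcard; real TLS libraries apply additional rules, and the function names and example domains are hypothetical.

```python
def dns_name_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate DNS name against the hostname being dialed.

    A '*' is only honored as the entire leftmost label, so it never spans dots.
    """
    p_labels = pattern.lower().rstrip(".").split(".")
    h_labels = hostname.lower().rstrip(".").split(".")
    if len(p_labels) != len(h_labels) or not h_labels[0]:
        return False
    if p_labels[0] == "*":
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def peer_name_in_cert(hostname: str, cert_dns_names: list) -> bool:
    """True if any DNS name from the certificate covers the hostname."""
    return any(dns_name_matches(name, hostname) for name in cert_dns_names)


if __name__ == "__main__":
    elb = "internal-k8s-flyte-XXXX-XXXX.ap-south-1.elb.amazonaws.com"
    # A certificate issued for some other (hypothetical) name does not cover the ELB hostname.
    print(peer_name_in_cert(elb, ["flyte.example.internal"]))
    # Dialing the name that is in the certificate does match.
    print(peer_name_in_cert("flyte.example.internal", ["flyte.example.internal"]))
```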
d
ok and I guess the self-signed cert uses the ELB host name as the CN?
k
ok and I guess the self-signed cert uses the ELB host name as the CN?
I cannot do this, since the hostname is longer than the 64 characters allowed for a CN
However, I did try making it a wildcard. No effect, unfortunately
Looks like there is a way to override this, but it requires patching the client code: https://github.com/grpc/grpc/issues/22119
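For reference, the override discussed in that gRPC issue is a channel argument. Below is a minimal sketch of how it is passed with the standalone `grpcio` Python package, assuming you construct the channel yourself; pyflyte builds its own channel, which is why this route would mean patching client code. The helper names are hypothetical, and `grpc.ssl_target_name_override` is generally described as a testing aid.

```python
def name_override_options(cert_name: str) -> list:
    """Channel options telling gRPC to validate the certificate against
    `cert_name` instead of the hostname that was actually dialed."""
    return [("grpc.ssl_target_name_override", cert_name)]


def open_channel(target: str, root_cert_pem: bytes, cert_name: str):
    """Open a TLS channel that trusts `root_cert_pem` and checks `cert_name`."""
    import grpc  # third-party dependency: pip install grpcio

    creds = grpc.ssl_channel_credentials(root_certificates=root_cert_pem)
    return grpc.secure_channel(target, creds, options=name_override_options(cert_name))


if __name__ == "__main__":
    # Hypothetical certificate name; no connection is made here.
    print(name_override_options("flyte.example.internal"))
```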
d
let's try this
insecure: false
insecureSkipVerify: true
k
Yes, I have changed the config appropriately before testing:
admin:
  # For GRPC endpoints you might want to use dns:///flyte.myexample.com
  endpoint: dns:///internal-k8s-flyte-XXXX-XXXX.ap-south-1.elb.amazonaws.com
  authType: Pkce
  insecure: false
  insecureSkipVerify: true
logger:
  show-source: true
  level: 6
I have resolved my issue by creating an internal Route 53 zone that provides a domain name matching the certificate CN.
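The thread does not say how the certificate was generated. As one possible approach, the sketch below assembles an `openssl req` command for a self-signed certificate whose CN and subjectAltName both carry the internal domain, and rejects names that exceed the 64-character CN limit mentioned earlier. The domain `flyte.internal.example`, the file names, and the helper itself are hypothetical; `-addext` assumes OpenSSL 1.1.1 or newer.

```python
import shlex

CN_MAX_LENGTH = 64  # upper bound for commonName in X.509 certificates


def self_signed_cert_command(domain: str, key_path: str = "flyte.key",
                             cert_path: str = "flyte.crt", days: int = 365) -> list:
    """Return the argv for an `openssl req` call that creates a self-signed
    certificate with `domain` as both the CN and a DNS subjectAltName."""
    if len(domain) > CN_MAX_LENGTH:
        raise ValueError(f"{domain!r} is {len(domain)} characters; CN allows at most {CN_MAX_LENGTH}")
    return [
        "openssl", "req", "-x509", "-newkey", "rsa:2048", "-nodes",
        "-keyout", key_path, "-out", cert_path, "-days", str(days),
        "-subj", f"/CN={domain}",
        "-addext", f"subjectAltName=DNS:{domain}",
    ]


if __name__ == "__main__":
    # Prints the command only; run it yourself once the values look right.
    print(shlex.join(self_signed_cert_command("flyte.internal.example")))
```

A short internal name stays well under the limit, whereas the generated ELB hostname shown in the error earlier in the thread is 69 characters long and would be rejected by this check.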