# flyte-deployment
s
I'm trying to set up a stock flyte installation by following the GCP documentation. I've managed to set up a private cluster, permissions, bucket, db, dns, cert and to install flyte via helm with a few small changes in `values-gcp.yaml`. All components are running and I can already access flyteconsole via GKE ingress. No auth enabled yet. But accessing flyteadmin from the outside fails with a `502` and a `failed_to_pick_backend` message in the logs. Accessing flyteadmin via port forwarding works fine though. Has anyone seen something like this? @jeev @Justin Tyberg could this be related to the health check/firewall config you were talking about? What's weird is that serving flyteconsole works but access to flyteadmin fails.
j
But accessing flyteadmin from the outside
do you mean using `flytectl`?
s
I didn't even try flytectl because flyteconsole also talks to flyteadmin and that fails. So only the frontend serving part of flyteconsole works.
j
please forgive me, as i’m still new to Flyte. but what url are you trying when you get the 502? i assume you have a GCP external load balancer serving the requests?
you MUST forgive my chicken scratch of a drawing, but here are two diagrams:
top: default installation using contour as ingress controller
bottom: replace contour with GCP external HTTPS load balancer as ingress controller
the GCP load balancer will throw 502 if it can’t get to the backend service. for example, if the health check to the service fails
s
Yes I'm using the GCP external load balancer.
So any url routed to the flyteadmin svc in the ingress seems to cause the 502. https://flyte.domain/healthcheck and https://flyte.domain/api/v1/projects
While https://flyte.domain/console (routed to flyteconsole) is working
With this error though, because the frontend tries to talk to flyteadmin...
There's no auth enabled at all right now, so it's like the bottom part of your drawing but without IAP
j
from my browser, i can hit `/healthcheck` and get OK 200, but it’s an empty response?
what do the healthchecks for the LOAD BALANCER show?
because GCP creates GOOGLE health checks for each backend service on GKE, and it assumes `/` as the health check path
so if the load balancer thinks the backend service is unhealthy, it probably sends back a 502 to the client
i had to create a custom backend-config for flyteadmin
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: bec-flyteadmin
  namespace: flyte
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthcheck
  iap:
    enabled: true
    oauthclientCredentials:
      secretName: oauth-secret
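For a setup without auth, like the one described earlier in this thread, the same idea can be sketched without the `iap` section. This is a minimal sketch, not a tested manifest: the name and namespace are taken from the config above, and the interval and timeout values are illustrative assumptions rather than values from the thread.

```yaml
# Sketch: BackendConfig that only overrides the health check (no IAP).
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: bec-flyteadmin      # name reused from the config above
  namespace: flyte
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthcheck   # flyteadmin answers 200 here, unlike "/"
    checkIntervalSec: 15        # assumed value, tune as needed
    timeoutSec: 5               # assumed value, must not exceed checkIntervalSec
```

The BackendConfig only takes effect once the flyteadmin Service references it through its backend-config annotation.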
so the HTTP load balancer needs to think your services are healthy
my load balancer thinks `flyteadmin:80` is healthy, but the grpc service `flyteadmin:81` is not
my flyte values calls out different backend configs for each service:port
flyteadmin:
  deployRedoc: false
  replicaCount: 1
  serviceAccount:
    # -- If the service account is created by you, make this false, else a new service account will be created and the flyteadmin role will be added
    # you can change the name of this role
    create: false
  service:
    annotations:
      # Required for the ingress to properly route grpc traffic to grpc port
      cloud.google.com/app-protocols: '{"grpc":"HTTP2"}'
      beta.cloud.google.com/backend-config: '{"ports": {
        "80":"bec-flyteadmin",
        "81":"bec-flyteadmin-grpc",
        "87":"bec-default"
      }}'
      cloud.google.com/neg: '{"ingress": true}'
    type: ClusterIP
from my browser, i can hit `/api/v1/projects`
{
  "projects": [
    {
      "id": "flytedefault",
      "name": "flytedefault",
      "domains": [
        {
          "id": "dev",
          "name": "dev"
        }
      ],
      "description": "flytedefault description"
    }
  ]
}
s
Oh yes that could be it. Let me try to configure the health checks.
j
GKE will try to use readiness probes to create health checks, but there are none in the helm chart
also, it could get tricky with multiple services/containers in one Service
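For reference, a readiness probe that GKE could use to infer the health check path might look roughly like the fragment below. This is a hypothetical sketch, not part of the stock helm chart: the container name and port `8088` are assumptions and should be checked against the actual flyteadmin deployment.

```yaml
# Hypothetical fragment of the flyteadmin pod spec (not from the helm chart).
containers:
  - name: flyteadmin            # assumed container name
    ports:
      - name: http
        containerPort: 8088     # assumed HTTP port, verify in your deployment
    readinessProbe:
      httpGet:
        path: /healthcheck      # same path used for the load balancer check
        port: 8088              # must match the port the Service targets
      initialDelaySeconds: 5    # illustrative values
      periodSeconds: 10
```

An explicit BackendConfig health check is usually the more predictable option, since it does not depend on how GKE picks up the probe.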
s
Where do I get the overview about which services the loadbalancer thinks are healthy? The one you pasted as image above
Lost in GCP cloud console...
j
ingresses tab
select the flyte-core ingress. scroll to bottom
once you find it, as a quick test, you can manually update the path of the health check
s
Yes that was it! /healthcheck is giving me a 200 now after changing the path.
j
`/api/v1/projects` should work too - from the browser
s
Thanks a lot @Justin Tyberg! That was super helpful.
j
i’m still having a problem with grpc health check. and since `flytectl` uses `flyteadmin:81`, no workie for me
i think we’re stuck for the grpc health checks
GCP: use only gRPC or TCP for grpc health checks
• For backend services that use the gRPC protocol, use only gRPC or TCP health checks. Do not use HTTP(S) or HTTP/2 health checks.
GKE ingress: you can only use HTTP, HTTPS, or HTTP/2 for health checks
• `PROTOCOL`: Specify a protocol used by probe systems for health checking. The `BackendConfig` only supports creating health checks using the HTTP, HTTPS, or HTTP2 protocols.