    Sören Brunk

    7 months ago
    I'm trying to set up a stock Flyte installation by following the GCP documentation. I've managed to set up a private cluster, permissions, bucket, db, dns, cert and to install Flyte via helm with a few small changes in values-gcp.yaml. All components are running and I can already access flyteconsole via GKE ingress. No auth enabled yet. But accessing flyteadmin from the outside fails with a 502 and a failed_to_pick_backend message in the logs. Accessing flyteadmin via port forwarding works fine though. Has anyone seen something like this? @jeev @Justin Tyberg could this be related to the health check/firewall config you were talking about? What's weird is that serving flyteconsole works but access to flyteadmin fails.
    Justin Tyberg

    7 months ago
    "But accessing flyteadmin from the outside": do you mean using flytectl?
    Sören Brunk

    7 months ago
    I didn't even try flytectl because flyteconsole also talks to flyteadmin and that fails. So only the frontend serving part of flyteconsole works.
    Justin Tyberg

    7 months ago
    please forgive me, as i’m still new to Flyte. but what url are you trying when you get the 502? i assume you have a GCP external load balancer serving the requests?
    you MUST forgive my chicken scratch of a drawing, but here are two diagrams:
    top: default installation using contour as ingress controller
    bottom: contour replaced with the GCP external HTTPS load balancer as ingress controller
    the GCP load balancer will throw 502 if it can’t get to the backend service. for example, if the health check to the service fails
    Sören Brunk

    7 months ago
    Yes I'm using the GCP external load balancer.
    So any url routed to the flyteadmin svc in the ingress seems to cause the 502. https://flyte.domain/healthcheck and https://flyte.domain/api/v1/projects
    While https://flyte.domain/console (routed to flyteconsole) is working
    With this error though, because the frontend tries to talk to flyteadmin...
    There's no auth enabled at all right now, so it's like the bottom part of your drawing but without IAP
    Justin Tyberg

    7 months ago
    from my browser, i can hit /healthcheck and get OK 200, but it’s an empty response?
    what do the healthchecks for the LOAD BALANCER show?
    because GCP creates GOOGLE health checks for each backend service on GKE, and it assumes / as the health check path
    so if the load balancer thinks the backend service is unhealthy, it probably sends back a 502 to the client
    i had to create a custom backend-config for flyteadmin:
    apiVersion: cloud.google.com/v1
    kind: BackendConfig
    metadata:
      name: bec-flyteadmin
      namespace: flyte
    spec:
      healthCheck:
        type: HTTP
        requestPath: /healthcheck
      iap:
        enabled: true
        oauthclientCredentials:
          secretName: oauth-secret
    so the HTTP load balancer needs to think your services are healthy
    my load balancer thinks flyteadmin:80 is healthy, but the grpc service flyteadmin:81 is not
    my flyte values calls out different backend configs for each service:port
    flyteadmin:
      deployRedoc: false
      replicaCount: 1
      serviceAccount:
        # -- If the service account is created by you, make this false, else a new service account will be created and the flyteadmin role will be added
        # you can change the name of this role
        create: false
      service:
        annotations:
          # Required for the ingress to properly route grpc traffic to grpc port
          cloud.google.com/app-protocols: '{"grpc":"HTTP2"}'
          beta.cloud.google.com/backend-config: '{"ports": {
            "80":"bec-flyteadmin",
            "81":"bec-flyteadmin-grpc",
            "87":"bec-default"
          }}'
          cloud.google.com/neg: '{"ingress": true}'
        type: ClusterIP
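For orientation, the per-port keys in the backend-config annotation above have to line up with ports on the rendered flyteadmin Service. The following is a rough sketch of what that Service might look like. The port names and targetPort numbers are assumptions (they are not stated in this thread) and should be checked against the rendered chart before relying on them.

```yaml
# Sketch only: port names and targetPort values are assumed, not taken from this thread.
apiVersion: v1
kind: Service
metadata:
  name: flyteadmin
  namespace: flyte
  annotations:
    # Keyed by port NAME: asks the load balancer to speak HTTP/2 to the "grpc" port.
    cloud.google.com/app-protocols: '{"grpc":"HTTP2"}'
    # Keyed by Service port: one BackendConfig (and so one health check) per port.
    beta.cloud.google.com/backend-config: '{"ports": {"80":"bec-flyteadmin", "81":"bec-flyteadmin-grpc", "87":"bec-default"}}'
    cloud.google.com/neg: '{"ingress": true}'
spec:
  type: ClusterIP
  # selector omitted; the chart supplies it
  ports:
    - name: http
      port: 80
      targetPort: 8088   # assumed
    - name: grpc
      port: 81
      targetPort: 8089   # assumed
    - name: redoc
      port: 87
      targetPort: 8087   # assumed
```

Each Service port that the ingress routes to becomes its own backend service on the load balancer, which is why one port can show as healthy while another does not.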
    from my browser, i can hit /api/v1/projects:
    {
      "projects": [
        {
          "id": "flytedefault",
          "name": "flytedefault",
          "domains": [
            {
              "id": "dev",
              "name": "dev"
            }
          ],
          "description": "flytedefault description"
        }
      ]
    }
    Sören Brunk

    7 months ago
    Oh yes that could be it. Let me try to configure the health checks.
    Justin Tyberg

    7 months ago
    GKE will try to use readiness probes to create health checks, but there are none in the helm chart
    also, it could get tricky with multiple services/containers in one Service
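As an illustration of the readiness-probe route mentioned above, a probe on the flyteadmin container could look roughly like the fragment below. This is a hypothetical sketch, not part of the Flyte helm chart: the container name, the port number 8088, and the timing values are assumptions. GKE is also understood to read the probe only when the load balancer backend is first created, so an explicit BackendConfig may still be the more predictable option.

```yaml
# Hypothetical fragment of a Deployment pod spec; names, port and timings are assumed.
containers:
  - name: flyteadmin
    ports:
      - name: http
        containerPort: 8088      # assumed HTTP port of flyteadmin
    readinessProbe:
      httpGet:
        path: /healthcheck       # path that returns 200 in this thread
        port: 8088               # must be the port the Service routes to
      initialDelaySeconds: 5
      periodSeconds: 10
```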
    Sören Brunk

    7 months ago
    Where do I get the overview of which services the load balancer thinks are healthy? The one you pasted as an image above
    Lost in GCP cloud console...
    Justin Tyberg

    7 months ago
    ingresses tab
    select the flyte-core ingress. scroll to bottom
    once you find it, as a quick test, you can manually update the path of the health check
    Sören Brunk

    7 months ago
    Yes that was it! /healthcheck is giving me a 200 now after changing the path.
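A health check path edited by hand in the console may be reset when GKE next reconciles the ingress, so the change is probably better kept in a BackendConfig. Below is a minimal sketch for a setup without IAP, like the one described here. The name bec-flyteadmin-noauth is hypothetical, and the optional timing fields are left at their defaults.

```yaml
# Sketch of a BackendConfig that only overrides the health check (no IAP block).
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: bec-flyteadmin-noauth   # hypothetical name
  namespace: flyte
spec:
  healthCheck:
    type: HTTP
    requestPath: /healthcheck
```

It would then be referenced from the flyteadmin Service for the HTTP port, in the same way as in the values shown earlier in the thread, for example with the annotation value '{"ports": {"80":"bec-flyteadmin-noauth"}}'.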
    Justin Tyberg

    7 months ago
    /api/v1/projects should work too - from the browser
    Sören Brunk

    7 months ago
    Thanks a lot @Justin Tyberg! That was super helpful.
    Justin Tyberg

    7 months ago
    i’m still having a problem with the grpc health check. and since flytectl uses flyteadmin:81, no workie for me
    i think we’re stuck for the grpc health checks
    GCP: use only gRPC or TCP for grpc health checks
    • For backend services that use the gRPC protocol, use only gRPC or TCP health checks. Do not use HTTP(S) or HTTP/2 health checks.
    GKE ingress: you can only use HTTP, HTTPS, or HTTP/2 for health checks
    • PROTOCOL: Specify a protocol used by probe systems for health checking. The BackendConfig only supports creating health checks using the HTTP, HTTPS, or HTTP2 protocols.