# ask-the-community
Hi, we're trying to deploy flyte via the flyte-binary helm chart in an EKS cluster. So far, we connected to our RDS database and tried to connect to our Keycloak instance and now we're getting a fatal error:
Flyte native scheduler failed to start due to async future was canceled
. The
are written to secret and used via
. We can't see any relevant logs in Keycloak, so the requests don't seem to arrive. When I disable auth, everything seems to work well. I've also tested the client credentials for the client, and they work fine (I can see the manual requests in the Keycloak logs). Here is the relevant section in the values:
```yaml
  enabled: true
  enableAuthServer: true
    baseUrl: keycloak-svc.keycloak/auth/realms/test
    clientId: flyte
    - https://flyte.example.com
    clientSecretHash: {{hash of client secret in client_secrets}}
  clientSecretsExternalSecretRef: client-secrets
```
I found this thread in the flyte forum, but the suggestions aren't helping so far (I haven't found the Traefik settings for proxy buffer size).
@Luis Dunkum There's a need to update the auth docs for Keycloak. But in order to confirm, I wanted to see what the behaviour is in your env when:
• You use
as the structure for the baseUrl
• Also, what are you using for Ingress? What do you plan to use for SSL?
• I added
to the baseUrl and the message in question disappears, but I can't see any flyte logs in Keycloak and the pod is still failing, so there may be some other problem as well.
• We are using Traefik for ingress and SSL. Once I disable auth, I can navigate to the dashboard just fine and also connect via CLI. We checked flyte-the-hard-way, but sadly there are lots of differences in the stack used, so some things aren't that relevant to us.
Ah sorry, I think I was mistaken, the error message disappeared but apparently only because Keycloak can't be reached now, see here:
```
Error creating auth context [AUTH_CONTEXT_SETUP_FAILED] Error creating oidc provider w/ issuer [https://keycloak-svc.keycloak/auth/realms/test], caused by: Get \"https://keycloak-svc.keycloak/auth/realms/test/.well-known/openid-configuration\": context deadline exceeded
```
@Luis Dunkum apparently in recent versions of Keycloak, the baseURL doesn't use
anymore. That's what I mean with using this structure:
It comes from this thread https://flyte-org.slack.com/archives/CP2HDHKE1/p1693218949341099
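For anyone hitting the same timeout, here is a rough sketch for checking which baseUrl structure a Keycloak instance actually answers on. It relies on two things that are not stated in the thread: the OIDC discovery document lives at `<issuer>/.well-known/openid-configuration`, and newer Keycloak releases (the Quarkus-based distribution, 17 and later) serve realms under `/realms/<name>` by default rather than `/auth/realms/<name>`. The function names are made up for illustration, and the host and realm are taken from the error message above.

```python
import json
import urllib.request

WELL_KNOWN = "/.well-known/openid-configuration"


def discovery_url(base_url: str) -> str:
    """Return the OIDC discovery URL for a given issuer/base URL."""
    return base_url.rstrip("/") + WELL_KNOWN


def candidate_base_urls(scheme_host: str, realm: str) -> list[str]:
    """Return the legacy (/auth/realms) and newer (/realms) URL structures."""
    host = scheme_host.rstrip("/")
    return [f"{host}/auth/realms/{realm}", f"{host}/realms/{realm}"]


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Fetch the discovery document and report the issuer, or the failure."""
    url = discovery_url(base_url)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            issuer = json.load(resp).get("issuer")
            return f"{url}: HTTP {resp.status}, issuer={issuer}"
    except (OSError, ValueError) as exc:
        # OSError covers URLError, HTTPError, TLS errors and timeouts;
        # ValueError covers a malformed URL or a non-JSON response body.
        return f"{url}: failed ({exc})"
```

Running `for base in candidate_base_urls("https://keycloak-svc.keycloak", "test"): print(probe(base))` from a pod in the same cluster should show whether either structure responds at all. If both time out, the problem is more likely the scheme, port, or network path than the URL structure.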
Aah, sorry, my mistake, I'll take a look!
Hmm, it seems as if there is still a problem with the connection. There are some more log entries, but I still can't see any requests for the flyte client in Keycloak.
```
flyte {"json":{"src":"client.go:63"},"level":"info","msg":"Initialized Admin client","ts":"2023-11-15T10:54:55Z"}
flyte {"json":{"src":"start.go:54"},"level":"info","msg":"Successfully initialized a native flyte scheduler","ts":"2023-11-15T10:54:55Z"}
flyte {"json":{"src":"schedule_executor.go:38"},"level":"info","msg":"Flyte native scheduler started successfully","ts":"2023-11-15T10:54:55Z"}
flyte {"json":{"src":"schedule_executor.go:58"},"level":"info","msg":"Number of schedules retrieved 0","ts":"2023-11-15T10:54:55Z"}
flyte {"json":{"src":"schedule_executor.go:88"},"level":"error","msg":"failed to get future value for catchup due to async future was canceled","ts":"2023-11-15T10:54:55Z"}
flyte {"json":{"src":"schedule_executor.go:89"},"level":"info","msg":"Flyte native scheduler shutdown","ts":"2023-11-15T10:54:55Z"}
flyte {"json":{"src":"start.go:58"},"level":"fatal","msg":"Flyte native scheduler failed to start due to async future was canceled","ts":"2023-11-15T10:54:55Z"}
```
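When the pod log gets longer, it can help to filter it down to the error and fatal entries. The following is a small sketch that assumes every line has the shape shown above (a container name, a space, then one JSON object); the helper names are hypothetical and not part of Flyte.

```python
import json


def parse_line(line: str):
    """Split '<container> <json>' into (container, record); None if unparseable."""
    prefix, sep, payload = line.strip().partition(" ")
    if not sep:
        return None
    try:
        record = json.loads(payload)
    except json.JSONDecodeError:
        return None
    # Only JSON objects are treated as log records.
    return (prefix, record) if isinstance(record, dict) else None


def problems(lines, levels=("error", "fatal")):
    """Return (ts, level, src, msg) tuples for entries at the given levels."""
    found = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        _, record = parsed
        if record.get("level") in levels:
            src = record.get("json", {}).get("src")
            found.append((record.get("ts"), record.get("level"), src, record.get("msg")))
    return found
```

Feeding it the lines above would leave only the `schedule_executor.go:88` error and the `start.go:58` fatal entry, which is the pair that matters here.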
anything interesting in the Traefik logs?
I doubt it, since this is all still cluster internal traffic, as
points to the Keycloak service inside the cluster, but I'll check