#ask-the-community

Quentin Chenevier

09/27/2023, 8:37 PM
Hi all, I'm very happy with the Flyte community, which seems to be very active. 🙂 I'm trying to deploy flyte-binary on EKS using the Terraform template provided here: https://github.com/unionai-oss/deploy-flyte/tree/main/environments/aws by @David Espejo (and some fixes from @Uria Franko). The flyte-binary pod seems to start and I don't see any critical error in the logs, but the pod keeps getting restarted (`CrashLoopBackOff`) because the liveness & readiness probes fail to connect to it. E.g.:
`Liveness probe failed: Get "http://10.3.125.74:8088/healthcheck": dial tcp 10.3.125.74:8088: connect: connection refused`
Has anyone else seen the same behavior? Or could I have misconfigured either the Terraform template or the Helm chart?

Yee

09/27/2023, 8:55 PM
get the logs `-p`?

Quentin Chenevier

09/27/2023, 8:59 PM
Here are the logs I get with `kubectl logs flyte-binary-b958c88b6-mh2nh -n flyte -c flyte`
What I'm investigating right now: I have doubts about the values I've set for OIDC in the Helm configuration: https://github.com/flyteorg/flyte/blob/master/charts/flyte-binary/eks-production.yaml#L23-L29

Yee

09/27/2023, 9:04 PM
`-p`?
well `get pod` first… want to see if there are restarts.
and if there are restarts, then `-p` to see the previous logs

Quentin Chenevier

09/27/2023, 9:06 PM
haaaaaa I understand now why the previous logs are interesting to watch. Smart.
logs
(I forgot to say that there are indeed restarts: since the liveness & readiness probes fail to connect, the pod is restarted)

David Espejo (he/him)

09/27/2023, 9:18 PM
@Quentin Chenevier is that the `baseUrl` that you're using? bc that won't work in your environment

Quentin Chenevier

09/27/2023, 9:20 PM
Nope. I've put the OpenID Connect provider URL shown on the cluster page in the EKS console.

Yee

09/27/2023, 9:20 PM
we need to be better about errors
that error is a terminating error…
can you make it so that that error goes away
unf. in the single binary we don’t capture the error and restart. then it would be obvious
the thread just dies
😞

Quentin Chenevier

09/27/2023, 9:22 PM
Haaaaa so I've misconfigured the thing: OIDC is not found
It works!
Since I'm very new to kube, I'm not very used to digging into logs. Thanks for helping find the root cause. (and I guess it's time for me to go to sleep now).

Yee

09/27/2023, 9:27 PM
sure let us know…
we definitely recommend doing auth last
as it’s the trickiest bit to get right

Quentin Chenevier

09/27/2023, 9:28 PM
What was strange was seeing the pod as `Running`; I thought that if the probes couldn't reach it, it was due to a networking issue. But indeed the pod wasn't really running.
@Yee Yeah thanks for the tip. I'm learning on the way. 😅

Yee

09/27/2023, 9:30 PM
yeah we probably need to add a panic to serve and recover in the goroutine https://github.com/flyteorg/flyte/blob/e57cac0990fe5ec321e590fc49147014827e6dfd/cmd/single/start.go#L95