My flyte-scheduler pod suddenly started failing to...
# flyte-support
s
My flyte-scheduler pod suddenly started failing to start. Upon inspecting the logs, I get the following message:
time="2025-03-24T22:51:56Z" level=info msg="Using config file: [/etc/flyte/config/admin.yaml /etc/flyte/config/db.yaml /etc/flyte/config/server.yaml]"
Error: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 10.105.175.147:81: connect: connection refused"
Based on this message, it seems like flyte-scheduler is failing to communicate with flyteadmin, which is weird because the flyteadmin pod is running just fine. I also noticed that the cluster IP mentioned in the flyte-scheduler error doesn't match the IP assigned to the flyteadmin pod in the cluster. Does anyone know why this might happen?
c
Seems like a service discovery issue if you have a stale IP address.
f
Hmm, do you mean service discovery in k8s, or a bad config?
c
I suppose it could be either.
s
Thanks, looks like it was a stale IP.
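For anyone hitting the same error, here is a rough sketch of how the stale-IP theory could be checked from inside the cluster. It pulls the `IP:PORT` out of the `dial tcp` error, tries a plain TCP connection to it, and compares the dialed IP with what a DNS name resolves to right now. The helper names (`parse_dial_target`, `can_connect`, `is_stale`) are made up for illustration, and the service DNS name you pass in is something you would need to look up for your own deployment; none of this is part of Flyte itself.

```python
import re
import socket


def parse_dial_target(log_line):
    """Extract (ip, port) from a 'dial tcp IP:PORT' error line, or None if absent."""
    match = re.search(r"dial tcp (\d{1,3}(?:\.\d{1,3}){3}):(\d+)", log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def is_stale(dialed_ip, service_dns_name):
    """Return True if dialed_ip is not among the IPv4 addresses the name resolves to now.

    service_dns_name is an assumption: pass whatever DNS name your
    flyteadmin Service has in your cluster.
    """
    _, _, current_ips = socket.gethostbyname_ex(service_dns_name)
    return dialed_ip not in current_ips
```

Feeding the error line from the logs above into `parse_dial_target` gives the address the scheduler was dialing; `can_connect` then tells you whether anything is listening there, and `is_stale` whether that address still matches what DNS returns for the service name.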