# ask-the-community
a
Hey team, we deployed Flyte to an EKS cluster. Now when we run a workflow it shows status UNKNOWN. Can anyone help me with it? Thanks!
Screenshot 2023-09-26 at 3.05.39 PM.png
d
Hi @Anirudh Sridhar, can you share:
1. Status of Flyte's Pods:
kubectl get po -n flyte
2. Status of the task Pods (assuming you're using the example workflow with no changes):
kubectl get po -n flytesnacks-development
If some component is Pending or failed, could you run:
kubectl describe po <pod-name>
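A quick way to act on both checks at once: a small helper (the name `list_not_running` is made up for this sketch) that reads `kubectl get po` output and surfaces any Pod that isn't Running:

```shell
# list_not_running (hypothetical helper): reads `kubectl get po` output
# on stdin, skips the header row, and prints the name and status of any
# pod whose STATUS column ($3 for namespaced output) is not Running.
list_not_running() {
  awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Against a live cluster:
#   kubectl get po -n flyte | list_not_running
#   kubectl get po -n flytesnacks-development | list_not_running
```

Anything this prints is a candidate for `kubectl describe po <pod-name>`.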
a
all are running
@David Espejo (he/him)
kubectl get po -n flytesnacks-development
No resources found in flytesnacks-development namespace.
d
ok, are you running flyte-binary?
a
yes
kubectl get po -n flyte
NAME                            READY   STATUS    RESTARTS       AGE
flyte-binary-65fff8bf4b-n86kt   1/1     Running   9 (156m ago)   3h22m
d
please get logs from the Flyte pod
kubectl logs -n flyte flyte-binary-65fff8bf4b-n86kt
a
2023/09/26 10:44:25 /go/pkg/mod/gorm.io/gorm@v1.24.1-0.20221019064659-5dd2bb482755/callbacks.go:134 record not found
[0.566ms] [rows:0] SELECT * FROM "resources" WHERE resource_type = 'CLUSTER_RESOURCE' AND domain IN ('','staging') AND project IN ('','flytesnacks') AND workflow IN ('') AND launch_plan IN ('') ORDER BY priority desc,"resources"."id" LIMIT 1
{"json":{"src":"controller.go:297"},"level":"debug","msg":"syncing namespace [flytesnacks-staging]: ignoring unrecognized filetype [..2023_09_26_07_20_24.1701649631]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:297"},"level":"debug","msg":"syncing namespace [flytesnacks-staging]: ignoring unrecognized filetype [..data]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:477"},"level":"debug","msg":"successfully read template config file [001_namespace.yaml]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:329"},"level":"debug","msg":"Attempting to create resource [Namespace] in cluster [] for namespace [flytesnacks-staging]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:337"},"level":"debug","msg":"Type [Namespace] in namespace [flytesnacks-staging] already exists - attempting update instead","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:372"},"level":"info","msg":"Resource [Namespace] in namespace [flytesnacks-staging] is not modified","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:477"},"level":"debug","msg":"successfully read template config file [002_serviceaccount.yaml]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:329"},"level":"debug","msg":"Attempting to create resource [ServiceAccount] in cluster [] for namespace [flytesnacks-staging]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:337"},"level":"debug","msg":"Type [ServiceAccount] in namespace [flytesnacks-staging] already exists - attempting update instead","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:372"},"level":"info","msg":"Resource [ServiceAccount] in namespace [flytesnacks-staging] is not modified","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:607"},"level":"debug","msg":"Successfully created kubernetes resources for [flytesnacks-staging]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:611"},"level":"info","msg":"Completed cluster resource creation loop for namespace [flytesnacks-staging] with stats: [{Created:0 Updated:0 AlreadyThere:2 Errored:0}]","ts":"2023-09-26T10:44:25Z"}

2023/09/26 10:44:25 /go/pkg/mod/gorm.io/gorm@v1.24.1-0.20221019064659-5dd2bb482755/callbacks.go:134 record not found
[0.611ms] [rows:0] SELECT * FROM "resources" WHERE resource_type = 'CLUSTER_RESOURCE' AND domain IN ('','production') AND project IN ('','flytesnacks') AND workflow IN ('') AND launch_plan IN ('') ORDER BY priority desc,"resources"."id" LIMIT 1
{"json":{"src":"controller.go:297"},"level":"debug","msg":"syncing namespace [flytesnacks-production]: ignoring unrecognized filetype [..2023_09_26_07_20_24.1701649631]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:297"},"level":"debug","msg":"syncing namespace [flytesnacks-production]: ignoring unrecognized filetype [..data]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:477"},"level":"debug","msg":"successfully read template config file [001_namespace.yaml]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:329"},"level":"debug","msg":"Attempting to create resource [Namespace] in cluster [] for namespace [flytesnacks-production]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:337"},"level":"debug","msg":"Type [Namespace] in namespace [flytesnacks-production] already exists - attempting update instead","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:372"},"level":"info","msg":"Resource [Namespace] in namespace [flytesnacks-production] is not modified","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:477"},"level":"debug","msg":"successfully read template config file [002_serviceaccount.yaml]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:329"},"level":"debug","msg":"Attempting to create resource [ServiceAccount] in cluster [] for namespace [flytesnacks-production]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:337"},"level":"debug","msg":"Type [ServiceAccount] in namespace [flytesnacks-production] already exists - attempting update instead","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:372"},"level":"info","msg":"Resource [ServiceAccount] in namespace [flytesnacks-production] is not modified","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:607"},"level":"debug","msg":"Successfully created kubernetes resources for [flytesnacks-production]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:611"},"level":"info","msg":"Completed cluster resource creation loop for namespace [flytesnacks-production] with stats: [{Created:0 Updated:0 AlreadyThere:2 Errored:0}]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:615"},"level":"info","msg":"Completed cluster resource creation loop with stats: [{Created:0 Updated:0 AlreadyThere:6 Errored:0}]","ts":"2023-09-26T10:44:25Z"}
{"json":{"src":"controller.go:633"},"level":"info","msg":"Successfully completed cluster resource creation loop","ts":"2023-09-26T10:44:25Z"}

2023/09/26 10:44:26 /go/pkg/mod/gorm.io/gorm@v1.24.1-0.20221019064659-5dd2bb482755/callbacks.go:134
[0.586ms] [rows:0] SELECT * FROM "schedulable_entities"
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/f678b4cd6aab741aba5b]","ts":"2023-09-26T10:44:26Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/fac0ae320b8614b889e6]","ts":"2023-09-26T10:44:26Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/fe884109a2062487185a]","ts":"2023-09-26T10:44:26Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/arqddn7q6q99k6ltglwb]","ts":"2023-09-26T10:44:26Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/fe884109a2062487185a]","ts":"2023-09-26T10:44:36Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/arqddn7q6q99k6ltglwb]","ts":"2023-09-26T10:44:36Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/f678b4cd6aab741aba5b]","ts":"2023-09-26T10:44:36Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/fac0ae320b8614b889e6]","ts":"2023-09-26T10:44:36Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/f678b4cd6aab741aba5b]","ts":"2023-09-26T10:44:46Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/fac0ae320b8614b889e6]","ts":"2023-09-26T10:44:46Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/fe884109a2062487185a]","ts":"2023-09-26T10:44:46Z"}
{"json":{"src":"controller.go:160"},"level":"info","msg":"==\u003e Enqueueing workflow [flytesnacks-development/arqddn7q6q99k6ltglwb]","ts":"2023-09-26T10:44:46Z"}
do I need to create the flytesnacks-development namespace?
also, I checked that the workflow is pushed to S3
d
> do I need to create the flytesnacks-development namespace?
no, this one is created as one of the initialProjects
The Pod has been restarted by K8s 9 times, so there has to be something in the logs. Can you also do a
kubectl describe po -n flyte flyte-binary-65fff8bf4b-n86kt
a
what do you want to look at in the describe output?
also, we got some errors because of Ray, which were resolved once we installed KubeRay
I don't know what the issue is, as everything looks fine
d
can you verify that the SA has been annotated?
kubectl describe sa -n flyte flyte-binary
so there are no Task Pods being created?
kubectl get po -A
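For reference, a ServiceAccount annotated for IRSA (the usual way flyte-binary gets AWS credentials on EKS) looks roughly like this; the account ID and role name below are placeholders, not values from this thread:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flyte-binary
  namespace: flyte
  annotations:
    # IRSA: the IAM role the Flyte pod assumes for S3 and other AWS access.
    # ARN below is a placeholder.
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/<flyte-role-name>
```

If `kubectl describe sa` shows no `eks.amazonaws.com/role-arn` annotation, the Pod has no AWS identity, which commonly breaks task submission.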
a
no
flyte         flyte-binary-65fff8bf4b-n86kt                   1/1     Running   9 (173m ago)   3h39m
kube-system   aws-cloudwatch-metrics-6hvr5                    1/1     Running   1 (21h ago)    21h
kube-system   aws-cloudwatch-metrics-cgb8v                    1/1     Running   0              21h
kube-system   aws-cloudwatch-metrics-zlb9n                    1/1     Running   0              21h
kube-system   aws-cluster-autoscaler-7878b6fcd5-mpjr2         1/1     Running   0              4h22m
kube-system   aws-load-balancer-controller-68d5864d5d-hjshh   1/1     Running   0              4h22m
kube-system   aws-load-balancer-controller-68d5864d5d-s74t2   1/1     Running   0              4h22m
kube-system   aws-node-5qgqz                                  1/1     Running   0              21h
kube-system   aws-node-8xpxn                                  1/1     Running   0              21h
kube-system   aws-node-cnsbt                                  1/1     Running   0              21h
kube-system   coredns-6bc4667bcc-g82mk                        1/1     Running   0              21h
kube-system   coredns-6bc4667bcc-w694l                        1/1     Running   0              21h
kube-system   external-dns-dbd54cbcb-x5vh4                    1/1     Running   0              21h
kube-system   kube-proxy-bmn98                                1/1     Running   0              21h
kube-system   kube-proxy-gwcsd                                1/1     Running   0              21h
kube-system   kube-proxy-ms8rr                                1/1     Running   0              21h
ray-system    kuberay-apiserver-bdb679dc-n9qg4                1/1     Running   0              3h31m
ray-system    kuberay-operator-dc7968898-qw8vq                1/1     Running   0              3h31m
@David Espejo (he/him)?
d
sorry, I had to drop (take the kids to school). Well, this is strange behavior; can you check the logs from the previous Pod execution?
kubectl logs -n flyte flyte-binary-65fff8bf4b-n86kt --previous
I've seen this behavior (UNKNOWN status) before, but every time it's been a K8s scheduling problem and I saw the task Pods in Pending state. In your env no Pod is being created at all, so to me it looks more like a permissions issue somewhere. I think we can have a screen-share session to debug every layer and isolate the problem
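Since the Pod has restarted 9 times, the crash reason is likely in the previous container's logs. A small filter (the helper name `errors_only` is made up for this sketch) to surface only error-level lines from Flyte's JSON-formatted logs:

```shell
# errors_only (hypothetical helper): keep only error/fatal/panic level
# lines from JSON-formatted log output, dropping info/debug noise.
errors_only() {
  grep -E '"level":"(error|fatal|panic)"'
}

# Against the cluster:
#   kubectl logs -n flyte flyte-binary-65fff8bf4b-n86kt --previous | errors_only
```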
a
Ya sure