Anyone familiar with the following attachment any good sugge Flyte #flyte-support

refined-doctor-1380

07/13/2023, 2:34 PM

Is anyone familiar with the following (attachment)? Any good suggestions?

7/13/2023 9:16:22 AM UTC task submitted to K8s

7/13/2023 9:16:22 AM UTC Unschedulable:0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

microscopic-furniture-57275

07/13/2023, 2:41 PM

Hi @refined-doctor-1380, this happens when your Kubernetes cluster is unable to schedule the task's pod because (in your case) the CPU request can't be met: no node has enough unreserved CPU to accommodate the task. "No preemption victims" means the scheduler could not find lower-priority pods on the node to evict to make room. You can see that the task is queued; it will remain queued until its resource request can be satisfied. How this lack of resources is handled, e.g. whether a new node can be spun up, is determined by how you manage your k8s cluster. You might find some leads in this post, which speaks to the message you are seeing.

refined-doctor-1380

07/13/2023, 3:51 PM

Hi @microscopic-furniture-57275, thanks for the hint. Here is what I found: it looks like the pods in dev are all pending. However, the CPU/memory usage looks fine. Do you think it may be related to pod priority?

microscopic-furniture-57275

07/13/2023, 4:09 PM

If you (d)escribe the pods there, you can look at what the cpu and memory requests/limits are, and further down you can see any messages about attempts to schedule the pod. You can also use k9s to look at the node(s) that are currently running in your cluster, and from there look at what things are already running on the node and how much CPU/mem is being used.
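For anyone who prefers to read the request out of the pod manifest instead of the describe screen, here is a minimal sketch. It assumes the pod has been exported with `kubectl get pod <name> -o json`; the sample pod, its container names, and the helper function names are hypothetical and only illustrate the shape of the data. Init containers and overhead are ignored for simplicity.

```python
import json


def cpu_to_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "500m" or "2" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    # Whole or fractional cores, e.g. "2" or "0.5".
    return int(round(float(quantity) * 1000))


def pod_cpu_request_millicores(pod: dict) -> int:
    """Sum the CPU requests of the regular containers in a pod manifest."""
    total = 0
    for container in pod["spec"]["containers"]:
        requests = container.get("resources", {}).get("requests", {})
        total += cpu_to_millicores(requests.get("cpu", "0"))
    return total


# Hypothetical pod, trimmed to the fields this sketch reads.
sample_pod = json.loads("""
{
  "spec": {
    "containers": [
      {"name": "main", "resources": {"requests": {"cpu": "2"}}},
      {"name": "sidecar", "resources": {"requests": {"cpu": "250m"}}}
    ]
  }
}
""")

print(pod_cpu_request_millicores(sample_pod))  # 2250
```

The number to compare against is the node's allocatable CPU, not the CPU currently in use.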

refined-doctor-1380

07/13/2023, 4:36 PM

Wow, thanks. It still shows:

Warning  FailedScheduling  60m   default-scheduler  0/1 nodes are available: 1 Insufficient cpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod.

refined-doctor-1380

07/13/2023, 4:37 PM

It's weird, the CPU looks fine.

microscopic-furniture-57275

07/13/2023, 4:42 PM

In your k9s screenshot, you are looking at pods. Try looking at nodes, and then (d)escribe the node or nodes you see. You will be able to see how much CPU each has, how much is allocatable, and what else is running. It seems that the CPU requested by your tasks is greater than the CPU allocatable on your node(s).
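This also explains why the CPU can "look fine" while pods stay pending: the scheduler compares CPU requests against the node's allocatable CPU and does not look at live usage. A rough sketch of that arithmetic follows; the function name and all numbers are made up for illustration, and the real scheduler applies more checks than this one.

```python
def fits_on_node(
    allocatable_m: int,
    scheduled_requests_m: list[int],
    incoming_request_m: int,
) -> bool:
    """Return True if the incoming CPU request fits in the unreserved CPU.

    All values are in millicores. Actual CPU usage is deliberately not an
    input: only requests count toward what is reserved on the node.
    """
    free_m = allocatable_m - sum(scheduled_requests_m)
    return incoming_request_m <= free_m


# Hypothetical single-node cluster with 4 allocatable cores.
allocatable = 4000
# Requests of pods already on the node; they may be mostly idle.
already_requested = [1500, 1000, 1000]

print(fits_on_node(allocatable, already_requested, 1000))  # False
print(fits_on_node(allocatable, already_requested, 500))   # True
```

In the first call only 500m is unreserved, so a 1000m request is reported as "Insufficient cpu" even if the node is nearly idle.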

refined-doctor-1380

07/13/2023, 4:48 PM

Hi @microscopic-furniture-57275, I'm afraid I don't know exactly what a node is. Can you be more specific?

microscopic-furniture-57275

07/13/2023, 4:53 PM

Hey @refined-doctor-1380, I am @microscopic-furniture-57275 🙂 A node is a computer in your k8s cluster. If you are running a demo setup on a single computer, your cluster may have only one node; a k8s cluster may have many nodes. In k9s, you can select the type of resource to view: press ":" and then type "node". Once you have highlighted a node, press "d" to get a description. It will show you lots of statistics about this node. Use the arrow keys to scroll down and read everything. In particular, look for the allocatable CPU. Similarly, you can use k9s to view pods like you already did: press ":" and then type "pods". Use the arrow keys to scroll to a pod, and press "d" to describe it. There you can read how much CPU/memory the task/pod is requesting. You will probably find that the CPU being requested is larger than the CPU that is allocatable. You can fix this by adding another node to your cluster, freeing up resources on the existing node, or making the CPU request of your task smaller.

refined-doctor-1380

07/13/2023, 4:58 PM

Sorry for the typo @microscopic-furniture-57275.

refined-doctor-1380

07/13/2023, 5:03 PM

Hi @microscopic-furniture-57275, it looks like some nodes can't be used.

microscopic-furniture-57275

07/13/2023, 5:09 PM

Yes, I think you'll need to do some googling about your k8s/Flyte setup to understand this (or someone else in the community here can chime in). I'm not too knowledgeable about this, but I know that in some cases some nodes are provisioned to handle specific tasks, while other nodes are available to schedule workflows on. In any case, you can look at the CPU requests of the pods that are not being scheduled and perhaps reduce them so the pods can be scheduled.

refined-doctor-1380

07/13/2023, 5:11 PM

Thanks, I will google it. Thanks for helping me out here, it's really helpful.

refined-doctor-1380

07/13/2023, 5:11 PM

@microscopic-furniture-57275 really appreciate your help
