# ask-the-community
**Derek Yu:**
Hi there, does anyone have any tips on debugging why workflows are stuck in queue indefinitely?
**y:**
Not an expert, but have you tried
`flytectl get execution -p flytesnacks -d development <execution_id> --details`
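For reference, here is a rough programmatic sketch of the same check using flytekit's FlyteRemote client (the project, domain, and execution id below are placeholders, and attribute names can differ slightly across flytekit versions):

```python
# Inspect an execution's node phases from Python instead of flytectl.
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote

remote = FlyteRemote(
    Config.auto(),                    # picks up the usual flytekit/flytectl config
    default_project="flytesnacks",
    default_domain="development",
)

execution = remote.fetch_execution(name="<execution_id>")
execution = remote.sync_execution(execution, sync_nodes=True)

# A node stuck in QUEUED / WAITING_FOR_RESOURCES should stand out here.
for node_id, node_execution in execution.node_executions.items():
    print(node_id, node_execution.closure.phase)
```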
**Ketan (kumare3):**
Stuck in the queue indefinitely? Can you share more details?
**b:**
I would check the Kubernetes resources currently in use versus what the Flyte project is allowed to consume.
**y:**
@Ketan (kumare3) FYI, I previously had a task queued indefinitely because it exceeded the resource quota:
```
└── start-node - SUCCEEDED - 2022-10-15 03:11:44.931890319 +0000 UTC - 2022-10-15 03:11:44.931890319 +0000 UTC
└── n0 - RUNNING - 2022-10-15 03:11:44.950847963 +0000 UTC - 2022-10-15 03:11:45.080042711 +0000 UTC
    └── Attempt :0
        └── Task - WAITING_FOR_RESOURCES - 2022-10-15 03:11:45.075151304 +0000 UTC - 2022-10-15 03:11:45.075151304 +0000 UTC
        └── Task Type - python-task
        └── Reason - Exceeded resourcequota: [BackOffError] The operation was attempted but failed, caused by: pods "feaeaf1c97c574fd0aa8-n0-0" is forbidden: exceeded quota: project-quota, requested: limits.memory=500Mi, used: limits.memory=3000Mi, limited: limits.memory=3000Mi
        └── Metadata
        │   ├── Generated Name : feaeaf1c97c574fd0aa8-n0-0
        │   ├── Plugin Identifier : container
        │   ├── External Resources
        │   ├── Resource Pool Info
        └── Logs :
```
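For context (a sketch, not something from this thread): the `limits.memory=500Mi` in that error is whatever the task declares through flytekit's `Resources`, which Kubernetes then checks against the namespace `ResourceQuota`. A minimal, hypothetical example of where those numbers get set:

```python
# Hypothetical task definition (not from this thread) showing where the
# requested/limited memory in the quota error comes from. The task name
# and sizes are made up for illustration.
from flytekit import Resources, task


@task(
    requests=Resources(cpu="1", mem="500Mi"),  # what the pod asks the scheduler for
    limits=Resources(cpu="2", mem="500Mi"),    # counted against limits.memory in the ResourceQuota
)
def crunch_numbers(n: int) -> int:
    # Placeholder body; the real task would do the memory-heavy work.
    return sum(range(n))
```

If the pods already running in the namespace have consumed the whole quota (used equals limited, as in the error above), new pods are rejected and the node sits in WAITING_FOR_RESOURCES until capacity frees up, the task's limits are lowered, or the quota is raised.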
**Ketan (kumare3):**
I do agree, that is the usual case.
For some reason our default Helm chart had a very restrictive quota.
We have now removed it, but that change is only recent.
@Derek Yu you might want to run:
`kubectl describe resourcequota`
**Derek Yu:**
Thank you all for your help 🙂 and sorry for the lack of details. We dug into it, and it turned out that the task needed a larger node, with more memory than the ones we had available.