Hi, I'm observing dynamic tasks stuck in Queued sta...
# ask-the-community
g
Hi, I'm observing dynamic tasks stuck in the Queued state on an EKS deployment. I also see Parallelism: 0, which might be the cause (the base dynamic task is 'Running', and any spawned tasks then can't run since Parallelism=0)? How do I set it in the `flyte-core` Helm deployment? I don't see a field for it in the values here. Or should I set it somewhere else? https://github.com/flyteorg/flyte/blob/master/charts/flyte-core/values.yaml
y
i don’t think that should matter. 0 parallelism should mean no limit
it’s probably stuck for other reasons.
are the pods created?
if they are, describe the pods. if not, check the propeller logs and grep for the execution id
g
Thanks @Yee! Pods were not created and I found some errors in the flytepropeller logs:
service "flyte-binary-webhook" not found
I'm using `flyte-core`, not `flyte-binary`, and none of the configmaps mention `flyte-binary`: `kubectl describe cm -n flyte | grep binary` returns nothing.
Why is it looking for this service, and how do I point it somewhere else?
I do have a `flyte-pod-webhook` Service.
It's strange because the error references both `flyte-pod-webhook` and `flyte-binary-webhook`:
RuntimeExecutionError: failed during plugin execution, caused by: failed to execute handle for plugin [container]: [InternalError] failed to create resource, caused by: Internal error occurred: failed calling webhook "flyte-pod-webhook.flyte.org": failed to call webhook: Post "https://flyte-binary-webhook.flyte.svc:443/mutate--v1-pod?timeout=10s": service "flyte-binary-webhook" not found
In the flytepropeller configmap:
webhook:
  certDir: /etc/webhook/certs
  serviceName: flyte-pod-webhook
y
the webhook is needed for some things yes. does your task have secrets?
g
Yeah, it does, but why is it looking for `flyte-binary` when I'm using `flyte-core`?
@Yee Found the cause: there was a leftover mutating webhook in the cluster from a previous test with `flyte-binary`. So even though we didn't have any references to `flyte-binary` in our current setup, this mutating webhook was still triggering. However, there was no Service to back it, hence the error and the tasks stuck in the Queued state. The fix:
kubectl delete mutatingwebhookconfigurations.admissionregistration.k8s.io flyte-binary-webhook
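For anyone hitting the same thing, here is a rough sketch of how such leftovers could be spotted before deleting anything. It assumes you have dumped `kubectl get mutatingwebhookconfigurations -o json` and `kubectl get svc -A -o json` and parsed them into dicts. The function name `find_dangling_webhooks` and the sample data are made up for illustration (the names just mirror this thread); only the field layout (`webhooks[].clientConfig.service.namespace` / `name`) follows the Kubernetes API.

```python
def find_dangling_webhooks(webhook_list, service_list):
    """List webhooks whose backing Service does not exist.

    Both arguments are parsed `kubectl ... -o json` List objects.
    Returns (config name, webhook name, "namespace/service") tuples.
    """
    # Every Service that actually exists, keyed by (namespace, name).
    existing = {
        (svc["metadata"]["namespace"], svc["metadata"]["name"])
        for svc in service_list.get("items", [])
    }
    dangling = []
    for cfg in webhook_list.get("items", []):
        for hook in cfg.get("webhooks") or []:
            ref = hook.get("clientConfig", {}).get("service")
            if ref is None:
                continue  # URL-based webhook, no in-cluster Service to check
            key = (ref["namespace"], ref["name"])
            if key not in existing:
                dangling.append(
                    (cfg["metadata"]["name"], hook["name"], f"{key[0]}/{key[1]}")
                )
    return dangling


# Illustrative sample data only, shaped like the situation in this thread.
webhooks = {
    "items": [
        {
            "metadata": {"name": "flyte-pod-webhook"},
            "webhooks": [
                {
                    "name": "flyte-pod-webhook.flyte.org",
                    "clientConfig": {
                        "service": {"namespace": "flyte", "name": "flyte-pod-webhook"}
                    },
                }
            ],
        },
        {
            "metadata": {"name": "flyte-binary-webhook"},  # leftover from an old test
            "webhooks": [
                {
                    "name": "flyte-pod-webhook.flyte.org",
                    "clientConfig": {
                        "service": {"namespace": "flyte", "name": "flyte-binary-webhook"}
                    },
                }
            ],
        },
    ]
}
services = {
    "items": [{"metadata": {"namespace": "flyte", "name": "flyte-pod-webhook"}}]
}

for config_name, hook_name, target in find_dangling_webhooks(webhooks, services):
    print(f"{config_name}: webhook {hook_name} points at missing Service {target}")
```

Anything it reports is only a candidate for cleanup: check that the configuration really is stale before running the delete above.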
y
ah nice.
thanks