Hey Folks, question about queueing. Suppose I hav...
# flyte-support
r
Hey Folks, question about queueing. Suppose I have 5 CPUs and 10 workflows that each need 1 CPU. I launch all the workflows and they all start concurrently, but some obviously get queued because there are only 5 CPUs available. The Flyte WebUI shows the queued jobs as "Running" when they are actually (technically) in a "queued" or "pending" state... is there any way to get the "pending" report in the WebUI? I see this behavior in e.g. the Flyte Sandbox, am curious if the behavior changes in ... perhaps the non-single-binary install of Flyte?
Any thoughts, maybe @average-finland-92144? I see that in `map_task` there are tasks in a `queued` state until they run.
a
Hey @rapid-artist-48509 I think it has to do with the source of the phase that is reported to the UI. Flyte's execution engine dispatches the execution to the K8s API and, considering it left the propeller queue and has an assigned worker, it's considered to be in a "Running" phase even if it's Pending in K8s. Same goes for map_tasks, where the `queued` state may refer to propeller's own queue and not the underlying K8s scheduler.
Does that make sense?
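A toy model of the situation described above may help. This is only a sketch of the two layers of state (the phase shown in the UI versus the Pod phase in K8s), not Flyte or Kubernetes code; the `Execution` class and `dispatch_all` function are hypothetical names made up for illustration, and the 10 workflows / 5 CPUs figures come from the question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Execution:
    name: str
    cpu: int
    ui_phase: str = "Queued"          # what the toy "UI" reports
    pod_phase: Optional[str] = None   # toy K8s Pod phase; None until dispatched


def dispatch_all(executions, free_cpu):
    """Toy model: every execution is handed to 'K8s'; pods run only while CPU remains."""
    for ex in executions:
        # Once dispatched, the toy UI considers the execution Running,
        # regardless of what the scheduler does with the pod.
        ex.ui_phase = "Running"
        if ex.cpu <= free_cpu:
            free_cpu -= ex.cpu
            ex.pod_phase = "Running"
        else:
            ex.pod_phase = "Pending"  # waiting on the scheduler, invisible to the toy UI
    return executions


executions = dispatch_all([Execution(f"wf-{i}", cpu=1) for i in range(10)], free_cpu=5)
ui_running = sum(ex.ui_phase == "Running" for ex in executions)
pods_pending = sum(ex.pod_phase == "Pending" for ex in executions)
print(f"UI shows Running: {ui_running}, pods actually Pending: {pods_pending}")
# prints: UI shows Running: 10, pods actually Pending: 5
```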
r
oh got it, thank you! So, I believe I configured my domain/project so that it has a `task_resources.limits.cpu` of 5. Then I launch 10 workflows... doesn't that mean that some of the workflows / tasks should be queued by the Flyte Propeller then? I'm using single binary deployment-- does the Propeller workflow queueing still work here, or does it need the non-single-binary deployment? Maybe I need to double-check my domain/project limits... I can totally believe if the limits are actually very high (unlimited) then Propeller is just gonna throw everything at K8s and let K8s figure it out.
a
This is not related to the Helm chart you used to deploy Flyte. flytepropeller has an execution queue that is processed in response to certain events. If an execution is reported as `queued`, it's because it's in the propeller queue; once it's dispatched to K8s, the phase is considered to be `Running`.
Notice that `queued` is not a Kubernetes Pod phase, so once an execution leaves the propeller queue it's up to the K8s scheduler to accommodate the resources so the Pods get to completion. If any error arises, the execution goes back to the queue, but `Pending` due to resource constraints is not a trigger there. I hope this is somewhat clear to you.
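Since the UI does not surface the K8s-level `Pending` state, one way to see it is to look at the Pods directly. The sketch below is not part of Flyte; it assumes you have saved the JSON output of `kubectl get pods -o json` for the namespace your executions run in, and it filters for Pods that are `Pending` because the scheduler marked them `Unschedulable`. The function name `unschedulable_pods` and the sample data (pod names, message text) are hypothetical.

```python
import json


def unschedulable_pods(pod_list):
    """Return (name, message) for Pending pods the scheduler could not place."""
    found = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        for cond in status.get("conditions", []):
            if (cond.get("type") == "PodScheduled"
                    and cond.get("status") == "False"
                    and cond.get("reason") == "Unschedulable"):
                found.append((pod["metadata"]["name"], cond.get("message", "")))
    return found


# Hypothetical sample in the shape of `kubectl get pods -o json` output.
sample = json.loads("""
{
  "items": [
    {"metadata": {"name": "wf-1-n0-0"},
     "status": {"phase": "Running", "conditions": []}},
    {"metadata": {"name": "wf-6-n0-0"},
     "status": {"phase": "Pending",
                "conditions": [{"type": "PodScheduled", "status": "False",
                                "reason": "Unschedulable",
                                "message": "0/1 nodes are available: 1 Insufficient cpu."}]}}
  ]
}
""")

for name, message in unschedulable_pods(sample):
    print(f"{name}: {message}")
```

To try it against a real cluster, load the saved `kubectl` output with `json.load` in place of the inline sample.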
r
thank you!
@average-finland-92144 do you have the new version of that link? it's now broken because it forwards to a union.ai landing page
yeah i cannot find the architecture page anymore on https://www.union.ai/docs/flyte/user-guide/
b
Hi @rapid-artist-48509. We caught up with more links. This one should work now as before: https://docs.flyte.org/en/latest/user_guide/concepts/component_architecture/flytepropeller_architecture.html#workqueue-workerpool
r
sweeet thank you @bulky-gold-93144 @average-finland-92144!