# flyte-support
a
We've recently made an update to our flyte setup to use self-hosted agents for running more of our tasks, particularly because we often need to run thousands of the same type of task (e.g. querying a database) and it seems more efficient to run these from a centralised agent service. Since making this change, we've been trying to process a large number of workflows at scale (e.g. 15K concurrent workflows) but are seeing very degraded performance. Some tasks spend multiple hours in a 'queued' state even when there are agents available for the tasks to be scheduled onto. We're having trouble isolating whether this is related to our agent configuration, or to flytepropeller itself struggling under the load. We're running a self-hosted flyte cluster using the flyte-core helm chart. Are there any common gotchas or config settings we should look at for scheduling many concurrent agent tasks?
f
We consistently run 15k workflow completions per second on a single propeller
It does need specific tweaking
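For reference, a sketch of the propeller-side knobs that usually need raising for this kind of load, wired in through flyte-core's configmap values. The key names follow flytepropeller's config sections, but defaults shift between versions and the numbers here are illustrative, not recommendations:
```yaml
propeller:
  workers: 100              # concurrent workflow-evaluation workers; the default is conservative
  kube-client-config:
    qps: 100                # client-side throttling against the kube API server;
    burst: 200              # stock settings throttle hard at thousands of concurrent workflows
    timeout: 30s
  queue:
    type: batch
    batching-interval: 1s   # how often queued workflow re-evaluations are drained
    batch-size: -1
    queue:
      type: maxof
      rate: 500             # workqueue rate limit; too low leaves items sitting queued
      capacity: 1000
admin-launcher:
  tps: 100                  # rate of propeller -> flyteadmin calls
  burst: 200
  cacheSize: 10000
  workers: 50
```
If tasks sit in queued while agents idle, the workqueue rate limit and the kube client QPS/burst are common first suspects.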
a
Is the tweaking on the propeller side, or on the agent (/connector) side? I.e. is there some limit on the number of requests an agent can receive? If we run `pyflyte serve agent` in a container on a well-provisioned k8s pod, is it expected to handle any traffic propeller throws at it, or are there specific settings we might need to change?
For clarity, we're not using the `flyteagent` deployment that comes with flyte-core, but a standalone k8s deployment that runs the agent server and handles `MyAgentTask` types.
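For context on that setup: with a standalone deployment, propeller reaches the agent through the agent-service plugin config rather than the bundled `flyteagent` service. A sketch of that wiring, where the endpoint/service name are placeholders and the `webApi` keys come from the web-API plugin framework the agent plugin is built on (check them against your Flyte version):
```yaml
tasks:
  task-plugins:
    enabled-plugins:
      - agent-service
    default-for-task-types:
      MyAgentTask: agent-service
plugins:
  agent-service:
    supportedTaskTypes:
      - MyAgentTask
    defaultAgent:
      endpoint: "dns:///my-agent.my-namespace.svc.cluster.local:8000"  # placeholder service
      insecure: true
      defaultTimeout: 10s
    webApi:
      readRateLimiter:
        qps: 100          # propeller-side limit on get/poll calls to the agent
        burst: 200
      writeRateLimiter:
        qps: 100          # ...and on create/delete calls
        burst: 200
      caching:
        workers: 50       # pollers refreshing task state from the agent
        resyncInterval: 30s
        size: 500000
      resourceQuotas:
        default: 20000    # cap on in-flight agent tasks; a full quota leaves new tasks in 'queued'
```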
f
You should use asyncio
And you should tweak propeller too
So both
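To make the asyncio point concrete: the agent server runs handlers on an event loop, so a synchronous database client inside `create`/`get` blocks the loop and serializes every in-flight request, which looks exactly like tasks stuck in queued. A minimal sketch against `flytekit.extend.backend.base_agent` (the interface has shifted across flytekit versions, and the query helpers below are stand-ins for a real non-blocking driver such as asyncpg or aiohttp):
```python
from dataclasses import dataclass
from typing import Optional

from flyteidl.core.execution_pb2 import TaskExecution
from flytekit.extend.backend.base_agent import (
    AgentRegistry,
    AsyncAgentBase,
    Resource,
    ResourceMeta,
)
from flytekit.models.literals import LiteralMap
from flytekit.models.task import TaskTemplate


# Stand-ins for a real non-blocking driver: the point is that they're
# awaitable, so the server can interleave thousands of requests.
async def submit_query_async(task_template: TaskTemplate) -> str:
    return "job-123"


async def poll_query_async(job_id: str) -> bool:
    return True


async def cancel_query_async(job_id: str) -> None:
    pass


@dataclass
class QueryMetadata(ResourceMeta):
    job_id: str


class MyAgent(AsyncAgentBase):
    name = "My Agent"

    def __init__(self):
        super().__init__(task_type_name="MyAgentTask", metadata_type=QueryMetadata)

    # async def, not def: one blocking driver call here would stall the
    # event loop and queue every other request behind it.
    async def create(
        self, task_template: TaskTemplate, inputs: Optional[LiteralMap] = None, **kwargs
    ) -> QueryMetadata:
        job_id = await submit_query_async(task_template)
        return QueryMetadata(job_id=job_id)

    async def get(self, resource_meta: QueryMetadata, **kwargs) -> Resource:
        done = await poll_query_async(resource_meta.job_id)
        phase = TaskExecution.SUCCEEDED if done else TaskExecution.RUNNING
        return Resource(phase=phase)

    async def delete(self, resource_meta: QueryMetadata, **kwargs) -> None:
        await cancel_query_async(resource_meta.job_id)


AgentRegistry.register(MyAgent())
```
The agent is served the same way as before with `pyflyte serve agent`; the only change is that the handlers are coroutines end to end.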