# ask-the-community
l
Hello, we've been running two instances of the flyte propeller, but we noticed a disparity in memory consumption between the two instances. Flyte 1.9.1 is installed on GKE via helm. Looking at the past 2 months, it seems one of the propellers is underutilised or not utilised at all, since its memory doesn't go up with load/time. The earlier metrics show Flyte on 1.8.1 (before Sep 23). Some questions:
• is there something wrong that causes this uneven distribution?
• it seems that memory utilisation increases with time; is there any internal caching by the propeller that causes this behaviour?
• is any autoscaling available?
any other advice would be appreciated, thanks!
k top pod
NAME                                  CPU(cores)   MEMORY(bytes)
flytepropeller-ddb88df5-4wltg         6m           758Mi
flytepropeller-ddb88df5-j8z4c         2m           67Mi
k
Did you run propeller manager?
Propeller is leader-elected and you cannot simply run multiple copies.
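One quick way to confirm which replica actually holds the leader lock is to inspect the lock object; a rough sketch, assuming the flyte namespace (the lock name propeller-leader is just a guess, check your propeller leader-election config):
```
# Newer client-go leader election uses Lease objects; look for the holderIdentity
kubectl -n flyte get lease

# Older ConfigMap-based locks record the holder in an annotation
# ("propeller-leader" is a guess at the lock name; check your leader-election config)
kubectl -n flyte get configmap propeller-leader \
  -o jsonpath='{.metadata.annotations.control-plane\.alpha\.kubernetes\.io/leader}'
```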
If you want to run multiple copies, that is done through sharding, which is all automatically managed by the propeller manager.
The above graph shows single-propeller mode.
Also, you should tweak some configs. A single propeller can handle 1000s of workflows per second.
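For reference, a minimal sketch of what enabling the propeller manager with hash sharding can look like in the propeller config; key names are from the Flyte scale-out docs as I remember them, so verify against the version you run:
```yaml
# Sketch only; verify key names against the Flyte docs for your version.
manager:
  pod-application: "flytepropeller"             # label applied to the managed shard pods
  pod-template-container-name: "flytepropeller"
  shard:
    type: Hash        # distribute workflows across shards by hashing
    shard-count: 3    # number of propeller shard pods the manager keeps running
```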
l
Did you run propeller manager?
no, didn't know we had to 😅, assumed it would be load-balanced once we have replicas > 1, like the other components. I'll test out the sharding, thanks for referring to the docs.
Also, you should tweak some configs. A single propeller can handle 1000s of workflows per second.
I checked the metrics but they don't show any anomalies:
flyte:propeller:all:free_workers_count
sum(rate(flyte:propeller:all:round:raw_ms[5m])) by (wf)
sum(rate(flyte:propeller:all:main_depth[5m]))
We probably have at most 20 workflows running at any time for now, on a propeller with 1Gi of memory. We'll just increase the memory and monitor further. Thanks!
• it seems that memory utilisation increases with time; is there any internal caching by the propeller that causes this behaviour?
With regards to this, I reduced the number of replicas from 2 to 1, since the other propeller isn't in use anyway. So the resources for the single replica were doubled, but the memory consumption dropped drastically. The number of running workflows after the deployment was even higher.
k
Yes, propeller maintains a lookaside-style cache for many of the data items it works with.
It will grow greedily and then hold steady.
If memory is what you were sharing, I am not worried. Just add, I would say, 4GB, change the cache in the storage config to 2G or more, and let it run with more workers, like 400.
Propellers will be fine
This is not Airflow - it will scale really well.
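For concreteness, a rough sketch of where those knobs live, using the numbers mentioned above; the exact Helm value paths that feed the flyte-propeller-config ConfigMap vary by chart version, so treat this as illustrative:
```yaml
# Illustrative values only, matching the advice above.
propeller:
  workers: 400              # more parallel workflow evaluation workers
storage:
  cache:
    max_size_mbs: 2048      # in-memory lookaside cache size (~2G)
    target_gc_percent: 70
# Plus bump the flytepropeller pod memory (set via your Helm values), e.g.:
# resources:
#   limits:
#     memory: 4Gi
```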
l
Got it. Are there any key benefits to running multiple propellers, given there is more synchronisation involved? Otherwise we can stick to a single propeller (like we have all this while).
k
The benefit is scale, when you have lots of concurrent executions.