Hi, I’m trying to understand how Flyte works under...
# ask-the-community
r
Hi, I’m trying to understand how Flyte works under the hood to evaluate whether it can deliver the performance we need. I have a deployment on EKS following the Single Cluster Simple Cloud Deployment guide and have executed some simple workflows. After looking at FlytePropeller Architecture, the FlytePropeller Deep Dive video on YouTube, and Optimizing Performance, I still can’t map them to what is running in my cluster.
1. What are the actual components running in my EKS cluster that represent FlyteAdmin, FlytePropeller, and the WorkQueue? There is one Pod, `flyte-backend-flyte-binary-xxx`. Does that include everything, so that I can only scale everything together?
2. "FlytePropeller can scale to 1000s of workers on a single CPU": "worker" is used a lot in regard to FlytePropeller, but what does it actually mean? An instance of FlytePropeller, i.e. a Pod? A process inside the `flyte-backend-flyte-binary` Pod? A node in the cluster? What is a worker, and how can I observe what it is doing?
3. How is scaling of the cluster supposed to work? Assume I want to increase the number of concurrent tasks. How would I make sure that the cluster can handle it? Scaling out FlyteAdmin, Scaling out Datacatalog, and Scaling out FlytePropeller do not describe what I actually need to change to make it work, except deploying the “FlytePropeller Manager”. I don’t have a deployment in my cluster that is called “FlyteAdmin” or “Datacatalog”, so I’m not sure what is meant by “Datacatalog is a stateless service and its replicas (in the kubernetes deployment) can be simply increased to allow higher throughput”.
Apologies in advance if these questions are trivial. It would also help me if you could point me to some design documents or similar that I can read to answer my questions 🙂
t
There are 2 different types of Flyte deployment: `flyte-binary` and `flyte-core` (https://docs.flyte.org/en/latest/deployment/deployment/index.html#helm). It sounds like you are using `flyte-binary`. If you use `flyte-core` there will be separate deployments for flyteadmin, flytepropeller, datacatalog, etc. I don't really know when it's best to use `flyte-binary`, but if you want to scale to multiple Kubernetes clusters, `flyte-core` is the only option. Also, "worker" means a goroutine within the flytepropeller process. There can be 1000s of workers per pod using just a few actual CPU cores.
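To make "a worker is a goroutine" a bit more concrete, here is a minimal Go sketch of a pool of goroutines draining a shared queue. It is only an illustration of the general pattern, not FlytePropeller's actual code: `processAll`, the channel used as a queue, and the `wf-N` IDs are all made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// processAll starts numWorkers goroutines that drain a shared queue of
// workflow IDs. Each goroutine plays the role of one "worker".
// It returns the total number of items processed.
// NOTE: hypothetical helper for illustration, not part of Flyte.
func processAll(numWorkers int, workflowIDs []string) int {
	queue := make(chan string)
	var wg sync.WaitGroup
	var mu sync.Mutex
	processed := 0

	for i := 0; i < numWorkers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range queue {
				// A real worker would evaluate the workflow here.
				mu.Lock()
				processed++
				mu.Unlock()
			}
		}()
	}

	// Feed the queue, then close it so the workers exit their loops.
	for _, id := range workflowIDs {
		queue <- id
	}
	close(queue)
	wg.Wait()
	return processed
}

func main() {
	ids := make([]string, 5000)
	for i := range ids {
		ids[i] = fmt.Sprintf("wf-%d", i)
	}
	// 1000 workers run fine regardless of how many CPU cores exist.
	fmt.Println("CPUs:", runtime.NumCPU(), "processed:", processAll(1000, ids))
}
```

Goroutines are scheduled by the Go runtime onto a small number of OS threads, which is why a worker count in the thousands does not require a matching number of CPU cores.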
r
I see. Thanks Tom for the clarification 🙌 I am indeed using the binary deployment. However, it was not clear to me that it is limited in terms of scaling. The docs say “We recommend the Single Cluster option for a capable though not massively scalable cluster.” So even with only a single cluster, it does make sense to use the multi-cluster deployment for better scalability.
d
@Rene Penkert welcome to the Flyte community. `flyte-binary` is effectively all the Flyte components packed together, and flytepropeller can be sharded to accommodate more concurrent executions. Nevertheless, `flyte-core`, while designed for environments with multiple K8s clusters, can be useful to connect that scale-out pattern to observable resources on K8s, like Pods.
r
If I need to scale within a single cluster with the `flyte-binary` deployment, is it sufficient to scale out the deployment of the binary, or will that not have any effect?
d
Having multiple binary Pods would not be enough, I think; you'd need to make use of the Propeller manager. The process to configure it on single-binary doesn't seem to be documented yet.
r
Not sure that it can actually be applied to the binary Pod. As I understand it, the manager interacts with the Propeller instances to assign different shards, and those instances do not exist separately within the binary deployment.
d
@Rene Penkert you are exactly correct. Propeller manager is a separate component that basically acts as a k8s replica set to manage multiple flytepropeller instances. It is necessary, because flyte uses FlyteWorkflow custom resources in k8s to track workflow executions (FlyteAdmin creates a FlyteWorkflow CR and propeller uses it to orchestrate execution). The separate shards use k8s label selectors to ensure that only a single propeller instance is executing on a FlyteWorkflow. If multiple instances attempt to orchestrate the same workflow there can be erroneous scenarios.
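As a rough illustration of why sharding keeps two instances from picking up the same workflow, here is a small Go sketch that assigns each execution to exactly one shard with a deterministic hash. This is an assumption-laden sketch and not how Propeller manager is actually implemented: `shardFor`, `ownedBy`, the FNV hash, and the `exec-*` IDs are all hypothetical. In a real cluster the filtering would happen through k8s label selectors on the FlyteWorkflow CRs rather than an in-memory loop.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps an execution ID to one of shardCount shards.
// The mapping is deterministic, so every instance computes the same answer.
// NOTE: hypothetical helper; the hash choice is an assumption.
func shardFor(executionID string, shardCount uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(executionID)) // Write on a hash.Hash never returns an error
	return h.Sum32() % shardCount
}

// ownedBy returns the executions that the instance responsible for `shard`
// would work on. It stands in for a label selector on the shard value.
func ownedBy(shard, shardCount uint32, executionIDs []string) []string {
	var owned []string
	for _, id := range executionIDs {
		if shardFor(id, shardCount) == shard {
			owned = append(owned, id)
		}
	}
	return owned
}

func main() {
	ids := []string{"exec-a", "exec-b", "exec-c", "exec-d", "exec-e"}
	const shards = 3
	for s := uint32(0); s < shards; s++ {
		fmt.Printf("shard %d owns %v\n", s, ownedBy(s, shards, ids))
	}
}
```

Because every execution maps to exactly one shard, the per-shard sets never overlap, which is the property that prevents two instances from orchestrating the same workflow.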
To dive into the term "propeller workers" a bit: propeller has a number of goroutines, as @Thomas Newton suggested (thanks!). At a high level, propeller periodically looks at the state of a workflow and attempts to make progress (scheduling new tasks, etc.). This logic is performed by a single worker on a single workflow at a time, so scaling the number of propeller workers essentially improves parallelism at the workflow execution level. It is important to note here that the time it takes each worker to progress a workflow should be <1s, so scalability is seldom an issue. Scaling to more workflow executions will not break propeller; rather, all workers will be continually used and workflow executions may slow.
r
Thanks a lot! That helps! I will finish my current tests and do another installation with `flyte-core` to see the different components in action 😉 Might come back with some additional questions when I’m at that point 😜 Thanks a lot for the help 👍
k
Great to meet you, @Rene Penkert. Would love to understand more about the use case.