# ask-the-community
c
Hi All! Happy New Year! I have a `spark`-related doubt here. Scenario: currently, while executing the Spark workflow, the driver and the executors are scheduled in different pods. E.g., we have 1 driver (4 cores CPU, 8 GB memory) and 4 executors (4 cores CPU, 8 GB memory each) -> 1 node for the driver pod and 1 node to accommodate all 4 executor pods. The node that accommodates the executors is very large, because the request sent for that node is the sum of the CPU and memory of all 4 executors combined, so the request exceeds 16 cores and 32 GB memory, which will be inefficient as the number of executors grows. Is there a workaround/fix to make this scale horizontally, i.e. spawn the 4 executor pods on separate nodes (or a combination of n nodes holding n executor pods each), so that we have nodes running in parallel instead of pods running in parallel inside a single very large node?
Any thoughts on procedures or workarounds would be appreciated.
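One possible approach (a sketch, not from this thread) is to attach a topology spread constraint to the executor pods via a Spark 3.x executor pod template, so the Kubernetes scheduler spreads executors across nodes instead of packing them onto one. The Flytekit wiring, file path, and resource sizes below are assumptions; `spark.kubernetes.executor.podTemplateFile` and the `spark-role: executor` label are standard Spark-on-Kubernetes, but check your Spark and plugin versions.

```python
# Minimal sketch (illustrative, not from the thread): spread Spark executors across
# nodes by pointing Spark on Kubernetes at an executor pod template that carries a
# topologySpreadConstraint. Assumes flytekitplugins-spark is installed and the
# template file path exists in the task image.
import yaml
from flytekit import task
from flytekitplugins.spark import Spark

EXECUTOR_TEMPLATE_PATH = "/opt/spark/conf/executor-pod-template.yaml"  # assumed path

executor_pod_template = {
    "apiVersion": "v1",
    "kind": "Pod",
    "spec": {
        "topologySpreadConstraints": [
            {
                "maxSkew": 1,                          # allow at most 1 executor imbalance per node
                "topologyKey": "kubernetes.io/hostname",
                "whenUnsatisfiable": "DoNotSchedule",  # hard spread; "ScheduleAnyway" makes it a soft preference
                "labelSelector": {"matchLabels": {"spark-role": "executor"}},  # label Spark puts on executor pods
            }
        ]
    },
}


def write_template(path: str = EXECUTOR_TEMPLATE_PATH) -> None:
    """Materialize the pod template so the Spark driver can hand it to the scheduler."""
    with open(path, "w") as f:
        yaml.safe_dump(executor_pod_template, f)


@task(
    task_config=Spark(
        spark_conf={
            "spark.executor.instances": "4",
            "spark.executor.cores": "4",
            "spark.executor.memory": "8g",
            # Point Spark on Kubernetes at the template above.
            "spark.kubernetes.executor.podTemplateFile": EXECUTOR_TEMPLATE_PATH,
        }
    )
)
def my_spark_job() -> None:
    # Spark session setup and the actual job would go here.
    ...
```

With `DoNotSchedule`, executors are forced onto separate nodes at the cost of possible scheduling delays; `ScheduleAnyway` keeps it a best-effort spread.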
s
@Ketan (kumare3), any idea how this can be achieved? I was reading through the solutions and came across this PR: https://github.com/apache/spark/pull/36358, but I’m not very sure.
k
@Chandramoulee K V it will autoscale. Why do you care if it is on one node today? If you add more executors, more nodes will be spun up, as long as your Kubernetes cluster is configured to allow more nodes.
c
I did try with 2 drivers and 6 executors (4 cores, 8 GB each): the 1st driver had 2 executors and the other had 4. Upon execution, it tried to accommodate all 6 executors on a single node, spawned a very big instance (36 cores CPU, 100 GB memory), and executed the task. So, to control this behaviour, I was looking for a fix that lets us schedule the executors on different nodes. Will check the limit on nodes in Kubernetes in our EKS.
We don't have a limit on the number of nodes we can spawn, but we do have a limit on the total memory and CPU of all nodes combined in Karpenter. So I guess what is happening is that, instead of spawning small nodes that take "n" executor pods per node, it is spawning one very large node to accommodate all "n" executors.
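Since Karpenter bin-packs pending pods and will pick one large instance when the allowed instance types are broad, another knob (a sketch under assumptions, not from this thread) is to restrict which instance types the provisioner may launch, so several smaller nodes get created instead. This assumes Karpenter's v1alpha5 `Provisioner` API (newer releases use `NodePool`); the provisioner name, instance types, and limits below are illustrative.

```python
# Sketch only: a Karpenter Provisioner that caps node sizes so executors land on
# several smaller nodes instead of one very large one. Field names assume the
# v1alpha5 API; values are illustrative, not taken from the thread.
import yaml

provisioner = {
    "apiVersion": "karpenter.sh/v1alpha5",
    "kind": "Provisioner",
    "metadata": {"name": "spark-executors"},
    "spec": {
        "requirements": [
            {
                # Only allow instance types sized for one or two executors (4 cores / 8 GB each),
                # so Karpenter cannot satisfy all executors with a single huge node.
                "key": "node.kubernetes.io/instance-type",
                "operator": "In",
                "values": ["m5.xlarge", "m5.2xlarge"],
            }
        ],
        # Cluster-wide cap on what this provisioner may launch in total.
        "limits": {"resources": {"cpu": "64", "memory": "256Gi"}},
        "ttlSecondsAfterEmpty": 60,  # scale empty nodes back down
    },
}

if __name__ == "__main__":
    # Emit the manifest, e.g. for `kubectl apply -f -`.
    print(yaml.safe_dump(provisioner, sort_keys=False))
```

Capping node size this way trades a little bin-packing efficiency for the horizontal spread described above: each new node only has room for one or two executor pods.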
k
If the total memory of the Spark workers exceeds the k8s node's memory, the Spark operator will create some of the workers on other k8s nodes, doesn't it?
k
I guess this is Karpenter. Cc @Haytham Abuelfutuh