# ask-the-community
g
We have a need to spin up virtual machines in the cloud that run a simulation. One option to support this use case is to write a Flyte agent that spins up the job. Another option is to write a Flyte task that spins up the job, polls for completion, and then downloads the outputs. What do we get from using a Flyte agent (option 1) that we couldn't get from a long-running task (option 2)?
s
cc @Kevin Su
y
a long-running task will take up space/resources. an agent runs as part of a pod that can run multiple agents, like a service. it’ll be up all the time (so if you have no other agents, technically it may take up more resources). the one agent will also always be up and can run multiple task instances, whereas in the long-poll case, it sounds like you’d be running one polling task for every real task.
k
yes, if your task is not running in k8s, it’s better to create an agent to submit your job to the external system.
> that we couldn’t get from a long-running task (option 2)?
what does this long-running task do? if it just sends an HTTP request to the cloud, it will waste a lot of resources because propeller launches a pod for each task.
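To make the trade-off above concrete, here is a minimal, stdlib-only sketch of the two shapes. It does not import flytekit: `FakeCloud`, `long_running_task`, `SimulationAgent`, and `run_with_agent` are all hypothetical names, and the `create`/`get` methods only loosely mirror the kind of interface an agent exposes. The point is just that option 2 ties up one worker per job for the job's whole lifetime, while one agent object can track many jobs through short, non-blocking calls.

```python
class FakeCloud:
    """Hypothetical stand-in for a cloud VM API.

    Each job reports SUCCEEDED once it has been polled `polls_to_finish` times.
    """

    def __init__(self, polls_to_finish=3):
        self.polls_to_finish = polls_to_finish
        self._polls = {}  # job_id -> number of status checks so far

    def submit(self, sim_name):
        job_id = f"job-{len(self._polls)}"
        self._polls[job_id] = 0
        return job_id

    def status(self, job_id):
        self._polls[job_id] += 1
        done = self._polls[job_id] >= self.polls_to_finish
        return "SUCCEEDED" if done else "RUNNING"


def long_running_task(cloud, sim_name):
    """Option 2: one task (one pod in Flyte) blocks until its single job is done."""
    job_id = cloud.submit(sim_name)
    polls = 0
    while True:
        polls += 1  # a real task would sleep between polls, holding its pod
        if cloud.status(job_id) == "SUCCEEDED":
            return job_id, polls


class SimulationAgent:
    """Option 1 (sketch): a long-lived service with short, non-blocking calls."""

    def __init__(self, cloud):
        self.cloud = cloud

    def create(self, sim_name):
        # Submit the job and return immediately with a handle.
        return self.cloud.submit(sim_name)

    def get(self, job_id):
        # Report the current phase; the caller decides when to ask again.
        return self.cloud.status(job_id)


def run_with_agent(agent, sim_names):
    """Drive many jobs through a single agent instance (the caller does the polling)."""
    pending = {agent.create(name) for name in sim_names}
    finished = []
    while pending:
        for job_id in sorted(pending):
            if agent.get(job_id) == "SUCCEEDED":
                finished.append(job_id)
        pending -= set(finished)
    return finished


if __name__ == "__main__":
    print(long_running_task(FakeCloud(), "sim-a"))
    print(run_with_agent(SimulationAgent(FakeCloud()), ["sim-a", "sim-b", "sim-c"]))
```

In the sketch, `run_with_agent` plays the role that the Flyte side would play when it polls the agent; no per-job pod is needed for that.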
g
How does Flyte decide when it needs another replica of the pod hosting an agent? Is there some kind of pod autoscaler deployed by default? Otherwise I worry the agent will run out of resources under heavy load (thousands of concurrent tasks).
y
not that I know of. but thousands of concurrent tasks is still pretty light I feel. I know kevin did some load testing a while back.
g
Will try and report back!
e
The agent pods run behind a k8s `Deployment`, and even though we don't ship a pod autoscaler in our helm charts, it'd be pretty simple to add one.
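As a rough illustration of what adding one could look like, the sketch below builds a Kubernetes `autoscaling/v2` `HorizontalPodAutoscaler` manifest as a Python dict and prints it as JSON (which `kubectl apply -f` accepts). The Deployment name `flyteagent`, the namespace `flyte`, the replica bounds, and the CPU target are all assumptions, not values from the Flyte helm charts; check the actual Deployment name in your cluster first. CPU-based scaling also assumes the agent container sets CPU requests and that a metrics server is installed.

```python
import json


def build_agent_hpa(
    deployment="flyteagent",  # assumed Deployment name; verify in your cluster
    namespace="flyte",        # assumed namespace
    min_replicas=1,
    max_replicas=5,
    cpu_percent=70,
):
    """Return an autoscaling/v2 HPA manifest targeting the agent Deployment."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{deployment}-hpa", "namespace": namespace},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        # Scale out when average CPU utilization exceeds the target.
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": cpu_percent,
                        },
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Pipe the output to a file and apply it with `kubectl apply -f <file>`.
    print(json.dumps(build_agent_hpa(), indent=2))
```

Whether CPU is the right signal for an agent that mostly waits on external APIs is an open question; a load test like the one mentioned above would be the way to find out.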