# flyte-support
s
Hi all! I'm new to Flyte. My workloads consist of slow-running tasks, for which Flyte is perfectly suited, as well as fast-running tasks. The fast-running tasks can still need some set-up time, however, such as loading a neural network onto a GPU (for model inference). Before moving to Flyte, we've been using an HTTP server to execute those tasks, e.g. with FastAPI. Running these as Flyte tasks seems to add unnecessary overhead, as the setup needs to be redone for each execution. What would be the best/idiomatic way of handling this? I'm currently thinking of running an HTTP server, and then running a Flyte task to make an HTTP request. This seems a little duplicative though. Thanks!
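For concreteness, here is a minimal sketch of the pattern described above: a long-lived server that pays the set-up cost once, plus a thin client function whose body could sit inside a Flyte task. It uses only the Python standard library rather than FastAPI or flytekit, and `load_model`, `InferenceHandler`, `start_server`, and `predict` are hypothetical names chosen for illustration; the "model" is a placeholder function.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Counts how often the (hypothetical) expensive setup step runs.
LOAD_COUNT = 0


def load_model():
    """Stand-in for slow setup, e.g. loading a network onto a GPU."""
    global LOAD_COUNT
    LOAD_COUNT += 1
    return lambda x: 2 * x  # placeholder "model"


# Runs once, when the server process starts, not once per request.
MODEL = load_model()


class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body and run the already-loaded model on it.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"y": MODEL(payload["x"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence per-request logging in this demo


def start_server() -> HTTPServer:
    # Port 0 asks the OS for any free port; the server runs in a daemon thread.
    server = HTTPServer(("127.0.0.1", 0), InferenceHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def predict(url: str, x: float) -> float:
    """Thin client; in Flyte, this body could live inside a task."""
    request = urllib.request.Request(
        url,
        data=json.dumps({"x": x}).encode(),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())["y"]
```

The point of the sketch is only that the set-up cost is tied to the server's lifetime, while each task execution is reduced to a single request.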
a
s
Thanks! So you'd then make a task for each HTTP endpoint, and have a single agent to dispatch the HTTP requests?
Or would you have the agent itself execute the requests (without an HTTP server)?
g
Is your task CPU bound? If so, I think it’s better to run multiple HTTP servers and use an agent to dispatch requests to them.
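A rough sketch of the dispatching idea, assuming simple round-robin over a fixed list of server URLs. This is not the Flyte agent API: `RoundRobinDispatcher` and its `send` callable are hypothetical names used for illustration, and the transport (for example an HTTP POST) is injected by the caller.

```python
import itertools
import threading
from typing import Any, Callable, Iterable


class RoundRobinDispatcher:
    """Hands each request to the next endpoint in a fixed rotation."""

    def __init__(self, endpoints: Iterable[str], send: Callable[[str, Any], Any]):
        endpoints = list(endpoints)
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self._cycle = itertools.cycle(endpoints)
        self._send = send  # hypothetical transport, e.g. a function doing an HTTP POST
        self._lock = threading.Lock()  # keeps the rotation consistent across threads

    def dispatch(self, payload: Any) -> Any:
        # Pick the next endpoint under the lock, then send outside it
        # so that slow requests do not block other callers.
        with self._lock:
            endpoint = next(self._cycle)
        return self._send(endpoint, payload)
```

Round-robin is the simplest choice here; for CPU- or GPU-bound servers, a real setup might prefer routing to the least busy server instead.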
s
Yeah, it's CPU (or GPU) bound
a
Would you do the same for long-running computations? Would you implement a separate queue for them with something like RabbitMQ, or is there a better way that keeps it within Flyte?
g
What kind of long-running task? If it’s a long-running computation, you could just use a regular Python @task to run it in a pod.
f
@silly-book-73230 if you are open to it, we at Union have built a new feature called Actors, which reuses containers, lets you pin models in memory, and can run tasks in milliseconds.
Would you be open to talking more about this?
@aloof-magazine-44547 you should definitely keep it within Flyte; we think keeping it simple makes it much, much better.
Actors are designed for exactly this use case.
s
Thanks for the pointer! That indeed looks like what we need. I think we'll evaluate Flyte first, and consider Union later