Steve Hiehn
07/22/2023, 5:59 PM
…hot containers for tasks? I suppose we could have a task call out to a pool of preheated inference VMs, but I worry we'd essentially be undermining the point of Flyte's scheduling. All thoughts are welcome, thanks!

Rahul Mehta
07/22/2023, 8:51 PM
…Deployment that can scale according to capacity (possibly w/ an HPA) and expose inference via an endpoint (either sending the inputs over the wire or passing a pointer to the inputs in some object store/db).

Steve Hiehn
07/22/2023, 9:20 PM
What's the motivation behind keeping a container "warm"?
Well, some models can take upwards of a minute or more just to load into memory. We would like to be able to dynamically schedule DAGs of operations which may include expensive inference tasks. For those expensive tasks it would be nice to reuse warm containers.

Rahul Mehta
07/22/2023, 9:27 PM

Steve Hiehn
07/22/2023, 9:33 PM

Rahul Mehta
07/22/2023, 9:52 PM

Ketan (kumare3)

Steve Hiehn
07/24/2023, 4:01 PM
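The pattern Rahul suggests above — a separately scaled Deployment that keeps models loaded, with Flyte tasks passing only a pointer to the inputs in an object store rather than the inputs themselves — could be sketched from the task side roughly as follows. Everything here (the service URL, bucket layout, and JSON shape) is an illustrative assumption, not something stated in the thread:

```python
import json
import urllib.request

# Hypothetical in-cluster Service fronting the always-warm inference
# Deployment; the real URL depends on your cluster and namespace.
INFERENCE_URL = "http://inference-svc.default.svc.cluster.local/predict"


def build_pointer_payload(bucket: str, key: str) -> bytes:
    """Build a request body that carries only a pointer to the inputs
    in the object store, instead of shipping the raw inputs over the
    wire (the second option Rahul mentions)."""
    return json.dumps({"input_uri": f"s3://{bucket}/{key}"}).encode()


def call_inference(bucket: str, key: str) -> dict:
    """What a Flyte task body might do: POST the pointer to the warm
    endpoint and return its JSON response. The model stays resident in
    the Deployment's pods, so the task never pays the model-load cost,
    and Flyte still schedules the surrounding DAG normally."""
    req = urllib.request.Request(
        INFERENCE_URL,
        data=build_pointer_payload(bucket, key),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

One design note implicit in the thread: because the heavy model lives behind the endpoint (scaled by the HPA) rather than inside the task container, the task containers stay cheap and short-lived, which sidesteps the warm-container problem without working around Flyte's scheduler.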