Steve Hiehn07/22/2023, 5:59 PM
containers for tasks? I suppose we could have a task call out to a pool of preheated inference VMs, but I worry we'd essentially be undermining the point of Flyte's scheduling. All thoughts are welcome, thanks!
Rahul Mehta07/22/2023, 8:51 PM
that can scale according to capacity (possibly w/ an HPA) and expose inference via an endpoint (either sending the inputs over the wire or passing a pointer to the inputs in some object store/db)
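A minimal sketch of the dispatch choice Rahul describes (small inputs sent inline over the wire, large inputs referenced by a pointer into an object store). The function name, the size threshold, and the URI are all hypothetical illustrations, not part of any Flyte or Kubernetes API:

```python
import json

# Hypothetical threshold: payloads larger than this are assumed to already
# live in the object store, so only a pointer (URI) is sent to the endpoint.
INLINE_LIMIT_BYTES = 256 * 1024


def build_inference_request(inputs: dict, object_store_uri: str) -> dict:
    """Decide whether to embed inputs in the request or send a pointer.

    Small inputs go directly in the request body; for large inputs we send
    only `object_store_uri`, which the inference service resolves itself.
    """
    payload = json.dumps(inputs).encode()
    if len(payload) <= INLINE_LIMIT_BYTES:
        return {"mode": "inline", "inputs": inputs}
    return {"mode": "pointer", "uri": object_store_uri}
```

A Flyte task would then POST this request body to the warm inference service's endpoint, keeping model loading out of the task container entirely.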
Steve Hiehn07/22/2023, 9:20 PM
"What's the motivation behind keeping a container 'warm'?" - Well, some models can take upwards of a minute or more just to load into memory. We would like to be able to dynamically schedule DAGs of operations which may include expensive inference tasks. For those expensive tasks it would be nice to reuse the already-warm container.
Rahul Mehta07/22/2023, 9:27 PM
Steve Hiehn07/22/2023, 9:33 PM
Rahul Mehta07/22/2023, 9:52 PM
Steve Hiehn07/24/2023, 4:01 PM