sparse-pencil-33953
07/22/2023, 5:59 PM
hot containers for tasks? I suppose we could have a task call out to a pool of preheated inference VMs, but I worry we'd essentially be undermining the point of Flyte's scheduling. All thoughts are welcome, thanks!
elegant-australia-91422
07/22/2023, 8:51 PM
Deployment that can scale according to capacity (possibly w/ an HPA) and expose inference via an endpoint (either sending the inputs over the wire or passing a pointer to the inputs in some object store/db)
elegant-australia-91422
07/22/2023, 8:53 PM
sparse-pencil-33953
07/22/2023, 9:20 PM
What's the motivation behind keeping a container "warm"?
- Well, some models can take a minute or more just to load into memory. We would like to be able to dynamically schedule DAGs of operations, which may include expensive inference tasks. For those expensive tasks it would be nice to reuse warm containers.
elegant-australia-91422
07/22/2023, 9:27 PM
sparse-pencil-33953
07/22/2023, 9:33 PM
elegant-australia-91422
07/22/2023, 9:52 PM
elegant-australia-91422
07/22/2023, 9:53 PM
freezing-airport-6809
sparse-pencil-33953
07/24/2023, 4:01 PM