Does running the Flyte control plane on spot instances raise any concerns? Is anyone running their control plane on spot instances, and has it been a source of any difficulties?
Ketan (kumare3)
10/26/2022, 8:12 PM
Performance and potential liveness/progress concerns. But not too bad. Preferably put it on reserved machines.
Andrew Achkar
10/26/2022, 8:24 PM
Ok thanks.
Katrina P
10/26/2022, 8:27 PM
We are, but haven't had many issues. I've definitely been considering moving the control plane off spot instances and keeping only the workloads on spot. Long term I think the control plane should not be on spot instances, but we're trying to save some $$ 😅
Niels Bantilan
10/26/2022, 8:41 PM
It’s probably also a little more code to use intratask checkpoints in cases like long-running model training.
No one asked, but here’s a more real-world ML example of using this feature: link 🙃
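For anyone skimming, here is a rough sketch of the resume-from-checkpoint pattern being described, not the linked example. It assumes a checkpoint object with `read()` and `write()` methods that take and return bytes; as far as I know flytekit exposes one at `current_context().checkpoint`, but check the docs before relying on that. `InMemoryCheckpoint`, `train`, and `fail_at` are hypothetical names made up for illustration so the snippet runs without a Flyte cluster.

```python
import json
from typing import Optional


class InMemoryCheckpoint:
    """Hypothetical stand-in for a checkpoint object with read()/write()."""

    def __init__(self) -> None:
        self._data: Optional[bytes] = None

    def read(self) -> Optional[bytes]:
        # Returns None when nothing has been checkpointed yet.
        return self._data

    def write(self, data: bytes) -> None:
        self._data = data


def train(cp, n_epochs: int, fail_at: Optional[int] = None) -> float:
    """Toy training loop that resumes from the last saved epoch."""
    prev = cp.read()
    # Start fresh, or pick up where the interrupted attempt left off.
    state = json.loads(prev.decode()) if prev else {"epoch": 0, "loss": 1.0}
    for epoch in range(state["epoch"], n_epochs):
        if fail_at is not None and epoch == fail_at:
            # Simulates the spot node being reclaimed mid-task.
            raise RuntimeError("simulated spot interruption")
        # Placeholder for a real training step: halve the loss each epoch.
        state = {"epoch": epoch + 1, "loss": state["loss"] * 0.5}
        # Persist progress after every completed epoch.
        cp.write(json.dumps(state).encode())
    return state["loss"]
```

The first attempt can die partway through, and a retry that gets the same checkpoint continues from the last completed epoch instead of starting over. The extra code is mostly the read at the top and the write at the end of each epoch.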