Does running the Flyte control plane on spot instances raise any concerns? Is anyone running their control plane on spot instances, and has this been a source of any difficulties?
freezing-airport-6809
10/26/2022, 8:12 PM
Performance
freezing-airport-6809
10/26/2022, 8:22 PM
and potential liveness / progress concerns. But not too bad. Preferably put it on reserved machines.
thankful-dress-89577
10/26/2022, 8:24 PM
Ok thanks.
limited-dog-47035
10/26/2022, 8:27 PM
We are, and haven't had many issues. I've definitely been considering moving the control plane off spot instances and keeping only the workloads on spot. Long term I think the control plane should not be on spot instances, but we're trying to save some $$ 😅
broad-monitor-993
10/26/2022, 8:41 PM
It’s probably also a little more code to use intratask checkpoints in the case of, e.g., long-running model training.
No one asked, but here’s a more real-world ML example of using this feature: link 🙃
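For anyone wondering what the "little more code" looks like in principle, here is a minimal, library-free sketch of the checkpoint-and-resume pattern. It is not Flyte's API: `train`, `Preempted`, `demo`, and the JSON file layout are all hypothetical names made up for illustration. As far as I know, flytekit exposes its own checkpoint object on the task context (`current_context().checkpoint`), and a real task would read and write through that instead of a local file.

```python
import json
import os
import tempfile


class Preempted(Exception):
    """Hypothetical stand-in for a spot instance being reclaimed mid-task."""


def train(total_steps, ckpt_path, preempt_at=None):
    """Run a toy training loop that resumes from ckpt_path if it exists."""
    state = {"step": 0, "loss": 100.0}
    if os.path.exists(ckpt_path):
        # A previous attempt got part of the way: pick up where it stopped.
        with open(ckpt_path) as f:
            state = json.load(f)

    for step in range(state["step"], total_steps):
        if preempt_at is not None and step == preempt_at:
            raise Preempted(f"preempted before step {step}")

        state["loss"] *= 0.5  # placeholder for one real training step
        state["step"] = step + 1

        # Write to a temp file and rename, so a crash mid-write
        # never leaves a half-written checkpoint behind.
        tmp_path = ckpt_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, ckpt_path)

    return state


def demo():
    """Simulate one preemption followed by a successful retry."""
    ckpt_path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    try:
        train(10, ckpt_path, preempt_at=6)  # first attempt dies at step 6
    except Preempted:
        pass
    return train(10, ckpt_path)  # retry resumes at step 6, not step 0


if __name__ == "__main__":
    print(demo())
```

The retry only redoes the steps after the last saved checkpoint, which is the whole point when the workload (rather than the control plane) sits on spot capacity.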