Hi, this may be a newbie question, but should I be...
# ask-the-community
Hi, this may be a newbie question, but should I be using burstable instances for Flyte workflows or should I avoid them? Since ML/AI workflows typically run for a short duration, I’m wondering if I should avoid burstable instances?
ML/AI workflows can actually run for days, right?
Right - but it runs at full capacity for the whole time so I figured maybe burstable instances are not well suited? Correct me if I’m wrong
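To make the "full capacity the whole time" concern concrete, here is a rough sketch of how a CPU credit balance drains under sustained load. It assumes the AWS-style definition that one credit is one vCPU at 100% for one minute. The balance and earn rate in the example call are illustrative assumptions, not numbers from this thread, so check them against your actual instance type.

```python
def hours_until_throttled(balance, earned_per_hour, vcpus, utilization):
    """Estimate hours until a burstable instance exhausts its CPU credits.

    Assumes 1 credit = 1 vCPU at 100% for 1 minute, so a fully busy vCPU
    spends 60 credits per hour. All inputs are caller-supplied assumptions.
    """
    spent_per_hour = vcpus * utilization * 60
    net_drain = spent_per_hour - earned_per_hour
    if net_drain <= 0:
        # Usage is at or below baseline: credits never run out.
        return float("inf")
    return balance / net_drain


# Illustrative numbers only: 576 banked credits, 24 earned per hour,
# 2 vCPUs pinned at 100%.
print(hours_until_throttled(576, 24, 2, 1.0))
```

Once the balance hits zero the instance either drops to its baseline performance or, in an unlimited-style mode, starts accruing extra charges, which is why a job that pins the CPU for hours is a poor fit.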
no harm in using burstable probably. it’ll just run slower than you might be comfortable with. could consider using spot instances with checkpointing for likely better savings.
Right that makes sense - thanks
Actually it looks like that might not be the case - see this post if you’re interested: https://blog.coiled.io/blog/burstable-vs-nonburstable.html
“Coiled runs a suite of Python/Dask benchmarks on AWS. This collection of ~100 benchmarks cover common data science, data engineering, and machine learning workloads. A typical run takes around 150 instance-hours (though as we’ll see, this depends on the choice of instance type). When we switched from burstable to non-burstable instance types, we saw a significant reduction in cost. For instance, comparing two individual runs, we saw costs go from $7.13 (burstable) to $3.89 (non-burstable).”
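For a sense of scale, the two costs quoted from the Coiled post work out as follows. This is just arithmetic on the two figures above (a single pair of runs), and the function name is made up for illustration.

```python
def savings_percent(burstable_cost, non_burstable_cost):
    """Percentage saved by switching from burstable to non-burstable."""
    return 100 * (burstable_cost - non_burstable_cost) / burstable_cost


# Figures quoted from the Coiled post: $7.13 (burstable) vs $3.89 (non-burstable).
pct = savings_percent(7.13, 3.89)
print(f"saved ${7.13 - 3.89:.2f} per run ({pct:.1f}%)")
```

That is roughly a 45% reduction for that particular comparison.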
seems reasonable
but it’s probably workload-specific. spot + checkpointing will still give you good savings though, assuming you can do it correctly
Dask and Flyte are very different in terms of how they use the machines. But yeah, if we want max compute performance, it might help for sustained compute requirements
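A minimal sketch of what "doing it correctly" can look like: save progress after each unit of work, write the checkpoint atomically, and resume from it after an interruption. This is plain Python with a simulated interruption. Flyte has its own mechanisms for interruptible tasks and checkpoints, which this sketch deliberately does not use; the function names and the JSON file layout are hypothetical.

```python
import json
import os
import tempfile


def train(total_steps, checkpoint_path, interrupt_at=None):
    """Toy loop that resumes from checkpoint_path if it exists."""
    state = {"step": 0, "total": 0}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)

    while state["step"] < total_steps:
        if interrupt_at is not None and state["step"] == interrupt_at:
            # Stand-in for a spot instance being reclaimed mid-run.
            raise RuntimeError("simulated spot interruption")
        state["total"] += state["step"]  # the "work" for this step
        state["step"] += 1
        # Write to a temp file, then rename, so a crash never leaves
        # a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["total"]


def run_with_retry(total_steps, interrupt_at):
    """Run once with an interruption, then retry from the checkpoint."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "ckpt.json")
        try:
            train(total_steps, path, interrupt_at=interrupt_at)
        except RuntimeError:
            pass  # in practice the orchestrator would reschedule the task
        return train(total_steps, path)


print(run_with_retry(10, interrupt_at=4))
```

The retried run picks up at step 4 instead of step 0, so the interruption only costs the work since the last checkpoint.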
Are there any benchmarks from Flyte?
For Compute - nope
We are an orchestrator so we benchmark number of tasks and workflows
Right - so given similar data/ML workloads, wouldn’t Dask’s benchmark results still be roughly the same on Flyte?
Or am I missing something?
Might be, have not read it all