Hi, this may be a newbie question, but should I be...
# ask-the-community
s
Hi, this may be a newbie question, but should I be using burstable instances for Flyte workflows, or should I avoid them? Since ML/AI workflows typically run for a short duration, I’m wondering if I should avoid burstable instances?
k
ML/AI workflows can actually run for days, right?
s
Right - but they run at full capacity the whole time, so I figured burstable instances may not be well suited? Correct me if I’m wrong
j
no harm in using burstable probably. it’ll just run slower than you might be comfortable with. could consider using spot instances with checkpointing for likely better savings.
s
Right that makes sense - thanks
Actually it looks like that might not be the case - see this post if you’re interested: https://blog.coiled.io/blog/burstable-vs-nonburstable.html
“Coiled runs a suite of Python/Dask benchmarks on AWS. This collection of ~100 benchmarks cover common data science, data engineering, and machine learning workloads. A typical run takes around 150 instance-hours (though as we’ll see, this depends on the choice of instance type). When we switched from burstable to non-burstable instance types, we saw a significant reduction in cost. For instance, comparing two individual runs, we saw costs go from $7.13 (burstable) to $3.89 (non-burstable).”
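To put the two quoted figures in perspective, here is a quick arithmetic sketch. It uses only the two per-run costs from the quote above; the function name `relative_savings` is made up for illustration, and the result says nothing about workloads other than the ones Coiled benchmarked.

```python
# Per-run costs quoted from the Coiled post (USD).
burstable_cost = 7.13
non_burstable_cost = 3.89


def relative_savings(before: float, after: float) -> float:
    """Fraction of the original cost saved by switching (hypothetical helper)."""
    return (before - after) / before


savings = relative_savings(burstable_cost, non_burstable_cost)
ratio = burstable_cost / non_burstable_cost

print(f"savings from switching: {savings:.1%}")
print(f"burstable run cost {ratio:.2f}x the non-burstable run")
```

For these two runs that works out to roughly 45% cheaper on non-burstable instances.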
j
seems reasonable
but it’s probably workload-specific. spot + checkpointing will still give you good savings though, assuming you can do it correctly
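For readers unfamiliar with the "spot + checkpointing" idea, a minimal plain-Python sketch of the pattern follows. It does not use Flyte's own interruptible-task or checkpoint features; the names `Preempted`, `train`, and `run_with_retries` are hypothetical, and the "work" is a placeholder sum. The point is only that progress is saved after each step, so a retry after a spot reclaim resumes where the last run stopped instead of starting over.

```python
import json
import os
import tempfile


class Preempted(Exception):
    """Stands in for a spot instance being reclaimed mid-run (hypothetical)."""


def train(total_steps, ckpt_path, preempt_at=None):
    # Resume from the last saved step if a checkpoint exists.
    start, acc = 0, 0
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)
        start, acc = state["step"], state["acc"]

    for step in range(start, total_steps):
        if preempt_at is not None and step == preempt_at:
            raise Preempted(f"interrupted at step {step}")
        acc += step  # placeholder for one unit of real work
        # Write to a temp file, then rename, so a kill mid-write
        # cannot leave a half-written checkpoint behind.
        tmp = ckpt_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"step": step + 1, "acc": acc}, f)
        os.replace(tmp, ckpt_path)
    return acc


def run_with_retries(total_steps, ckpt_path, preempt_at, retries=3):
    # Only the first attempt is interrupted in this simulation.
    for attempt in range(retries + 1):
        try:
            return train(total_steps, ckpt_path,
                         preempt_at if attempt == 0 else None)
        except Preempted:
            continue
    raise RuntimeError("out of retries")


with tempfile.TemporaryDirectory() as d:
    result = run_with_retries(10, os.path.join(d, "ckpt.json"), preempt_at=4)
    print(result)
```

In a real setup the retry loop would normally be the orchestrator's job rather than the task's, and the checkpoint would live in durable storage rather than on the instance's local disk.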
s
Right
k
Dask and Flyte are very different in terms of how they use the machines. But yeah, if we want max compute performance, non-burstable might help for sustained compute requirements
s
Are there any benchmarks from Flyte?
k
For Compute - nope
We are an orchestrator so we benchmark number of tasks and workflows
s
Right - so given similar data/ML workloads, wouldn’t Dask’s benchmark results still be roughly the same on Flyte?
Or am I missing something?
k
Might be, have not read it all