Are there any examples of Flyte scheduling workloads on CoiledCloud?
03/30/2022, 8:03 PM
@Mike Ossareh we don't have that integration today. Adding a new integration in Python should be simple, or you can even just do it in your own Python code.
We could potentially also work on a backend integration, if that would help in the future.
03/30/2022, 9:52 PM
@Ketan (kumare3) good to know. From our point of view, CoiledCloud has a nice ability to scale NumPy arrays and pandas DataFrames across multiple nodes. As we migrate our workloads over to Flyte, we're generally considering two paths:
• migrate them as they are
• rework them as part of the migration
Some aspects of our workloads are... "classic" in their computational model (single node, multiple cores), which leads them to implement certain memmapping patterns that have a large downstream impact in terms of assumptions about the data.
We like the "modern" approach that Flyte encourages, i.e. directories full of files, with Flyte handling how they're allocated to pods to produce parallelism etc., vs. one big pickle file.
It's within this "classic" vs. "modern" framework (I've just chosen these two opposing terms, I'm not sure they really fit tbh) that we're considering reworking some parts of our pipeline as we migrate.
It's probably obvious, but I just wanted to point out: CoiledCloud is hosted Dask (https://dask.org/). They handle the provisioning of compute nodes in response to Dask-powered workloads, etc.
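To make the "directory full of files" pattern concrete, here is a minimal standard-library sketch. It is not Flyte's API: the thread pool only stands in for the pods an orchestrator would schedule, and the function names (`write_shards`, `process_shard`, `process_directory`) and the JSON shard format are assumptions made up for illustration.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def write_shards(records, out_dir, shard_size):
    """Split records into fixed-size shards, one JSON file per shard."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(records), shard_size):
        path = out_dir / f"shard-{start // shard_size:04d}.json"
        path.write_text(json.dumps(records[start:start + shard_size]))
        paths.append(path)
    return paths


def process_shard(path):
    """Per-file unit of work (here just a sum); reads only its own shard."""
    return sum(json.loads(Path(path).read_text()))


def process_directory(in_dir, max_workers=4):
    """Fan out over every shard in the directory, then combine the partial results."""
    shards = sorted(Path(in_dir).glob("shard-*.json"))
    # The thread pool is a local stand-in for per-pod parallelism.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        partials = list(pool.map(process_shard, shards))
    return sum(partials)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        write_shards(list(range(100)), tmp, shard_size=10)
        print(process_directory(tmp))
```

Because each shard is read independently, no worker needs the whole dataset in memory, which is the assumption a single memmapped or pickled file tends to bake in.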
03/30/2022, 10:14 PM
ya we would love to actually integrate with them
some other folks asked about it - cc @Bernhard Stadlbauer
if you would like to help with it, we would love to partner up
we especially need help with the UX
Flyte can also orchestrate Dask clusters on K8s directly
and thus users should be able to switch between Coiled and Dask on K8s (native)
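Until such an integration exists, the switch can be approximated in plain Python. The sketch below is an assumption-laden illustration, not a Flyte plugin: the `dask_client` helper, its `backend` values, and the default cluster name are hypothetical, and it assumes the `coiled`, `dask-kubernetes` (operator-based `KubeCluster`), and `distributed` packages are installed for whichever backend is selected.

```python
from contextlib import contextmanager


@contextmanager
def dask_client(backend, n_workers=2, name="flyte-dask-demo"):
    """Yield a dask.distributed.Client backed by the chosen cluster type.

    Hypothetical helper: `backend` is one of "coiled", "k8s", or "local".
    Imports are deferred so only the selected backend's package is required.
    """
    if backend == "coiled":
        import coiled  # hosted Dask; requires a Coiled account

        cluster = coiled.Cluster(name=name, n_workers=n_workers)
    elif backend == "k8s":
        # Requires the Dask Kubernetes operator to be installed in the cluster.
        from dask_kubernetes.operator import KubeCluster

        cluster = KubeCluster(name=name, n_workers=n_workers)
    elif backend == "local":
        from dask.distributed import LocalCluster

        cluster = LocalCluster(n_workers=n_workers)
    else:
        raise ValueError(f"unknown Dask backend: {backend!r}")

    from dask.distributed import Client

    client = Client(cluster)
    try:
        yield client
    finally:
        # Shut down the client first, then release the cluster's resources.
        client.close()
        cluster.close()
```

A function called from inside a Flyte task could then wrap its Dask work in `with dask_client("coiled") as client:` and change only the backend string to move to Dask on K8s.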