@freezing-airport-6809 good to know. From our point of view, CoiledCloud has a nice ability to scale numpy arrays and pandas DataFrames across multiple nodes. As we migrate our workloads over to flyte, we're generally considering two paths:
• migrate them as they are
• rework them as part of the migration
Some of our workloads are... "classic" in their computational model (single node, multiple cores), so they implement certain memmapping patterns, and those patterns have a large impact downstream in terms of assumptions about the data.
We like the "modern" approach that flyte encourages - i.e. directories full of files, with flyte handling how they're allocated to pods to produce parallelism etc., vs one big pickle file.
It's in this "classic" vs "modern" framework (I've just chosen these two opposing labels, I'm not sure they really fit tbh) that we're considering reworking some parts of our pipeline as we migrate.
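
To make the "directory full of files" shape a bit more concrete, here's a rough sketch of what I mean. It's stdlib-only and doesn't use flytekit at all: `write_shards`, `process_shard` and `process_directory` are names I've made up for illustration, and the plain `map` is just a stand-in for whatever actually allocates files to pods.

```python
import pickle
import tempfile
from pathlib import Path


def write_shards(records, out_dir, shard_size):
    """Split one in-memory dataset into a directory of small pickle files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for start in range(0, len(records), shard_size):
        path = out_dir / f"part-{start // shard_size:05d}.pkl"
        with path.open("wb") as f:
            pickle.dump(records[start:start + shard_size], f)
        paths.append(path)
    return paths


def process_shard(path):
    """Per-file work. It only needs its own file, so it could run anywhere."""
    with Path(path).open("rb") as f:
        chunk = pickle.load(f)
    return sum(chunk)


def process_directory(in_dir):
    """Fan out over the files, then reduce the partial results.

    A plain map stands in for the orchestrator here (assumption: in a real
    pipeline each file would be handed to a separate worker or pod).
    """
    partials = map(process_shard, sorted(Path(in_dir).glob("part-*.pkl")))
    return sum(partials)


with tempfile.TemporaryDirectory() as tmp:
    records = list(range(1000))
    shards = write_shards(records, tmp, shard_size=100)
    n_shards = len(shards)
    total = process_directory(tmp)
```

The point is that `process_shard` makes no assumption about the whole dataset being addressable at once, which is exactly the assumption our memmapped "classic" code bakes in.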