Martin Hwasser
08/22/2022, 1:21 PM
Ketan (kumare3)
Martin Hwasser
08/22/2022, 2:19 PM
Ketan (kumare3)
Niels Bantilan
08/22/2022, 2:36 PM
“three important pieces which are experiment tracking/model registry/data versioning.”
I think for experiment tracking we’re planning on relying on integrations with other libraries in the ecosystem (e.g. MLflow). Re: model registry, data versioning, and data lineage, what are your requirements? Flyte basically tracks all artifacts at the interface of tasks and workflows (in addition to all dependencies in the execution graph), including models and data, but I’d be interested in chatting to understand what your needs are.
Yee
Fredrik
08/22/2022, 4:35 PM
Martin Hwasser
08/22/2022, 6:03 PM
“I think for experiment tracking we’re planning on relying on integrations with other libraries in the ecosystem (e.g. mlflow)”
That makes sense @Niels Bantilan. For the tracking, we mostly need rudimentary features (e.g. graphs). For the model registry, we want a centralized, thin abstraction layer in front of blob storage (e.g. S3) that can reference arbitrary models and metadata. MLflow suits our needs pretty well here. When it comes to data lineage, we’re currently using DVC, but I’m looking for alternatives, as using git to track revisions isn’t great when domain experts need to be able to manipulate the datasets (fix incorrect labels, etc.).
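For illustration only, here is a rough sketch of what a "thin abstraction layer in front of blob storage" could look like. This is not MLflow's or Flyte's API; the names `ModelRegistry`, `ModelVersion`, `register`, and `get`, as well as the example bucket path, are hypothetical. The registry only stores references (URIs) and metadata in memory and never touches the blobs themselves.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ModelVersion:
    """A reference to a model artifact; the artifact itself stays in blob storage."""
    name: str
    version: int
    uri: str  # e.g. an s3:// URI; hypothetical, never dereferenced here
    metadata: Dict[str, Any] = field(default_factory=dict)


class ModelRegistry:
    """Hypothetical in-memory registry mapping model names to versioned URIs."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[ModelVersion]] = {}

    def register(self, name: str, uri: str,
                 metadata: Optional[Dict[str, Any]] = None) -> ModelVersion:
        # Versions are assigned sequentially per model name, starting at 1.
        versions = self._versions.setdefault(name, [])
        entry = ModelVersion(name, len(versions) + 1, uri, dict(metadata or {}))
        versions.append(entry)
        return entry

    def get(self, name: str, version: Optional[int] = None) -> ModelVersion:
        # With no explicit version, return the most recently registered one.
        versions = self._versions[name]
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{name} has no version {version}")
        return versions[version - 1]


registry = ModelRegistry()
registry.register("classifier", "s3://example-bucket/classifier/1/model.pkl",
                  {"accuracy": 0.91})
registry.register("classifier", "s3://example-bucket/classifier/2/model.pkl",
                  {"accuracy": 0.93})
latest = registry.get("classifier")
print(latest.version, latest.uri)
```

A real setup would persist these records in a database or delegate to an existing registry; the point of the sketch is only that the layer holds pointers and metadata, not model bytes.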
Yee
Martin Hwasser
08/22/2022, 6:12 PM
Yee
Martin Hwasser
08/22/2022, 6:16 PM
Yee
Martin Hwasser
08/22/2022, 6:19 PM
Yee
Fredrik
08/22/2022, 6:19 PM
Yee
Fredrik
08/22/2022, 6:21 PM
Yee