I am evaluating Flyte for data engineering. When i...
# ask-the-community
f
I am evaluating Flyte for data engineering. When is Flyte not a good choice (compared to Dagster - another candidate)? The data science group uses Metaflow and they are actively advocating for us to adopt it too. Is Flyte comparable to data science tools like Metaflow apart from the common workflow orchestration bit?
k
It is absolutely comparable to Metaflow and is probably better at ML-related tasks than Metaflow. It's definitely more on the ML side than data engineering.
For example, take a look at the PyTorch, ONNX, and TensorFlow integrations, and also data frames. Also map tasks, Spark, etc. It has an opinionated Docker build system for folks who want to use it, it maintains strong reproducibility, and it is fast to iterate on. We would love to know from the data science team if they think anything in Flyte can be improved for them.
I think the community would be happy to chime in here
f
Are you able to share more specifics? I can't go back to my team with high-level generalizations like "absolutely comparable and probably better", that would look quite stupid on my part 🙂 They were on Kubeflow before which had these integrations too, but they never worked well in practice.
To the original question - is it not recommended to use Flyte for ETL use cases then?
k
Sadly, Kubeflow made it bad for everyone.
d
@foo bar This is an example implementation of ETL using Flyte: https://docs.flyte.org/projects/cookbook/en/latest/getting_started/data_engineering.html#data-engineering. It could probably give you an idea.
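For a rough feel of the shape without opening the link, here is a minimal sketch of an extract/transform/load pipeline written with flytekit's `task` and `workflow` decorators. It is not the code from the linked docs page: the function names, the hard-coded sample values, and the `min_value` parameter are all made up for illustration, and the fallback decorators are only there so the sketch also runs where flytekit is not installed.

```python
from typing import List

try:
    from flytekit import task, workflow
except ImportError:
    # Assumption for illustration only: without flytekit installed,
    # fall back to no-op decorators so the sketch runs as plain Python.
    def task(fn):
        return fn

    workflow = task


@task
def extract() -> List[float]:
    # Hypothetical source: a real pipeline would read from a database or files.
    return [3.0, -1.0, 4.0, -5.0, 9.0]


@task
def transform(values: List[float], min_value: float) -> List[float]:
    # Drop readings below the threshold, then scale the rest.
    return [v * 2.0 for v in values if v >= min_value]


@task
def load(values: List[float]) -> float:
    # Hypothetical sink: return a total instead of writing to a warehouse.
    return sum(values)


@workflow
def etl_wf(min_value: float = 0.0) -> float:
    # Tasks inside a workflow are chained with keyword arguments only.
    raw = extract()
    cleaned = transform(values=raw, min_value=min_value)
    return load(values=cleaned)


if __name__ == "__main__":
    # Workflows can be called locally like ordinary functions.
    print(etl_wf(min_value=0.0))
```

Each task declares typed inputs and outputs, and the workflow body only wires tasks together, so the same file can be run locally for a quick check before registering it on a cluster.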