Hey! Has anyone gotten apache-beam to run on Flyte...
# announcements
e
Hey! Has anyone gotten apache-beam to run on Flyte natively? I was going to explore this a bit, but I wanted to see if anyone here has thought through this.
It might be possible to use the Spark components to run Apache Beam instead of writing a custom backend.
k
Yeah, but we have not explored it. Would love to understand the use case.
e
Apache Beam is really good for model inference if you are using Python ML packages. The SDK is in Python, so there are no annoying Spark Java errors, no special Python UDFs, and no Spark transformers to build. I want to use it to batch score models in parallel using this https://beam.apache.org/documentation/sdks/python-machine-learning/. Also, using GCP's Dataflow is cool because the autoscaling features are incredible: both vertical and horizontal scaling. Plus you can use GPUs.
I will see if I can run it using the spark backend…would be so cool if it just kind of worked
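For readers following along, here is a minimal sketch of the kind of batch-scoring pipeline described above. It is only an illustration under stated assumptions: `score_batch` and `run` are hypothetical names, the "model" is a stand-in linear function, and the sketch uses Beam's plain `BatchElements` transform rather than the RunInference API from the linked page. Whether the Spark runner works on Flyte's Spark setup is exactly the open question in this thread and has not been tested here.

```python
from typing import Iterable, List, Optional, Tuple


def score_batch(batch: List[float]) -> Iterable[Tuple[float, float]]:
    """Stand-in for something like model.predict(batch)."""
    # Hypothetical "model": a fixed linear function, for illustration only.
    for x in batch:
        yield x, 2.0 * x + 1.0


def run(inputs: List[float], argv: Optional[List[str]] = None) -> None:
    # Imported here so score_batch can be exercised without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # With no arguments Beam falls back to the local DirectRunner.
    # Passing e.g. ["--runner=SparkRunner"] selects Beam's Spark runner
    # instead (assumption: a reachable Spark environment is available).
    options = PipelineOptions(argv or [])
    with beam.Pipeline(options=options) as p:
        (
            p
            | "Create" >> beam.Create(inputs)
            # Group single elements into lists so the model sees batches.
            | "Batch" >> beam.BatchElements(min_batch_size=2, max_batch_size=64)
            | "Score" >> beam.FlatMap(score_batch)
            | "Print" >> beam.Map(print)
        )
```

Calling `run([0.0, 1.0, 2.0])` would execute the pipeline locally. Only the runner argument changes when targeting Spark, which is why the Spark route looks attractive compared with writing a custom backend.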
k
With the Spark backend it should just work. Lmk. But I want to understand what you mean by a Flyte backend for Beam.
e
I just meant a way to spin up and run Apache Beam on Flyte.
Similar to how Spark works
But Beam. If Spark doesn't work, then I know people use Flink, which might support more features around streaming, but I don't really need them…maybe one day!
k
Actually, Flink is already supported. Cc @Filipe Regadas / @Babis Kiosidis how can we upstream https://github.com/spotify/flyte-flink-plugin