    Evan Sadler

    2 weeks ago
    Hey! Has anyone gotten Apache Beam to run on Flyte natively? I was going to explore this a bit, but first I wanted to see if anyone here has already thought it through.
    It might be possible to run Apache Beam through the Spark components instead of writing a custom backend.
    Ketan (kumare3)

    2 weeks ago
    ya
    but we have not explored it. Would love to understand the use case.
    Evan Sadler

    2 weeks ago
    Apache Beam is really good for model inference if you are using Python ML packages. The SDK is in Python, so there are no annoying Spark Java errors, and no need for special Python UDFs or custom Spark transformers. I want to use it to batch score models in parallel using this https://beam.apache.org/documentation/sdks/python-machine-learning/. Also, using GCP's Dataflow is cool because its autoscaling features are incredible: both vertical and horizontal scaling. Plus you can use GPUs.
    I will see if I can run it using the Spark backend… would be so cool if it just kind of worked.
    Ketan (kumare3)

    2 weeks ago
    With the Spark backend it should just work. Lmk. But I want to understand what you mean by a Flyte backend for Beam.
    Evan Sadler

    2 weeks ago
    I just meant a way to spin up and run Apache Beam on Flyte,
    similar to how Spark works,
    but for Beam. If Spark doesn't work, I know people use Flink, which might support more streaming features, but I don't really need them… maybe one day!
    Ketan (kumare3)

    2 weeks ago
    Actually, Flink is already supported. Cc @Filipe Regadas / @Babis Kiosidis: how can we upstream https://github.com/spotify/flyte-flink-plugin?