Hey Everyone, I'm trying to create workflows around video pro Flyte #flyte-support

eager-butcher-27719

06/06/2022, 7:07 PM

Hey Everyone! I'm trying to create workflows around video processing: splitting up the data into chunks, processing frames with ML, then coalescing the information back together. There are different stages of extracting information from the video/images in parallel. Previously, our system relied on Google Cloud Functions to split up the processing, plus a worker that would keep track of all the processes, which was a bit verbose. I'm trying to simplify it with Flyte, but I'm unsure where to get started. Any tips would be appreciated. Thanks!

glamorous-carpet-83516

06/06/2022, 7:27 PM

I think you can try using a map task in Flyte. Flyte can launch a pod for each chunk of data and run the preprocessing in parallel: https://docs.flyte.org/projects/cookbook/en/latest/auto/core/control_flow/map_task.html#sphx-glr-auto-core-control-flow-map-task-py

freezing-airport-6809

06/06/2022, 7:56 PM

Or you can use dynamic workflows.

freezing-airport-6809

06/06/2022, 7:57 PM

Docs are here: https://docs.flyte.org/projects/cookbook/en/latest/auto/core/control_flow/dynamics.html#sphx-glr-auto-core-control-flow-dynamics-py

freezing-airport-6809

06/06/2022, 7:58 PM

One problem: simply splitting the video so that each task processes one frame might be too expensive for Flyte (at the moment; more is coming later). This is because it spawns a new pod for every task execution today.

eager-butcher-27719

06/06/2022, 8:42 PM

how long does spawning up new pods take?

eager-butcher-27719

06/06/2022, 8:42 PM

for our processing, we currently just batch up the video, so a chunk of video will be processed in a "task"

freezing-airport-6809

06/06/2022, 10:49 PM

How long spawning new pods takes depends on a few things: the network, the size of the containers, and IP addresses.

freezing-airport-6809

06/06/2022, 10:49 PM

So the lowest I have seen is 1-2 seconds.

eager-butcher-27719

06/06/2022, 10:52 PM

Is it linear? So if I'm spinning up a couple thousand parallel tasks, will it take super long?

freezing-airport-6809

06/06/2022, 10:53 PM

Hmm, no, it amortizes.

freezing-airport-6809

06/06/2022, 10:53 PM

But a couple thousand may take time.

freezing-airport-6809

06/06/2022, 10:53 PM

What is the end goal?

freezing-airport-6809

06/06/2022, 10:53 PM

How fast do you want the end-to-end processing to be?

eager-butcher-27719

06/06/2022, 10:55 PM

I mean first goal (which is the primary motivation for me switching) is by far reliability. With cloud functions, messages in the queue sometimes get dropped, or take super long to get acknowledged, and a friend of mine recommended Flyte so I thought I'd look into it. Currently I've been having to handle a lot of edge cases where data is processed multiple times or dropped entirely

freezing-airport-6809

06/06/2022, 10:55 PM

Ohh, so reliability will be so much better. Also, with caching and recovery you will see much better results.

freezing-airport-6809

06/06/2022, 10:55 PM

Let me ping you.

eager-butcher-27719

06/06/2022, 10:56 PM

For end-to-end, it would be cool to have an essentially logarithmic processing time for video. We'd love to have a 60-second video process in 60 seconds, and then scale logarithmically afterwards.

freezing-airport-6809

06/06/2022, 10:57 PM

I think that can be done, if you chunk the video correctly.

freezing-airport-6809

06/06/2022, 10:57 PM

So for Flyte, the larger the chunk size, the better.
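To make the chunk-size point concrete, here is a back-of-the-envelope sketch in plain Python (no Flyte calls). The function names are hypothetical, and the 2-second pod startup default is only an assumption taken from the "1-2 seconds" figure mentioned earlier in the thread; real startup times will vary.

```python
import math
from typing import List, Tuple


def plan_chunks(video_seconds: float, chunk_seconds: float) -> List[Tuple[float, float]]:
    # Split [0, video_seconds) into consecutive (start, end) ranges; the
    # last range is shortened so it does not run past the end of the video.
    count = math.ceil(video_seconds / chunk_seconds)
    return [
        (i * chunk_seconds, min((i + 1) * chunk_seconds, video_seconds))
        for i in range(count)
    ]


def startup_overhead_fraction(chunk_work_seconds: float,
                              pod_startup_seconds: float = 2.0) -> float:
    # Fraction of a task's wall-clock time spent waiting for its pod,
    # assuming a fixed startup cost per task (an assumption, not a
    # measured Flyte number).
    return pod_startup_seconds / (pod_startup_seconds + chunk_work_seconds)
```

Under these assumptions, a task that does 2 seconds of work spends half its time on pod startup, while one that does 18 seconds of work spends about a tenth, which is why larger chunks amortize the per-pod cost better.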
