Hi, I heard that offline batch inference is not recommended in Flyte, and that it's better to deploy the model to an endpoint and send requests there. Is that true?
freezing-airport-6809
06/25/2025, 5:06 PM
where did you hear this from
freezing-airport-6809
06/25/2025, 5:06 PM
this is in fact one of the biggest uses of flyte in the world today
freezing-airport-6809
06/25/2025, 5:06 PM
in fact you should not deploy an endpoint
steep-nest-3156
06/25/2025, 5:46 PM
Great, thank you!
It popped up in our internal conversations, maybe a misunderstanding of something. And I couldn't google any direct answer.
Just to be completely clear for our internal discussion. Even if the model requires GPU, right?
freezing-airport-6809
06/25/2025, 8:07 PM
that is indeed true
freezing-airport-6809
06/25/2025, 8:07 PM
flyte works really nicely with gpus
freezing-airport-6809
06/25/2025, 8:07 PM
now you may want to use Union's reusable containers if you have very short-running jobs