# introductions
b
Hello, I am an ML Engineer/MLOps Engineer. We’re currently using Kubeflow Pipelines and, because of several pain points, have been looking for potential alternatives. We’re on GCP, so I’m looking forward to the Terraform files for that deployment so we can give it a try and see if it will work for our needs.
🚀 2
👋 7
a
Woohoo, welcome @big-notebook-82371, and thanks for the info. Feel free to test the modules. The upstream PR should be merged soon. I hope this helps you get started quickly. If you run into any problems or have questions in the process, please let me know!
b
Oh thanks for that link, I can definitely start there. I’m at the very beginning of looking into it, but we’ll be scaling up our number of pipelines significantly over the next several months, and we need to be sure our orchestration can handle that
a
Great, we have users here running tens of thousands of workflows without sacrificing KPIs. If you need any help during the process, feel free to ask in #CP2HDHKE1!
b
Awesome, thanks!
c
@damp-family-58471 very interested to hear about your journey with, and away from, Kubeflow
🤘 1
b
@clean-agent-63333 As for looking for another solution, it’s probably partially due to Kubeflow, and partially due to not knowing all the best practices (and having a hard time finding them online). Things just randomly slow down, it doesn’t seem to be scaling super well, local development is super rough, and pipelines feel fragile because I don’t know what changes will randomly break them (local testing doesn’t pick up environment stuff, etc.). So anyway, I feel like there isn’t much support for it online, and we need something that will work at a much bigger scale than we’re at right now, so I’m looking for alternatives. If you have any advice that could maybe make me stay, I’m open, haha. We’re just using the Pipelines standalone deployment. But so far Flyte is looking promising.
c
How big is the team that supports your KFP deployment?
b
Only two of us use it right now, with plans to hire a handful more over the next several months
👍 1
f
cc @future-monitor-58430 / @elegant-australia-91422 and others please chime in about kubeflow
🙏 1
👍 1
c
I gotta say, I would give Flyte a fair go, especially if you are currently just using Kubeflow Pipelines. The deployment I work with at Capital One uses the suite of tools that Kubeflow provides (e.g. Notebooks, Pipelines, Training Operator, etc.), so it’s been nice to keep a single ecosystem. But we have a very large SRE team with a lot of clusters we maintain.
👍 1
f
Flyte works with training operator
🔥 1
Would love to understand what all you folks use to see good integration points
b
@average-finland-92144 is that Terraform setup specifically for flyte-core, which I understand to be the multi-cluster setup? I just know the multi-cluster setup is much more complicated. I’m wondering if it would be any harder to maintain, and what advantages it has over the flyte-binary setup?
a
The soon-to-be-merged GCP modules are for flyte-core, yes, mainly bc that chart offered a more straightforward way to populate keys from TF. You don't need to run multicluster to use flyte-core; most of the Flyte components (propeller, admin, etc.) will be Deployments, ensuring availability. TF modules for single binary on GCP are also planned and should be out soon. In terms of maintenance, I guess the resource footprint of flyte-core could be a bit higher, but that's generally not a concern.
b
Ok, sounds good, I’ll probably give it a try. One more question: do the TF modules make it so you don’t need to run the Helm setup in the docs at all? It looks like the flyte.tf file is managing that, is that right?
a
Correct, you just need to edit a couple of lines in locals.tf and the modules set up everything, including the Helm release.
b
perfect, sounds great
a
If you have any further questions about using them, I guess we can continue the convo on #C05A0JA1CCD 🙂
🫡 1