# ray-on-flyte
t
@rapid-autumn-97122 - these are the Ray PRs we’re working on. https://github.com/flyteorg/flyteidl/pull/308 https://github.com/flyteorg/flytekit/pull/1093 https://github.com/flyteorg/flyteplugins/pull/279 I’m responding to your comment on the first one, but would you mind taking a quick look at the other two as well?
@glamorous-carpet-83516’s shooting to merge these tomorrow, assuming they’re okay. After that, I think we’re actually going to do a patch Flyte release (like the whole platform, since propeller and admin will need to pick up the new IDL)
do you think you’ll be able to help test/vet after that?
kevin’s still working on getting the dashboard hooked up, but you had mentioned something about metrics as well that you were interested in. will think about ways to enable those as well (though can’t promise anything just yet)
🔥 1
r
Thanks! Yeah, I started looking into these PRs
Hey @thankful-minister-83577, I dropped a few comments on the PRs. They are mostly around the flexibility of the Ray cluster configuration. Essentially, we will need more settings, such as service account and image, in the Ray cluster config
do you think you’ll be able to help test/vet after that?
yep, I’m happy to dogfood it!
f
@rapid-autumn-97122 can we use the same serviceaccount as the workflow service account?
r
You mean the service account for the Flyte workflow? I think they are different
f
why?
they are k8s serviceaccounts
r
but they run on different clusters
f
no? we are running a K8s CRD
they are all on the same cluster and even the same namespace
r
that won’t work for us though - Flyte workflows are executed in their own infra, and from within a Flyte workflow we need to create a cluster in the Ray infra
so when you create the Ray cluster, do you assume you have the Ray CRD installed on the Flyte cluster?
f
that’s the current setup, right? cc @glamorous-carpet-83516
r
we might need to sync on this - in our setup, Ray and Flyte run on different clusters. KubeRay also has an API server; I wonder if that will make things easier 🤔
f
so Flyte does support running on different clusters
but my recommendation is to run just flyte in multi-cluster mode
this is so much better than using the same cluster
that way, data stuff cannot impact ML stuff
we actually ran 13 clusters at Lyft
r
yes exactly! we have a dedicated Flyte cluster for orchestration, some teams use a Flink cluster for data, and now we have Ray clusters
f
ya
but a better model is to run flytepropeller on different clusters
keep the central control plane independent and just parachute flytepropeller to multiple clusters
and it can connect back
r
Gotcha!
f
otherwise the one propeller will start getting overloaded (at very high scale, but it will)
and one good property of having flytepropeller manage your cluster is that it will know from the scheduler how the resources are getting throttled and work correctly
👍 1
g
that’s the current setup, right? cc
yes, we have to install the Ray CRD on the Flyte cluster