Thread
#ray-on-flyte
    Yee

    2 months ago
    @Keshi Dai - these are the Ray PRs we’re working on: https://github.com/flyteorg/flyteidl/pull/308 https://github.com/flyteorg/flytekit/pull/1093 https://github.com/flyteorg/flyteplugins/pull/279 - I’m responding to your comment on the first one, but would you mind taking a quick look at the other two as well?
    @Kevin Su’s shooting to merge these tomorrow, assuming they’re okay. After that, I think we’re actually going to do a patch Flyte release (like the whole platform, since propeller and admin will need to pick up the new IDL)
    do you think you’ll be able to help test/vet after that?
    Kevin’s still working on getting the dashboard hooked up, but you had mentioned that you were interested in metrics as well. we’ll think about ways to enable those too (though I can’t promise anything just yet)
    Keshi Dai

    2 months ago
    Thanks! Yeah, I started looking into these PRs
    Hey @Yee, I dropped a few comments in the PRs. They are mostly about the flexibility of the Ray cluster configuration. Essentially, we will need more settings, such as service account and image, in the Ray cluster config
    > do you think you’ll be able to help test/vet after that?
    yep, I’m happy to dogfood it!
    Ketan (kumare3)

    2 months ago
    @Keshi Dai can we use the same serviceaccount as the workflow service account?
    Keshi Dai

    2 months ago
    You mean the service account for the Flyte workflow? I think they are different
    Ketan (kumare3)

    2 months ago
    why?
    they are k8s serviceaccounts
    Keshi Dai

    2 months ago
    but they run on different clusters
    Ketan (kumare3)

    2 months ago
    no? we are running a K8s CRD
    they are all on the same cluster and even the same namespace
    Keshi Dai

    2 months ago
    that won’t work for us though - Flyte workflows are executed in their own infra, and from within a Flyte workflow we need to create a cluster in the Ray infra
    so when you create the Ray cluster, do you assume you have the Ray CRD installed on the Flyte cluster?
    Ketan (kumare3)

    2 months ago
    that’s the current setup, right? cc @Kevin Su
    Keshi Dai

    2 months ago
    we might need to sync on this - in our setup, Ray and Flyte run on different clusters. KubeRay also has an API server; I wonder if that will make things easier 🤔
    Ketan (kumare3)

    2 months ago
    so Flyte does support running on different clusters
    but my recommendation is to run just flyte in multi-cluster mode
    this is so much better than using the same cluster
    that way, data stuff cannot impact ML stuff
    we actually ran 13 clusters at Lyft
    Keshi Dai

    2 months ago
    yes exactly! we have a dedicated Flyte cluster for orchestration, some teams use a Flink cluster for data, and now we have Ray clusters
    Ketan (kumare3)

    2 months ago
    ya
    but a better model is to run flytepropeller on different clusters
    keep the central control plane independent and just parachute flytepropeller to multiple clusters
    and it can connect back
    Keshi Dai

    2 months ago
    Gotcha!
    Ketan (kumare3)

    2 months ago
    otherwise the one propeller will start getting overloaded (only at very high scale, but it will)
    and one good property of having flytepropeller manage your cluster is that it knows how the scheduler is throttling resources and can work correctly
    Kevin Su

    1 month ago
    > that’s the current setup, right? cc
    yes, we have to install the Ray CRD on the Flyte cluster