jolly-florist-66728
10/18/2024, 11:30 AM
I can run Ray jobs by letting Flyte create a new cluster for the task (using `@task(task_config=RayJobConfig(worker_node_config=[WorkerNodeConfig(…)]))`). I can see the cluster starting, the job getting scheduled and distributed, and completing successfully.
I’m having trouble with using an existing RayCluster (in the same cluster) though. What is the correct approach for that?
From the docs I read that I should be able to use `@task(task_config=RayJobConfig(address="<RAY_CLUSTER_ADDRESS>"))`.
However, when trying that, it seems `worker_node_config` is a required argument. I tried passing an empty list instead:
@task(
    container_image=...,
    task_config=RayJobConfig(
        worker_node_config=[],  # No need to create a Ray cluster, but the argument is required; maybe setting it to an empty list helps?
        address="kuberay-cluster-head-svc.kuberay.svc.cluster.local:8265",
        runtime_env=...,
    ),
)
But then it tries to start a new RayCluster instead of using the existing one found at `address`:
❯ k get rayclusters.ray.io -A
NAMESPACE                        NAME                                         DESIRED WORKERS   AVAILABLE WORKERS   CPUS   MEMORY   GPUS   STATUS   AGE
<flyte-project>-<flyte-domain>   ahvfr924w8k2vgvf97wp-n0-0-raycluster-crb9z                                         100m   500Mi    0      ready    2m25s
kuberay                          kuberay-cluster                              1                 1                   2      3G       0      ready    3h37m
...
# Answer
To use an existing RayCluster with Flyte, you should specify the `address` parameter in the `RayJobConfig` to connect to the existing cluster. However, `worker_node_config` is typically used to define new clusters, and setting it to an empty list might not prevent Flyte from attempting to create a new cluster.

According to the Flyte documentation, you can connect to an existing Ray cluster by specifying the `address` in the `RayJobConfig`. However, the current implementation might not fully support this use case, as it seems to default to creating a new cluster.
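For reference, the docs-described configuration is just the `address` field with no worker nodes defined. This is only a sketch: depending on your flytekitplugins-ray version it may still reject the missing `worker_node_config`, exactly as you observed.

```python
from flytekit import task
from flytekitplugins.ray import RayJobConfig


@task(
    task_config=RayJobConfig(
        # Head service of the pre-existing KubeRay cluster (from your kubectl output).
        address="kuberay-cluster-head-svc.kuberay.svc.cluster.local:8265",
    ),
)
def use_existing_cluster() -> None:
    ...
```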
Would you like more details on how to potentially work around this issue or further explore the configuration options?
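In the meantime, one possible workaround (a sketch, not the plugin's documented path) is to drop `RayJobConfig` for that task and attach to the existing cluster from a plain Flyte task via Ray Client. The address below is an assumption based on your head service; note that Ray Client usually listens on port 10001, while 8265 is the dashboard/Job API port.

```python
import ray
from flytekit import task

# Assumed Ray Client endpoint for the existing KubeRay cluster.
RAY_CLIENT_ADDRESS = "ray://kuberay-cluster-head-svc.kuberay.svc.cluster.local:10001"


@ray.remote
def square(x: int) -> int:
    # Trivial remote function to confirm work lands on the existing cluster.
    return x * x


@task
def run_on_existing_cluster(n: int) -> int:
    # Plain Flyte task (no RayJobConfig), so flytepropeller does not try to
    # provision a new RayCluster; the task process attaches as a Ray client.
    ray.init(address=RAY_CLIENT_ADDRESS)
    try:
        return sum(ray.get([square.remote(i) for i in range(n)]))
    finally:
        ray.shutdown()
```

The trade-off is that Flyte no longer manages the Ray job lifecycle (no RayJob custom resource is created); the existing kuberay-cluster simply executes the remote work.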
Sources:
- Flyte Ray Plugin Documentation
- Slack Link: https://.slack.com/archives/CP2HDHKE1/p1729250994.610059