Hi All,
I am working on integrating Ray with Flyte. I have been able to register and run the Ray task, and it completes successfully, but I cannot find any logs showing that the task actually ran through Ray. I also don't see any pods being created or destroyed. A Ray cluster exists, but it is not destroyed after the task run either.
I have installed the Ray operator, Ray cluster, and Ray api-server using their Helm charts, and I have added the configmap in the inline section of the configuration in values.yaml:
configuration:
  inline:
    configmap:
      enabled_plugins:
        # -- Task specific configuration [structure](<https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#GetConfig>)
        tasks:
          # -- Plugins configuration, [structure](<https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#TaskPluginConfig>)
          task-plugins:
            # -- [Enabled Plugins](<https://pkg.go.dev/github.com/flyteorg/flyteplugins/go/tasks/config#Config>). Enable SageMaker*, Athena if you install the backend
            # plugins
            enabled-plugins:
              - container
              - sidecar
              - k8s-array
              - ray
            default-for-task-types:
              container: container
              sidecar: sidecar
              container_array: k8s-array
              ray: ray
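For anyone comparing against their own setup, here is a rough sketch of how one might sanity-check the values above. It only looks at the structure shown in this post: it finds where `task-plugins` sits and checks that `ray` is both enabled and set as the default for the `ray` task type. The helper names (`find_key`, `ray_plugin_status`) are made up for illustration, the dict is hand-copied from the snippet rather than loaded from values.yaml, and whether the `configmap` / `enabled_plugins` levels under `inline` are what a given chart expects is something this check cannot tell you.

```python
from typing import Any, Optional


def find_key(node: Any, key: str, path: tuple = ()) -> Optional[tuple]:
    """Return the path to the first occurrence of `key` in nested dicts, or None."""
    if isinstance(node, dict):
        if key in node:
            return path + (key,)
        for name, child in node.items():
            found = find_key(child, key, path + (name,))
            if found is not None:
                return found
    return None


def ray_plugin_status(values: dict) -> dict:
    """Report where task-plugins lives and whether ray is enabled and the default."""
    path = find_key(values, "task-plugins")
    if path is None:
        return {"path": None, "enabled": False, "default": False}
    plugins = values
    for name in path:  # walk down to the task-plugins mapping
        plugins = plugins[name]
    return {
        "path": ".".join(path),
        "enabled": "ray" in plugins.get("enabled-plugins", []),
        "default": plugins.get("default-for-task-types", {}).get("ray") == "ray",
    }


# Hand-copied from the values.yaml snippet in this post.
values = {
    "configuration": {"inline": {"configmap": {"enabled_plugins": {"tasks": {
        "task-plugins": {
            "enabled-plugins": ["container", "sidecar", "k8s-array", "ray"],
            "default-for-task-types": {
                "container": "container",
                "sidecar": "sidecar",
                "container_array": "k8s-array",
                "ray": "ray",
            },
        }
    }}}}}
}

status = ray_plugin_status(values)
print(status)
```

The reported `path` is the useful part: it makes it easy to compare the nesting in values.yaml against whatever the chart in use documents for its inline configuration.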
I have all the ray pods running -
NAME                                                 READY   STATUS    RESTARTS   AGE
flyte-flyte-binary-6cfdcfc575-9l42x                  1/1     Running   0          3d2h
flyte-ray-cluster-kuberay-head-9q6jq                 1/1     Running   0          147m
flyte-ray-cluster-kuberay-worker-workergroup-bts8b   1/1     Running   0          147m
kuberay-apiserver-d7bbb9864-htsw4                    1/1     Running   0          97m
kuberay-operator-55c84695b8-vftmn                    1/1     Running   0          11h
And also all the services -
NAME                                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                                         AGE
flyte-flyte-binary-grpc              ClusterIP   x.x.x.x.     <none>        8089/TCP                                        3d3h
flyte-flyte-binary-http              ClusterIP   x.x.x.x.     <none>        8088/TCP                                        3d3h
flyte-flyte-binary-webhook           ClusterIP   x.x.x.x.     <none>        443/TCP                                         3d3h
flyte-ray-cluster-kuberay-head-svc   ClusterIP   x.x.x.x.     <none>        10001/TCP,6379/TCP,8265/TCP,8080/TCP,8000/TCP   166m
kuberay-apiserver-service            NodePort    x.x.x.x.     <none>        8888:31888/TCP,8887:31887/TCP                   116m
kuberay-operator                     ClusterIP   x.x.x.x.     <none>        8080/TCP                                        3d2h
Questions:
1. Have I configured Flyte to use Ray correctly with the configmap in values.yaml?
2. How do I verify that the Ray task that Flyte reports as successful was actually run on a Ray cluster?
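
On the second question, one thing that could be tried, as a sketch only: take a `kubectl get pods` snapshot before launching the task and another while it is running, then diff the pod names. The assumption here (not confirmed anywhere in this thread) is that a task really going through the Ray plugin would show up as extra pods alongside the pre-existing ones. The function names and the `example-rayjob-head-xxxxx` pod are hypothetical and only there to make the example run.

```python
def pod_names(kubectl_output: str) -> set:
    """Extract pod names (first column) from `kubectl get pods` text, skipping the header."""
    lines = [ln for ln in kubectl_output.strip().splitlines() if ln.strip()]
    return {ln.split()[0] for ln in lines[1:]}


def new_pods(before: str, during: str) -> list:
    """Pods present in the second snapshot but not in the first."""
    return sorted(pod_names(during) - pod_names(before))


# Snapshot taken from the listing in this post.
before = "\n".join([
    "NAME READY STATUS RESTARTS AGE",
    "flyte-flyte-binary-6cfdcfc575-9l42x 1/1 Running 0 3d2h",
    "flyte-ray-cluster-kuberay-head-9q6jq 1/1 Running 0 147m",
    "flyte-ray-cluster-kuberay-worker-workergroup-bts8b 1/1 Running 0 147m",
    "kuberay-apiserver-d7bbb9864-htsw4 1/1 Running 0 97m",
    "kuberay-operator-55c84695b8-vftmn 1/1 Running 0 11h",
])

# Hypothetical second snapshot: the extra pod name is invented for illustration.
during = before + "\nexample-rayjob-head-xxxxx 1/1 Running 0 10s"

print(new_pods(before, during))
```

If the diff stays empty for the whole run, that would at least be consistent with the task having executed as a plain container task rather than on Ray.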