Nandakumar Raghu

over 2 years ago
Hi all, I am working on integrating Ray with Flyte. I have been able to register and run the Ray task, and it completes successfully. However, I cannot find any logs indicating that the task actually ran through Ray, and I do not see any pods being created or destroyed. A Ray cluster is created, but it is not destroyed after the task run. I have installed the Ray operator, Ray cluster, and Ray API server using their Helm charts, and I have added the configmap in the inline section of the configuration in values.yaml:
configuration:
  inline:
    configmap:
      enabled_plugins:
        # -- Task specific configuration [structure](<https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#GetConfig>)
        tasks:
          # -- Plugins configuration, [structure](<https://pkg.go.dev/github.com/flyteorg/flytepropeller/pkg/controller/nodes/task/config#TaskPluginConfig>)
          task-plugins:
            # -- [Enabled Plugins](<https://pkg.go.dev/github.com/flyteorg/flyteplugins/go/tasks/config#Config>). Enable SageMaker*, Athena if you install the backend
            # plugins
            enabled-plugins:
              - container
              - sidecar
              - k8s-array
              - ray
            default-for-task-types:
              container: container
              sidecar: sidecar
              container_array: k8s-array
              ray: ray
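A minimal sketch of a self-consistency check for the fragment above, assuming the `task-plugins` block has already been loaded into a Python dict (the dict literal and the helper name `check_plugin` are illustrative, not part of Flyte). It only confirms that `ray` appears in both lists; it cannot tell whether the chart actually reads this key path.

```python
# Hypothetical helper: checks that a plugin is both enabled and set as the
# default handler for a task type. Key names are copied from the YAML above.
task_plugins = {
    "enabled-plugins": ["container", "sidecar", "k8s-array", "ray"],
    "default-for-task-types": {
        "container": "container",
        "sidecar": "sidecar",
        "container_array": "k8s-array",
        "ray": "ray",
    },
}


def check_plugin(task_plugins: dict, plugin: str, task_type: str) -> list:
    """Return a list of problems; an empty list means the fragment is consistent."""
    problems = []
    enabled = task_plugins.get("enabled-plugins", [])
    defaults = task_plugins.get("default-for-task-types", {})
    if plugin not in enabled:
        problems.append(f"{plugin!r} missing from enabled-plugins")
    if defaults.get(task_type) != plugin:
        problems.append(f"task type {task_type!r} not mapped to {plugin!r}")
    return problems


print(check_plugin(task_plugins, "ray", "ray"))
```

An empty result here says nothing about the running deployment; the rendered configmap in the cluster is still the thing to inspect.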
All the Ray pods are running:
NAME                                                 READY   STATUS    RESTARTS   AGE
flyte-flyte-binary-6cfdcfc575-9l42x                  1/1     Running   0          3d2h
flyte-ray-cluster-kuberay-head-9q6jq                 1/1     Running   0          147m
flyte-ray-cluster-kuberay-worker-workergroup-bts8b   1/1     Running   0          147m
kuberay-apiserver-d7bbb9864-htsw4                    1/1     Running   0          97m
kuberay-operator-55c84695b8-vftmn                    1/1     Running   0          11h
All the services are up as well:
NAME                                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                         AGE
flyte-flyte-binary-grpc              ClusterIP   x.x.x.x.         <none>        8089/TCP                                        3d3h
flyte-flyte-binary-http              ClusterIP   x.x.x.x.         <none>        8088/TCP                                        3d3h
flyte-flyte-binary-webhook           ClusterIP   x.x.x.x.         <none>        443/TCP                                         3d3h
flyte-ray-cluster-kuberay-head-svc   ClusterIP   x.x.x.x.         <none>        10001/TCP,6379/TCP,8265/TCP,8080/TCP,8000/TCP   166m
kuberay-apiserver-service            NodePort    x.x.x.x.         <none>        8888:31888/TCP,8887:31887/TCP                   116m
kuberay-operator                     ClusterIP   x.x.x.x.         <none>        8080/TCP                                        3d2h
Questions:
1. Have I configured Flyte to use Ray correctly with the configmap in values.yaml?
2. How do I verify that the Ray task that Flyte reports as successful actually ran on a Ray cluster?
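One rough way to approach the second question is to compare pod ages against the time since the task started. The sketch below parses pasted `kubectl get pods` text and lists pods younger than a cutoff. The helper names `age_to_seconds` and `pods_newer_than` are hypothetical, and the sketch assumes that a task run through Ray would show up as newly created pods, which is not confirmed here.

```python
import re


def age_to_seconds(age: str) -> int:
    """Convert a kubectl AGE value such as '147m' or '3d2h' to seconds."""
    units = {"d": 86400, "h": 3600, "m": 60, "s": 1}
    parts = re.findall(r"(\d+)([dhms])", age)
    # Reject anything that is not a plain sequence of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognized age: {age!r}")
    return sum(int(n) * units[u] for n, u in parts)


def pods_newer_than(kubectl_output: str, max_age_seconds: int) -> list:
    """Return names of pods whose AGE is at most max_age_seconds."""
    names = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        name, age = fields[0], fields[-1]
        if age_to_seconds(age) <= max_age_seconds:
            names.append(name)
    return names


# Pod listing copied from the post above.
sample = """\
NAME                                                 READY   STATUS    RESTARTS   AGE
flyte-flyte-binary-6cfdcfc575-9l42x                  1/1     Running   0          3d2h
flyte-ray-cluster-kuberay-head-9q6jq                 1/1     Running   0          147m
flyte-ray-cluster-kuberay-worker-workergroup-bts8b   1/1     Running   0          147m
kuberay-apiserver-d7bbb9864-htsw4                    1/1     Running   0          97m
kuberay-operator-55c84695b8-vftmn                    1/1     Running   0          11h
"""

for pod in pods_newer_than(sample, 3 * 3600):
    print(pod)
```

If no pod is younger than the task run itself, the listed Ray pods most likely come from the Helm-installed cluster rather than from the task.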