# ray-integration
m
I’m going through the plugin integrations trying to iron out the kinks in our configuration. Setting up the Ray integration has been very straightforward, and the example in the docs is mostly working; however, the RayJob, and therefore the RayCluster, is not deleted once the Ray task has successfully finished. The Flyte role has permissions to do this:
- apiGroups:
    - ray.io
  resources:
    - rayjobs
  verbs:
    - "*"
And the RayJob is in a SUCCEEDED state:
apiVersion: ray.io/v1alpha1
kind: RayJob
metadata:
  creationTimestamp: '2023-07-18T20:09:53Z'
  finalizers:
    - ray.io/rayjob-finalizer
  name: f9865b58322e24b91a6d-n0-0
  namespace: flyte-playground-development
  ownerReferences:
    - apiVersion: flyte.lyft.com/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: flyteworkflow
      name: f9865b58322e24b91a6d
      uid: dd5f635e-73b8-4641-b36f-96a47b39ce31
  resourceVersion: '391072831'
  uid: 6d09ea85-24f4-4191-a66c-098ceab3ad27
  ...
status:
  endTime: '2023-07-18T20:10:13Z'
  jobDeploymentStatus: Running
  jobId: f9865b58322e24b91a6d-n0-0-9jcjv
  jobStatus: SUCCEEDED
There isn’t anything in the logs to suggest propeller is having an issue removing it. I guess my question is this: is it Flyte or Ray that is responsible for cleaning up the RayJob/RayCluster? I’m running Flyte 1.7.0 and ray-operator 1.5.2, which I’ve seen others say is working for them. Any ideas?
k
You could set a TTL for the Ray cluster in the propeller configmap:
k8s.yaml: |
  plugins:
    ray:
      ttlSecondsAfterFinished: 30
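For context on who does the cleanup, here is a rough sketch of how that setting would probably show up on the generated RayJob. `shutdownAfterJobFinishes` and `ttlSecondsAfterFinished` are real fields of the KubeRay RayJob spec, and in KubeRay the TTL only takes effect when `shutdownAfterJobFinishes` is true. That the Flyte plugin passes the configmap value straight through, and that it enables `shutdownAfterJobFinishes`, are assumptions here, so check the RayJob the plugin actually creates. If that holds, it is the KubeRay operator rather than propeller that removes the RayCluster once the TTL elapses.

```yaml
# Sketch only: the RayJob spec fields that control cleanup in KubeRay.
apiVersion: ray.io/v1alpha1
kind: RayJob
spec:
  # Assumption: enabled by the Flyte Ray plugin. Without it the TTL below is ignored.
  shutdownAfterJobFinishes: true
  # Assumption: mirrors plugins.ray.ttlSecondsAfterFinished from the propeller configmap.
  ttlSecondsAfterFinished: 30
```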
m
Thank you Kevin, this works perfectly. I’ll try and raise a PR to add this and some other bits to the docs 👍
k
thank you so much 😀