hi, is there any option to configure deletion poli...
# ask-the-community
a
Hi, is there any option to configure a deletion policy for k8s pods that are in Error status? (We use the flyte-core Helm chart on GKE.)
v
Hello! I was also going to configure this for my Flyte setup, so it was a good opportunity to check how it’s done. According to the Flyte scheduler docs, two options seem useful for this: https://docs.flyte.org/en/latest/deployment/configuration/generated/scheduler_config.html
First, there’s `gc-interval`, which is 30 minutes by default. It periodically attempts to delete FlyteWorkflow custom resources from the cluster, but this alone does not clean up pods.
There’s also the `delete-resource-on-finalize` option, which cleans up resources related to the FlyteWorkflow when it is terminated, based on the FlyteWorkflow’s finalizer. The docs say:
Instructs the system to delete the resource upon successful execution of a k8s pod rather than have the k8s garbage collector clean it up. This ensures that no resources are kept around (potentially consuming cluster resources). This, however, will cause k8s log links to expire as soon as the resource is finalized.
If this option frees cluster resources and causes log links to expire, then it seems to delete pods and their containers, which is what we want. What’s not clear to me from this description is whether it cleans up a pod after the pod completes or after the workflow completes. The name “delete-resource-on-finalize” suggests it happens when the finalizer runs on the FlyteWorkflow, but the description says it happens “upon successful execution of a k8s pod”. Either should be fine, so let’s try configuring these and see if it works. You can configure these in the Helm values under `configmap.schedulerConfig.scheduler`, like this:
configmap:
  schedulerConfig:
    scheduler:
      gc-interval: 10m  # omit to keep the 30m default
      delete-resource-on-finalize: true
I’ll have a chance to test this tomorrow, so I’ll let you know if it actually works. Meanwhile, you can try it yourself.
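If you want to check whether the setting actually removes the Error pods, something like the following might help. It is only a rough sketch, not anything from the Flyte docs: it assumes `kubectl` is already pointed at your cluster, and that pods shown with status Error report the phase `Failed`, which is the usual case for pods that are not restarted. The function names and the namespace in the comment are made up for illustration.

```python
import json
import subprocess


def failed_pods(pod_list: dict) -> list[str]:
    """Return the names of pods whose phase is Failed.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") == "Failed"
    ]


def list_failed_pods(namespace: str) -> list[str]:
    """Ask the cluster (through kubectl) for Failed pods in one namespace."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return failed_pods(json.loads(result.stdout))


# Example call (the namespace is a placeholder, use your own project-domain namespace):
# print(list_failed_pods("myproject-development"))
```

Running it before and after a failed execution should show whether the failed pods are still around once the workflow has been finalized.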
d
Thanks @Victor Churikov for being an excellent community member. Hoping to update/extend the docs with your findings.