<@U06RKAMK0UC> Is there any prometheus metric for ...
# ask-ai
m
@ancient-wolf-19325 Is there any Prometheus metric for Spark workflows that have been running for a long time?
j
It seems likely that there isn't a specific Prometheus metric for monitoring long-running Spark workflows in Flyte. However, you can monitor workflow execution times using general Prometheus metrics. According to this discussion, Flyte provides metrics for workflow execution, but they may not be directly tied to Spark workflows specifically. You might need to set up custom alerts based on execution duration metrics. Would you like more detail on how to set up such monitoring or alerts? Sources: this discussion
a
You could monitor it from the perspective of a workflow, just without Spark-specific metrics. What kind of metric are you looking for?
m
@average-finland-92144 Sometimes a workflow remains in the running state for a long time (2-3 days or more) because a sensor is waiting for the underlying file to become available in an S3 bucket. We would like to set up a monitoring alert for such cases. Example alert: xyz workflow has been running for more than 24 hours.
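For illustration, the check behind such an alert could look roughly like the sketch below. It is only a sketch under assumptions: the `Execution` record, its field names, and the `long_running_alerts` helper are hypothetical and not part of Flyte or Prometheus. In practice the workflow name, phase, and start time would have to come from whatever source is available (execution listings or a metrics backend), which is not shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Execution:
    # Hypothetical record; field names are illustrative, not a Flyte API.
    workflow: str
    phase: str
    started_at: datetime


def long_running_alerts(executions, now, threshold=timedelta(hours=24)):
    """Return one alert message per execution still running past the threshold."""
    hours = int(threshold.total_seconds() // 3600)
    return [
        f"{e.workflow} workflow has been running for more than {hours} hours"
        for e in executions
        if e.phase == "RUNNING" and now - e.started_at > threshold
    ]


# Made-up sample data: only "xyz" is both still running and older than 24 hours.
now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
executions = [
    Execution("xyz", "RUNNING", now - timedelta(days=2)),
    Execution("abc", "RUNNING", now - timedelta(hours=3)),
    Execution("old", "SUCCEEDED", now - timedelta(days=5)),
]
alerts = long_running_alerts(executions, now)
print(alerts)
```

Filtering on phase as well as age keeps finished executions from triggering the alert, so only workflows that are still waiting (for example, on the sensor) are reported.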