Vinícius Sosnowski 01/30/2023, 3:47 PM
Geoff Salmon 01/31/2023, 3:11 AM
and not to
. Having said that, I don't know why flyteadmin service has the
port to load-balance the metrics endpoint. Maybe there is a situation where scraping the service makes sense, but I can't see it.
David Espejo (he/him) 02/01/2023, 12:40 AM
you're using in Prometheus? If using the
label (which by default is set to the Pod's name), that would probably be inconsistent if some Pod gets redeployed, so in that case using relabeling + meta labels could help. Then you could query for the metric's average across the three pods
Vinícius Sosnowski 02/02/2023, 1:54 PM
at the moment but indeed we are using the
label defaulting to the Pod's name. I checked this article https://medium.com/kubehells/kubernetes-pod-monitors-re-labeling-managing-cardinality-6d38eea748d6 to see an example but I guess he only has one pod inside the node? Because I'm wondering what the result would be if 3 pods
label be renamed to a single thing. Will the metrics be aggregated?
David Espejo (he/him) 02/02/2023, 7:08 PM
Geoff Salmon 02/06/2023, 3:31 PM
on the individual counter metrics produced by each pod and then
over all the pods. https://prometheus.io/docs/practices/histograms/#count-and-sum-of-observations shows computing average rates from the sum or bucket time series of a histogram and the count time series.
(The metric names are invented in this example)
sum(rate(workflow_duration_sum[1h])) / sum(rate(workflow_duration_count[1h]))
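To illustrate why the query above divides the summed rates rather than averaging each pod's own average, here is a minimal Python sketch. The pod names and numbers are invented for illustration, and the helper functions are hypothetical, not part of Prometheus or Flyte. It works on raw increases instead of rates, since the window length cancels out in the ratio.

```python
# Hypothetical per-pod increases over the query window (invented numbers).
# Each tuple is (increase of the *_sum series in seconds, increase of the *_count series).
pods = {
    "pod-a": (120.0, 10),
    "pod-b": (30.0, 1),
    "pod-c": (300.0, 100),
}


def overall_average(samples):
    """Mirror sum(rate(..._sum)) / sum(rate(..._count)): pool all pods first, then divide."""
    total_sum = sum(s for s, _ in samples.values())
    total_count = sum(c for _, c in samples.values())
    return total_sum / total_count


def average_of_pod_averages(samples):
    """Average each pod on its own, then average those values (ignores traffic per pod)."""
    per_pod = [s / c for s, c in samples.values()]
    return sum(per_pod) / len(per_pod)


if __name__ == "__main__":
    print(f"pooled average:          {overall_average(pods):.3f} s")
    print(f"average of pod averages: {average_of_pod_averages(pods):.3f} s")
```

The pooled figure weights every observation equally, so a pod that handled one slow workflow does not skew the result the way it does when averaging per-pod averages.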
Vinícius Sosnowski 02/07/2023, 9:59 PM