# ask-the-community
v
Hi! I have Flyte deployed on a K8s cluster and I also have Prometheus set to scrape FlyteAdmin and FlytePropeller metrics. The thing is, I currently have 3 replicas of the FlytePropeller pod, so I get 3 values for each metric. I'm worried this might negatively impact the utility of these metrics. As an example, I get 3 different workflow average durations, one per pod, but I would like to know the average duration across all executions. Is there a way to solve this? For some metrics I think I could just sum them by the instance label (which is the pod), but I don't know if this will work every time, and it also makes the queries more complex.
A secondary point: for FlyteAdmin there is a Kubernetes Service in my cluster, which is also scraped by Prometheus (so for each metric I get 4 time series: 1 from the Service and 3 others, one from each pod). How do the pods coordinate to expose metrics on the Service endpoint? For instance, the metric flyteadmin_admin_create_execution_errors has a non-zero value for only 1 pod. But the same metric, when coming from the Service, alternates over time between 0 and the non-zero value, making it look like the Service's /metrics endpoint offers the metrics from 1 pod at a time, alternating randomly between them.
g
For the second point, Prometheus shouldn't scrape metrics from the Service. "making it look like the Service's /metrics endpoint offers the metrics from 1 pod at a time" is exactly what's happening. The K8s Service exists to proxy requests to the FlyteAdmin pods. This is useful for the Flyte API, but it produces nonsense metrics.
Ensure you're adding `prometheus.io/scrape: "true"` only to `flyteadmin.podAnnotations` and not to `flyteadmin.service.annotations`. Having said that, I don't know why the flyteadmin Service has the `http-metrics` port to load-balance the metrics endpoint. Maybe there is a situation where scraping the Service makes sense, but I can't see it.
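For reference, a minimal sketch of what that could look like in the Helm values (the metrics port is an assumption; check what your FlyteAdmin deployment actually exposes):
```
flyteadmin:
  podAnnotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "10254"   # assumed FlyteAdmin metrics port; verify in your deployment
  service:
    annotations: {}               # no scrape annotation on the Service
```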
d
Hi @Vinícius Sosnowski For point #1, could you share the `config.yml` you're using in Prometheus? If you're using the `instance` label (which by default is set to the pod's name), it would probably be inconsistent whenever a pod gets redeployed, so in that case using relabeling + meta labels could help. Then you could query for the metric's average across the three pods.
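A minimal sketch of what that relabeling could look like in a kubernetes_sd scrape job (the job name and the annotation filter are assumptions, not your actual config):
```
scrape_configs:
  - job_name: flytepropeller            # illustrative name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # keep only pods that opted in via the scrape annotation
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # expose the pod name as an explicit "pod" label, so queries can
      # aggregate over it without depending on the default instance label
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```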
v
Hi guys! @Geoff Salmon thanks for the explanation, I will ask the people responsible for this to change it to only scrape from the pods. @David Espejo (he/him) sorry, I don't have access to the `config.yml` at the moment, but indeed we are using the `instance` label defaulting to the pod's name. I checked this article https://medium.com/kubehells/kubernetes-pod-monitors-re-labeling-managing-cardinality-6d38eea748d6 to see an example, but I guess he only has one pod inside the node? I'm wondering what the result would be if the 3 pods' `instance` labels were renamed to a single value. Would the metrics be aggregated?
d
@Vinícius Sosnowski for the example you sent, the selector will match any number of pods with that label and, in that case, will proceed with the relabelling. Inferring from the Prometheus docs, every relabelled pod would produce an instance of the same metric, and every instance (time series) could then be aggregated using the supported PromQL operators.
g
I think using relabelling to try to combine the metrics is the wrong approach. Prometheus won't automatically aggregate them in a useful way if the same time series (same metric name and label set) is scraped from multiple endpoints. The pods can have different lifetimes, and counter metrics start at zero when a pod starts running, so simply summing the counters of all active pods will produce a counter that is no longer non-decreasing. Prometheus' ways of aggregating metrics and computing rates and averages assume non-decreasing counters, so you're in for an uphill slog trying to do this. If you haven't seen them, check out the rate and increase functions. Generally you'll want to use `rate` or `increase` on the individual counter metrics produced by each pod and then `sum` over all the pods. https://prometheus.io/docs/practices/histograms/#count-and-sum-of-observations shows computing average rates from the sum or bucket time series of a histogram and the count time series.
Something like this would give you a time series of the average workflow duration over the past hour:
```
sum(rate(workflow_duration_sum[1h]))
/
sum(rate(workflow_duration_count[1h]))
```
(The metric names are invented in this example)
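For plain counters (not histograms) the same pattern applies. A minimal sketch, with a made-up metric name, assuming a per-pod counter of workflow failures:
```
# per-second failure rate summed across all FlytePropeller pods,
# averaged over the last 5 minutes (metric name is invented)
sum(rate(flytepropeller_workflow_failures_total[5m]))
```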
v
Hi guys! Yep, it does seem that using the proper functions and then summing over all the pods is a straightforward way to obtain the results I want. Thank you for your time helping me through that =)