# announcements
hi team, we're having issues with metric collection on the recent flyteadmin. Telegraf starts throwing lots of these after just a day or two of uptime and some metrics disappear from the dashboard:
Error in plugin [inputs.prometheus]: error reading body: net/http: request canceled (Client.Timeout exceeded while reading body)
original settings were:
interval = "10s"
response_timeout = "3s"
I changed these to 30s and 15s and it seems to be ok for now. Do you have a recommendation on what these should be set to?
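For reference, a minimal sketch of how the changed settings might look in the Telegraf config. It assumes the 30s applies to `interval` and the 15s to `response_timeout` (same order as the original settings). The URL is a hypothetical placeholder, not taken from the actual setup.

```toml
# Sketch only: scrape flyteadmin's Prometheus endpoint with relaxed timings.
[[inputs.prometheus]]
  # Hypothetical placeholder; replace with the real flyteadmin metrics URL.
  urls = ["http://flyteadmin.example:10254/metrics"]

  # Assumed mapping of the new values: scrape every 30s instead of 10s.
  interval = "30s"

  # Assumed mapping of the new values: allow 15s instead of 3s to read the body.
  response_timeout = "15s"
```

Keeping `response_timeout` below `interval` means a slow scrape should finish or time out before the next one is due.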
hmm this is interesting
the timeout is from flyteadmin?
that is interesting - seems admin is under load?
@Ketan (kumare3) yes, flyteadmin. No significant load, CPU < 20%