Hello
Could someone help me fix a problem in my configuration?
I scrape multiple locations through PushProx, and from time to time I see messages like the ones below in the logs.
PushProx logs
ts=2024-11-11T13:10:59.586Z caller=main.go:179 level=error msg="Error scraping:" err="Timeout reached for \"http://10.70.67.5:9070/metrics\": context canceled" url=http://10.70.67.5:9070/metrics
ts=2024-11-11T13:10:59.874Z caller=main.go:179 level=error msg="Error scraping:" err="Timeout reached for \"http://10.90.63.85:9070/metrics\": context deadline exceeded" url=http://10.90.63.85:9070/metrics
Ingress-Nginx logs
2024/11/11 13:18:59 [error] 29#29: *17648707 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 213.197.78.89, server: pushprox.some.domain, request: "POST /poll HTTP/1.1", upstream: "http://10.111.119.118:9080/poll", host: "pushprox.some.domain,"
The main issue is that I receive alerts from Alertmanager saying that hosts went offline, but these alerts are false.
They fire at random times for random hosts, even though those hosts are available for scraping.
My setup runs in EKS behind the ingress-nginx controller.
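
One thing that may be worth checking, offered as an assumption rather than a confirmed cause: the Ingress-Nginx error above is an upstream read timeout on `POST /poll`, which PushProx clients hold open while waiting for work, and ingress-nginx's default proxy read timeout is 60 seconds. A minimal sketch of an Ingress that raises the timeouts for the PushProx host might look like the following. The resource name, Service name, ingress class, and timeout values are hypothetical; only the host and port 9080 are taken from the log line.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: pushprox                # hypothetical resource name
  annotations:
    # Allow long-lived /poll requests; values are in seconds and are illustrative only.
    nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
spec:
  ingressClassName: nginx       # assumed ingress class
  rules:
    - host: pushprox.some.domain
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: pushprox  # hypothetical Service name
                port:
                  number: 9080  # upstream port seen in the Ingress-Nginx log
```

Whether this relates to the "context canceled" and "context deadline exceeded" scrape errors is not established by the logs above, which carry different timestamps.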