Blackbox exporter probes only one DNS A record (no failover), causing false negatives for multi-AZ ALB endpoints
Description
According to the documentation, blackbox_exporter performs a single DNS resolution and attempts a single connection to one IP address. There is no “happy eyeballs”-style parallelization and no retry against other resolved addresses.
This behavior causes issues when monitoring DNS names backed by multiple IPs (for example AWS ALB across AZs), where one IP may be temporarily unreachable while others are healthy.
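To make the difference concrete, here is a minimal sketch that models the two strategies. It is not the exporter's code: `probe_first_only`, `probe_with_failover`, and the `connect` callback are hypothetical names used only for illustration, and the addresses are the placeholder IPs from this report.

```python
from typing import Callable, Sequence


def probe_first_only(ips: Sequence[str], connect: Callable[[str], bool]) -> int:
    """Model of the behavior reported here: try one address, no failover."""
    if not ips:
        return 0
    return 1 if connect(ips[0]) else 0


def probe_with_failover(ips: Sequence[str], connect: Callable[[str], bool]) -> int:
    """Model of the expected behavior: succeed if any address responds."""
    # any() stops at the first address that connects successfully.
    return 1 if any(connect(ip) for ip in ips) else 0


# Hypothetical scenario: the first resolved address times out, the second is healthy.
unreachable = {"123.12.12.01"}


def fake_connect(ip: str) -> bool:
    return ip not in unreachable


resolved = ["123.12.12.01", "123.12.12.02"]
first_only_result = probe_first_only(resolved, fake_connect)
failover_result = probe_with_failover(resolved, fake_connect)
```

With the first address unreachable, the single-attempt model reports failure while the failover model reports success, which matches the mismatch between blackbox exporter and curl described in this issue.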
What did you do?
Deployed blackbox_exporter to monitor an HTTPS endpoint (https://www.url.com) that is fronted by an AWS Application Load Balancer (ALB) spanning two Availability Zones.
The DNS name resolves to two IPs:
$ nslookup www.url.com
Name: www.url.com
Address: 123.12.12.01
Name: www.url.com
Address: 123.12.12.02
One of the IPs intermittently times out from the EKS cluster (cross-AZ), while the other IP is healthy.
When probing with blackbox exporter:
curl "http://blackbox:9115/probe?target=https://www.url.com&module=http_2xx"
the exporter attempts only one of the IPs and fails if that IP times out.
What did you expect to see?
probe_success = 1 when any resolved IP responds successfully (similar to standard HTTP client behavior such as curl or browsers, which retry/fail over to another IP).
What did you see instead? Under which circumstances?
probe_success = 0 when the first chosen IP times out, even though another IP for the same DNS name is healthy and serving traffic.
This happens when:
DNS name has multiple A records
One IP is unreachable or slow
Blackbox exporter does not retry or attempt the next IP
Reproduction steps
Configure an ALB-backed DNS name with multiple A records
Ensure one IP is slow/unreachable from the blackbox exporter pod
Run:
nslookup www.url.com
curl -v https://www.url.com # succeeds after retrying another IP
curl "http://blackbox:9115/probe?target=https://www.url.com&module=http_2xx"
Observe:
curl/browser succeeds
blackbox exporter returns probe_success = 0
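To confirm which address is failing from the exporter pod, each resolved address can be checked on its own. The sketch below is a diagnostic aid under stated assumptions, not part of blackbox exporter: it uses only the Python standard library, the helper names (`resolve_ipv4`, `tcp_reachable`, `check_each_address`) are hypothetical, and it tests TCP connectivity only, not TLS or the HTTP response.

```python
import socket
from typing import Dict, List


def resolve_ipv4(host: str, port: int) -> List[str]:
    """Return the distinct IPv4 addresses for host, in resolver order."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # info[4] is the (address, port) sockaddr tuple; dict.fromkeys dedupes
    # while preserving order.
    return list(dict.fromkeys(info[4][0] for info in infos))


def tcp_reachable(ip: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port completes within timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, connection refused, and unreachable networks.
        return False


def check_each_address(host: str, port: int = 443, timeout: float = 3.0) -> Dict[str, bool]:
    """Map every resolved IPv4 address of host to its TCP reachability."""
    return {ip: tcp_reachable(ip, port, timeout) for ip in resolve_ipv4(host, port)}
```

Calling `check_each_address("www.url.com")` from inside the exporter pod returns one entry per A record, so an address that maps to `False` while the others map to `True` is the one the probe is likely hitting when it fails.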
Additional context
We understand that this behavior may be intentional and that blackbox exporter is designed to be deterministic and infrastructure-focused.
However, this commonly leads to confusion and “false negative” alerts when monitoring:
AWS ALBs / NLBs
Multi-AZ services
DNS-based load balancing
We are looking for:
Recommended best practices for monitoring such endpoints