Description
The issue appears to be the use of `max(node_time_seconds{instance=~"$node:.*"})`, which does not appear to work in Prometheus 2.19, together with a possible change in how Bind 9.16 reports uptime. Changing the queries to `time() - max(bind_boot_time_seconds{instance=~"$node:.*"})` produces sensible-seeming results, but these are actually still off by an order of magnitude.
For example, with a Bind 9.16 reported boot time of 2020-07-14T21:10:48.999Z and a current time of 2020-07-14T22:11:56.299Z, the query above claims an uptime of 5.8 hours, although only about 1.02 hours have actually elapsed. I thought I had found the order-of-magnitude error, but then I noticed that something was still wrong because the value wasn't updating correctly. That's when I noticed that `bind_boot_time_seconds` was moving: it went from 1594766106 to 1594740334, which is definitely not correct.
The actual Bind statistics do not show any such change: the boot time in the corresponding XML and JSON output stays the same.
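
As a sanity check on the figures quoted here, the following is a minimal Python sketch using only the standard library. The variable and function names are mine for illustration and do not come from the exporter or the dashboard. It computes the uptime expected from the reported boot time and current time, and decodes the two observed `bind_boot_time_seconds` values as UTC timestamps, assuming they are Unix epoch seconds.

```python
from datetime import datetime, timezone

# Timestamps quoted in this report (both UTC).
boot = datetime(2020, 7, 14, 21, 10, 48, 999000, tzinfo=timezone.utc)
now = datetime(2020, 7, 14, 22, 11, 56, 299000, tzinfo=timezone.utc)

# Expected uptime if the boot time were exported correctly.
elapsed_hours = (now - boot).total_seconds() / 3600
print(f"expected uptime: {elapsed_hours:.2f} hours")  # expected uptime: 1.02 hours


def to_utc(epoch_seconds):
    """Decode a Unix epoch value (assumed to be seconds) as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


# The two values observed for bind_boot_time_seconds.
first_sample = 1594766106
second_sample = 1594740334

print(to_utc(first_sample))          # 2020-07-14T22:35:06+00:00
print(to_utc(second_sample))         # 2020-07-14T15:25:34+00:00
print(first_sample - second_sample)  # 25772 (seconds the metric moved backwards)
```

Under that assumption, neither observed value decodes to the 21:10:48.999Z boot time that Bind itself reports, and the metric moved backwards by roughly 7.2 hours between the two samples.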