What version of nebula are you using?
1.7.2
What operating system are you using?
Linux
Describe the Bug
While scraping nebula metrics, I found that many metric entries have an incorrect metric type:
curl localhost:9102/metrics
...
# HELP nebula_firewall_incoming_dropped_local_ip firewall.incoming.dropped.local_ip
# TYPE nebula_firewall_incoming_dropped_local_ip gauge
nebula_firewall_incoming_dropped_local_ip 0
# HELP nebula_firewall_incoming_dropped_no_rule firewall.incoming.dropped.no_rule
# TYPE nebula_firewall_incoming_dropped_no_rule gauge
nebula_firewall_incoming_dropped_no_rule 1693
# HELP nebula_firewall_incoming_dropped_remote_ip firewall.incoming.dropped.remote_ip
# TYPE nebula_firewall_incoming_dropped_remote_ip gauge
nebula_firewall_incoming_dropped_remote_ip 0
# HELP nebula_firewall_outgoing_dropped_local_ip firewall.outgoing.dropped.local_ip
# TYPE nebula_firewall_outgoing_dropped_local_ip gauge
nebula_firewall_outgoing_dropped_local_ip 37005
# HELP nebula_firewall_outgoing_dropped_no_rule firewall.outgoing.dropped.no_rule
# TYPE nebula_firewall_outgoing_dropped_no_rule gauge
nebula_firewall_outgoing_dropped_no_rule 0
# HELP nebula_firewall_outgoing_dropped_remote_ip firewall.outgoing.dropped.remote_ip
# TYPE nebula_firewall_outgoing_dropped_remote_ip gauge
nebula_firewall_outgoing_dropped_remote_ip 0
...
# HELP nebula_handshake_manager_initiated handshake_manager.initiated
# TYPE nebula_handshake_manager_initiated gauge
nebula_handshake_manager_initiated 129
# HELP nebula_handshake_manager_timed_out handshake_manager.timed_out
# TYPE nebula_handshake_manager_timed_out gauge
nebula_handshake_manager_timed_out 85
...
# HELP nebula_messages_rx_recv_error messages.rx.recv_error
# TYPE nebula_messages_rx_recv_error gauge
nebula_messages_rx_recv_error 51
# HELP nebula_messages_tx_punchy messages.tx.punchy
# TYPE nebula_messages_tx_punchy gauge
nebula_messages_tx_punchy 1.359801e+06
# HELP nebula_messages_tx_recv_error messages.tx.recv_error
# TYPE nebula_messages_tx_recv_error gauge
nebula_messages_tx_recv_error 39
# HELP nebula_network_packets_duplicate network.packets.duplicate
# TYPE nebula_network_packets_duplicate gauge
nebula_network_packets_duplicate 0
# HELP nebula_network_packets_lost network.packets.lost
# TYPE nebula_network_packets_lost gauge
nebula_network_packets_lost 765710
# HELP nebula_network_packets_out_of_window network.packets.out_of_window
# TYPE nebula_network_packets_out_of_window gauge
nebula_network_packets_out_of_window 0
...
At least these metrics need to be changed from gauge to counter.
The help comments aren't that helpful either 😄 But I can live with that.
Some parts of the code know that certain metrics are counters (https://github.com/slackhq/nebula/blob/master/bits.go#L13), but for some reason that information is not propagated to the Prometheus output.
Logs from affected hosts
No response
Config files from affected hosts
No response