feat(net): expose per-kind reputation-change and ban counters by constwz · Pull Request #180 · bnb-chain/reth

constwz · 2026-04-29T06:38:39Z

Summary

Adds per-`ReputationChangeKind` and per-outcome counters to the network's `PeersManager`, so reputation-driven peer drops are visible at the Prometheus layer.

Why

Today, a peer-pool drain induced by reputation bans is invisible at the metrics layer:

- Banned peers go into the `ban_list`, which has no exposed gauge or counter.
- The existing `DisconnectMetrics` counters cannot tell a graceful close apart from a reputation-driven kick; both increment `network_disconnect_requested`.
- `apply_reputation_change` only emits a `debug`/`info` log line per call (since #175, "fix: resolve some issues in cross region test"), which is fine for forensics but useless for Grafana.

Concretely, this affects BSC node operators investigating the "peers drop to zero after sync" pattern (bnb-chain/reth-bsc#320). Without these counters, distinguishing "we are banning peers because of repeated `BadBlock` penalties" from "peers are leaving us for unrelated reasons" requires log inspection. With them, a single PromQL query, `rate(network_reputation_changes_bad_block[5m])` correlated against `network_connected_peers`, turns the diagnosis into a Grafana panel.

New metrics

```
network_reputation_changes_bad_message
network_reputation_changes_bad_block
network_reputation_changes_bad_transactions
network_reputation_changes_bad_announcement
network_reputation_changes_already_seen_transaction
network_reputation_changes_timeout
network_reputation_changes_bad_protocol
network_reputation_changes_failed_to_connect
network_reputation_changes_dropped
network_reputation_changes_reset
network_reputation_changes_other
network_bans_total
network_disconnect_and_bans_total
network_unbans_total
```
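To make the naming scheme concrete, here is a minimal std-only sketch of how each kind could map to its metric name. The enum is a stand-in whose variants are inferred from the metric names listed above, not reth's actual type, and `metric_name` is a hypothetical helper that is not part of this PR (the PR uses a `ReputationMetrics` struct with one `Counter` per kind).

```rust
/// Stand-in for `ReputationChangeKind`. The variants are inferred from the
/// metric names in this PR; the `i32` payload of `Other` is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReputationChangeKind {
    BadMessage,
    BadBlock,
    BadTransactions,
    BadAnnouncement,
    AlreadySeenTransaction,
    Timeout,
    BadProtocol,
    FailedToConnect,
    Dropped,
    Reset,
    Other(i32),
}

/// Hypothetical helper: returns the full Prometheus metric name
/// (scope `network` plus the per-kind counter) for a given kind.
pub fn metric_name(kind: ReputationChangeKind) -> &'static str {
    use ReputationChangeKind::*;
    match kind {
        BadMessage => "network_reputation_changes_bad_message",
        BadBlock => "network_reputation_changes_bad_block",
        BadTransactions => "network_reputation_changes_bad_transactions",
        BadAnnouncement => "network_reputation_changes_bad_announcement",
        AlreadySeenTransaction => "network_reputation_changes_already_seen_transaction",
        Timeout => "network_reputation_changes_timeout",
        BadProtocol => "network_reputation_changes_bad_protocol",
        FailedToConnect => "network_reputation_changes_failed_to_connect",
        Dropped => "network_reputation_changes_dropped",
        Reset => "network_reputation_changes_reset",
        // Every `Other(_)` payload lands in the same counter.
        Other(_) => "network_reputation_changes_other",
    }
}
```

Because the `match` has no wildcard arm, adding a variant to the enum without adding a counter would fail to compile, which is one reason to prefer one counter per kind over a single labelled counter.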

Behaviour

- The kind-counter increments before the trusted-peer / unknown-peer guards, so it answers "what's hitting us"; outcome counters answer "did we punish for it."
- Trusted-peer exemption is preserved.
- No behaviour change beyond the new counters.
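The ordering described above can be sketched with a simplified, std-only model. Everything here is an assumption made for illustration: the trimmed enum, the penalty weights, the `BANNED_REPUTATION` threshold, the `Peer` fields, and the rule that a connected peer counts toward `disconnect_and_bans_total` while a disconnected one counts toward `bans_total`. It is not the code in this PR; it only shows the kind counter firing before the guards and the outcome counters firing after them.

```rust
use std::collections::HashMap;

type PeerId = u64; // stand-in for the real peer identifier type

/// Trimmed stand-in enum; the real type has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum ReputationChangeKind {
    BadBlock,
    Timeout,
    Reset,
}

/// Plain integer counters standing in for Prometheus `Counter`s.
#[derive(Debug, Default)]
struct ReputationCounters {
    changes_by_kind: HashMap<ReputationChangeKind, u64>,
    bans_total: u64,
    disconnect_and_bans_total: u64,
    unbans_total: u64,
}

#[derive(Debug)]
struct Peer {
    reputation: i32,
    trusted: bool,
    connected: bool,
    banned: bool,
}

const BANNED_REPUTATION: i32 = -100; // hypothetical threshold

/// Hypothetical penalty weights, chosen only for this example.
fn penalty(kind: ReputationChangeKind) -> i32 {
    match kind {
        ReputationChangeKind::BadBlock => -60,
        ReputationChangeKind::Timeout => -10,
        ReputationChangeKind::Reset => 0,
    }
}

#[derive(Debug, Default)]
struct PeersManager {
    peers: HashMap<PeerId, Peer>,
    metrics: ReputationCounters,
}

impl PeersManager {
    fn add_peer(&mut self, id: PeerId, trusted: bool, connected: bool) {
        self.peers.insert(id, Peer { reputation: 0, trusted, connected, banned: false });
    }

    fn apply_reputation_change(&mut self, peer_id: PeerId, kind: ReputationChangeKind) {
        // 1. Kind counter first: counts every report, punished or not.
        *self.metrics.changes_by_kind.entry(kind).or_insert(0) += 1;

        // 2. Guards: unknown peers and trusted peers are left untouched.
        let Some(peer) = self.peers.get_mut(&peer_id) else { return };
        if peer.trusted {
            return;
        }

        // 3. Apply the change, then record the outcome.
        if kind == ReputationChangeKind::Reset {
            peer.reputation = 0;
            if peer.banned {
                peer.banned = false;
                self.metrics.unbans_total += 1;
            }
            return;
        }
        peer.reputation += penalty(kind);
        if peer.reputation <= BANNED_REPUTATION && !peer.banned {
            peer.banned = true;
            if peer.connected {
                peer.connected = false;
                self.metrics.disconnect_and_bans_total += 1;
            } else {
                self.metrics.bans_total += 1;
            }
        }
    }
}
```

In this model, a `BadBlock` report against a trusted or unknown peer still bumps the kind counter but never an outcome counter, so a gap between the two rates points at exempted or unknown peers rather than at bans.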

Test plan

- `cargo check -p reth-network`: pass
- `cargo +nightly clippy -p reth-network --tests --all-features`: no new warnings
- `cargo +nightly fmt --check`: clean
- `cargo nextest run -p reth-network`: 177 passed, 4 skipped

Refs bnb-chain/reth-bsc#320.

Adds a `ReputationMetrics` struct (scope `network`) with a `Counter` per `ReputationChangeKind` plus three outcome counters (`bans_total`, `disconnect_and_bans_total`, `unbans_total`), and instruments `PeersManager::apply_reputation_change` to increment them.

New Prometheus metrics:

network_reputation_changes_bad_message
network_reputation_changes_bad_block
network_reputation_changes_bad_transactions
network_reputation_changes_bad_announcement
network_reputation_changes_already_seen_transaction
network_reputation_changes_timeout
network_reputation_changes_bad_protocol
network_reputation_changes_failed_to_connect
network_reputation_changes_dropped
network_reputation_changes_reset
network_reputation_changes_other
network_bans_total
network_disconnect_and_bans_total
network_unbans_total

Diagnostic motivation: today, a peer-pool drain induced by reputation bans is invisible at the metrics layer. Banned peers go into the `ban_list`, which has no exposed gauge or counter, and the existing `DisconnectMetrics` counters cannot tell a graceful close apart from a rep-driven disconnect; both increment `disconnect_requested`.

Concretely this affects BSC node operators investigating the "peers drop to zero after sync" pattern (bnb-chain/reth-bsc#320): without these counters, distinguishing "we are banning peers because of repeated `BadBlock` penalties" from "peers are leaving us for unrelated reasons" requires log inspection. With them, a single PromQL `rate(network_reputation_changes_bad_block[5m])` correlated against `network_connected_peers` makes the diagnosis a Grafana panel.

The kind-counter increments before the trusted-peer / unknown-peer guards so it answers "what's hitting us"; outcome counters answer "did we punish for it". Trusted-peer exemption is preserved. No behaviour change beyond the new counters.

constwz requested a review from joey0612 as a code owner April 29, 2026 06:38

github-actions Bot added the S-stale label May 21, 2026

constwz wants to merge 1 commit into develop from metrics/reputation-changes-on-develop