[KLC-2418] seednode: reduce log noise + expose /peers, /node/status, /node/metrics#56
Conversation
…/node/metrics
The seednode was dumping its full connected-peer table at INFO every 5 s
(~190 lines per tick at production peer counts) on top of the concise
single-line summary already emitted at DEBUG by the libp2p messenger.
This drowned out actionable events and consumed disk fast under
--log-save, with no machine-readable surface for monitoring.
Logging (cmd/seednode/main.go):
- Print the "Seednode addresses:" table once at startup, not on every tick.
- Move the full connected-peers table to DEBUG.
- At INFO emit a single summary line per tick:
seednode status connected=N known=M gained=G lost=L listen=...
- Track the peer set across ticks so gained/lost report churn.
- Bump the tick interval from 5 s to 30 s.
- pickReportableListenAddress skips loopback and the Docker default
bridge (172.17.x) so the log line's listen= field shows the address
operators can actually use.
HTTP endpoints (cmd/seednode/api/api.go):
- GET /peers JSON: connectedPeers, knownPeers,
listenAddresses, connectedAddresses (raw).
- GET /node/status JSON: version, uptimeSeconds, connectedPeers,
knownPeers, listenAddresses.
- GET /node/metrics Prometheus exposition:
seednode_connected_peers
seednode_known_peers
seednode_uptime_seconds
klv_build_info{version,node_type}
- Refactor the package into a server struct hidden behind exported
Start(); /log websocket behavior unchanged.
- snapshot() reads the peer view once per handler so connected count
and connected-addresses slice stay consistent across the response.
- startTime captured at top of startNode so uptimeSeconds reflects
process start, not the post-bootstrap API listener.
API now starts after messenger.Bootstrap() since the new endpoints
query the messenger; external scrapers see a brief connection refused
during the bootstrap window instead of a /log-only gin server.
Tests (cmd/seednode/api/api_test.go): 5 httptest cases against a stub
peerInfoProvider cover the three handlers, empty-version build_info
omission, and Prometheus label escaping.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: ASSERTIVE Plan: Pro Run ID: 📒 Files selected for processing (3)
📜 Recent review details⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
🧰 Additional context used📓 Path-based instructions (2)**/*.go📄 CodeRabbit inference engine (Custom checks)
Files:
**/*_test.go⚙️ CodeRabbit configuration file
Files:
🧠 Learnings (1)📚 Learning: 2026-04-21T20:12:22.959ZApplied to files:
🔇 Additional comments (3)
WalkthroughRefactors the seednode HTTP API into a stateful ChangesSeednode API Refactoring
Sequence Diagram(s)sequenceDiagram
participant Node as startNode
participant Logger as logger
participant Messenger as P2P_Messenger
participant REST as REST_Services
Node->>Node: Capture startTime
Node->>Logger: SetLogLevel(cli)
Node->>Messenger: Create and bootstrap messenger
Node->>REST: startRestServices(ctx, marshalizer, messenger, startTime)
REST->>REST: startGinServer -> api.Start(interface, marshalizer, messenger, version, startTime)
REST->>REST: Register routes using server state
REST->>Messenger: server.snapshot() reads connected peers / listen addrs
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 6 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (6 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
This PR reduces operational log volume from the seednode by switching periodic peer logging from verbose tables to a concise status line, and it exposes additional HTTP endpoints (/peers, /node/status, /node/metrics) to support operator tooling and Prometheus scraping.
Changes:
- Replaces the 5s peer-table INFO log with a 30s single-line “seednode status” log, while keeping the full connected-peers table at DEBUG.
- Refactors
cmd/seednode/apito a statefulserverwith exportedStart(...), adding JSON status/peer endpoints and a Prometheus metrics endpoint. - Adds
httptestcoverage for the new HTTP handlers.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| cmd/seednode/main.go | Delays API start until after bootstrap; introduces 30s status ticker and debug-only peer table; adds listen-address selection helper for log output. |
| cmd/seednode/api/api.go | Refactors API into a server struct and adds /peers, /node/status, /node/metrics handlers. |
| cmd/seednode/api/api_test.go | Adds handler-level tests for the new JSON and Prometheus endpoints, including label escaping behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/seednode/api/api_test.go`:
- Around line 132-168: Both tests (TestNodeMetrics_emptyVersionOmitsBuildInfo
and TestNodeMetrics_versionLabelEscaped) lack assertions on the HTTP response
status which can mask handler failures; after calling r.ServeHTTP(w, req) assert
that w.Code == http.StatusOK (or use http.StatusOK constant) and fail the test
if not before performing body checks so that a 4xx/5xx response cannot produce a
false positive; update both tests to check the status using the existing
recorder variable w immediately after ServeHTTP.
In `@cmd/seednode/main.go`:
- Around line 254-256: Replace the custom comparator sort.Slice call that uses
strings.Compare on the slice named connected with the idiomatic
sort.Strings(connected) call; specifically find the sort.Slice(connected,
func(i, j int) bool { return strings.Compare(connected[i], connected[j]) < 0 })
usage and swap it to sort.Strings(connected) to match the style used in api.go
and simplify the code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 54a0e402-649b-41dc-bd16-9a3a20caef05
📒 Files selected for processing (3)
cmd/seednode/api/api.gocmd/seednode/api/api_test.gocmd/seednode/main.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Agent
- GitHub Check: setup-and-lint / setup-and-lint
- GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.go
📄 CodeRabbit inference engine (Custom checks)
**/*.go: Verify that any new or modified concurrent code (goroutines, channels, mutexes, sync primitives) is free of race conditions. Check for: proper lock/unlock pairing, no goroutine leaks, correct channel lifecycle management, and proper context cancellation propagation.
Verify that errors are not silently discarded. Check for: unchecked error returns, error wrapping with context, proper error propagation up the call chain, and no bare panic() calls outside of init() functions.
Files:
cmd/seednode/api/api_test.gocmd/seednode/main.gocmd/seednode/api/api.go
**/*_test.go
⚙️ CodeRabbit configuration file
**/*_test.go: Test files. Review for: - Adequate coverage of edge cases and error paths - Proper use of test helpers and assertions - Race condition coverage (tests should use -race flag patterns) - No hardcoded sleep for synchronization (use channels or sync primitives) - Test isolation (no shared mutable state between tests)
Files:
cmd/seednode/api/api_test.go
🧠 Learnings (1)
📚 Learning: 2026-04-21T20:12:22.959Z
Learnt from: phcarneirobc
Repo: klever-io/klever-go PR: 38
File: indexer/eventsProcessor.go:188-211
Timestamp: 2026-04-21T20:12:22.959Z
Learning: In Go structs that are JSON-marshaled, if a field is a `bool` and has the `json:"...,omitempty"` tag, then leaving that field at its zero value (`false`) is functionally equivalent (in the resulting JSON) to explicitly setting `Foundation: false`. Reviewers should not flag struct literals that omit such `bool` fields as an inconsistency; they will serialize identically because `omitempty` suppresses `false` values.
Applied to files:
cmd/seednode/api/api_test.gocmd/seednode/main.gocmd/seednode/api/api.go
🔇 Additional comments (16)
cmd/seednode/api/api_test.go (1)
15-130: LGTM!Also applies to: 170-178
cmd/seednode/api/api.go (8)
21-28: LGTM!
30-36: LGTM!
38-56: LGTM!
58-63: LGTM!
65-87: LGTM!
89-105: LGTM!
107-122: LGTM!Also applies to: 124-141
143-160: LGTM!cmd/seednode/main.go (7)
130-136: LGTM!
194-195: LGTM!
208-228: LGTM!
242-250: LGTM!
258-303: LGTM!
305-324: LGTM!
360-374: LGTM!
- api: gin.Default() -> gin.New() + Recovery in ReleaseMode so /node/metrics scrapes no longer emit a [GIN] access-log line per hit. Skip the network/api ginWriter redirect; routing gin.DefaultErrorWriter to log.Trace would have silenced panic stacks at production log levels, leaving them visible on stderr is better. - api: rewrite peerSnapshot doc to honestly describe what the helper guarantees (connected count == its slice; knownPeers may drift). - api_test: assert w.Code == http.StatusOK in the two metrics tests so a 4xx/5xx cannot pass them silently. Use strings.SplitSeq (Go 1.25). - main: sort.Slice(strings.Compare) -> sort.Strings for consistency with api.go.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/seednode/api/api.go (1)
100-107:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdjust concurrency/mutation concern based on messenger implementations
network/p2p/libp2p/netMessenger.go’sConnectedAddresses()andAddresses()both build and return fresh[]stringslices (viamake(...)/append), sosnapshot()’ssort.Strings(connected)shouldn’t mutate provider-owned memory or race with the live messenger for that implementation.The “copy before sorting” suggestion is only necessary if other
messengerimplementations can return internal/shared slices—current evidence shows the teststubMessengerreturns backing slices directly (cmd/seednode/api/api_test.go), so sorting would mutate the stub during tests, but that’s not a production race.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/seednode/api/api.go` around lines 100 - 107, snapshot() currently sorts the slice returned by s.messenger.ConnectedAddresses(), which can mutate caller-owned slices in tests (stubMessenger) — make a defensive copy of the slice returned by ConnectedAddresses() before calling sort.Strings to avoid mutating provider-owned memory; similarly, ensure any use of s.messenger.Addresses() that gets sorted or mutated is copied first. Locate the snapshot() function and the variables connected/listenAddrs and perform a copy (e.g., via append to a nil slice or make+copy) of the returned slices before any in-place modifications.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@cmd/seednode/api/api.go`:
- Around line 100-107: snapshot() currently sorts the slice returned by
s.messenger.ConnectedAddresses(), which can mutate caller-owned slices in tests
(stubMessenger) — make a defensive copy of the slice returned by
ConnectedAddresses() before calling sort.Strings to avoid mutating
provider-owned memory; similarly, ensure any use of s.messenger.Addresses() that
gets sorted or mutated is copied first. Locate the snapshot() function and the
variables connected/listenAddrs and perform a copy (e.g., via append to a nil
slice or make+copy) of the returned slices before any in-place modifications.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9e838833-b95f-4866-9b5e-6ab816d3f44a
📒 Files selected for processing (3)
cmd/seednode/api/api.gocmd/seednode/api/api_test.gocmd/seednode/main.go
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: setup-and-lint / setup-and-lint
- GitHub Check: Analyze (go)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.go
📄 CodeRabbit inference engine (Custom checks)
**/*.go: Verify that any new or modified concurrent code (goroutines, channels, mutexes, sync primitives) is free of race conditions. Check for: proper lock/unlock pairing, no goroutine leaks, correct channel lifecycle management, and proper context cancellation propagation.
Verify that errors are not silently discarded. Check for: unchecked error returns, error wrapping with context, proper error propagation up the call chain, and no bare panic() calls outside of init() functions.
Files:
cmd/seednode/api/api.gocmd/seednode/main.gocmd/seednode/api/api_test.go
**/*_test.go
⚙️ CodeRabbit configuration file
**/*_test.go: Test files. Review for: - Adequate coverage of edge cases and error paths - Proper use of test helpers and assertions - Race condition coverage (tests should use -race flag patterns) - No hardcoded sleep for synchronization (use channels or sync primitives) - Test isolation (no shared mutable state between tests)
Files:
cmd/seednode/api/api_test.go
🧠 Learnings (1)
📚 Learning: 2026-04-21T20:12:22.959Z
Learnt from: phcarneirobc
Repo: klever-io/klever-go PR: 38
File: indexer/eventsProcessor.go:188-211
Timestamp: 2026-04-21T20:12:22.959Z
Learning: In Go structs that are JSON-marshaled, if a field is a `bool` and has the `json:"...,omitempty"` tag, then leaving that field at its zero value (`false`) is functionally equivalent (in the resulting JSON) to explicitly setting `Foundation: false`. Reviewers should not flag struct literals that omit such `bool` fields as an inconsistency; they will serialize identically because `omitempty` suppresses `false` values.
Applied to files:
cmd/seednode/api/api.gocmd/seednode/main.gocmd/seednode/api/api_test.go
🔇 Additional comments (5)
cmd/seednode/api/api_test.go (3)
143-145: LGTM!
162-164: LGTM!
178-178: LGTM!cmd/seednode/api/api.go (1)
50-54: LGTM!Also applies to: 93-93
cmd/seednode/main.go (1)
252-254: LGTM!
The messenger interface does not document whether ConnectedAddresses returns a fresh slice or a view over internal state, and the test stub returns its backing slice directly. sort.Strings then mutates whatever it gets, so in tests the stub.connected field was being reordered in place and a future second call would have seen a tainted view. - api.go: copy before sort in server.snapshot. - main.go: same in emitPeerStatus. - api_test: regression test TestSnapshot_doesNotMutateMessengerSlices, verified failing before the fix.
|
Addressed the CodeRabbit outside-diff finding in b4b4091: |
nickgs1337
left a comment
There was a problem hiding this comment.
Not a big fan of manually manipulating strings to return the metrics. But, not a deal breaker
Lgtm
Summary
GET /peers,GET /node/status,GET /node/metrics(Prometheus). The latter clears the404that operator tooling has been hitting in the logs.cmd/seednode/apiinto aserverstruct behind exportedStart;/logwebsocket unchanged. Adds httptest coverage for the new handlers.INFO output, before / after
Before — every 5 s, ~190 lines at production peer count:
After — once at startup + 1 line per 30 s:
--log-level '*:DEBUG'still produces the full connected-peers table — no information loss.New HTTP surface
GET /node/status{ "version": "v…", "uptimeSeconds": 64, "connectedPeers": 187, "knownPeers": 218, "listenAddresses": ["/ip4/…", "/ip4/127.0.0.1/…", …] }GET /peers{ "connectedPeers": 187, "knownPeers": 218, "listenAddresses": [...], "connectedAddresses": [...sorted multiaddrs...] }GET /node/metricsBehavior notes
messenger.Bootstrap()(was before) because the new endpoints query the messenger. External scrapers will see a briefconnection refusedduring the bootstrap window instead of a/log-only gin server.uptimeSecondsis "since process start" (captured at the top ofstartNode), not "since API listener bound".pickReportableListenAddressskips/ip4/127.,/ip6/::1/and the Docker default bridge/ip4/172.17.for the log line only; JSON/Prometheus surfaces reportmessenger.Addresses()raw (ground truth for monitoring).Test plan
go build ./cmd/seednode/...go vet ./cmd/seednode/...go test ./cmd/seednode/...— 5 new tests passingseednode statusINFO line per 30 s tick,uptimeSecondsadvances across ticks,--log-level '*:DEBUG'reinstates the verbose table,/peers/node/status/node/metricsall return expected payloads.connected/knownstay at 0/1).References
Blockchain-Critical Impact Assessment
This PR has minimal blockchain-critical impact — it only touches the seednode (peer discovery/observability) and does not modify consensus, transaction processing, state management, or the KVM.
Affected Components
Stability and Safety Considerations
Concurrency and Startup Behaviour
Error Handling and Metrics
Tests and Build
Data Integrity