-
Notifications
You must be signed in to change notification settings - Fork 1k
Description
Summary
This issue tracks the introduction of first‑class node telemetry for Neo N3 via a new Prometheus‑exporting plugin, health endpoints, and accompanying monitoring assets.
Motivation
Node operators currently rely on ad‑hoc log inspection and RPC polling for visibility. A built‑in telemetry surface enables:
- consistent operational monitoring across deployments
- alerting on sync/peer/mempool/resource health
- easy integration with existing Prometheus/Grafana stacks
Scope / PR stack
The work is delivered as four small, stacked PRs targeting the telemetry staging branch:
-
Core hooks – [Telemetry 1/4] Add telemetry hooks to Neo core #4372 (
telemetry-1)- Adds non‑intrusive P2P events in
RemoteNode(MessageSent,ConnectionChanged) for plugins to observe network activity. - Exposes
Plugin.IsStoppedasprotected internalso plugins can self‑disable safely.
- Adds non‑intrusive P2P events in
-
Telemetry plugin & metrics – [Telemetry 2/4] Add Telemetry plugin with metrics collection #4373 (
telemetry-2)- Introduces the
Neo.Plugins.Telemetryplugin with collectors for blockchain, network, mempool, system, and plugin state. - Exposes a Prometheus endpoint (default
http://localhost:9100/metrics). - Uses consistent labels:
node_id,network.
- Introduces the
-
Health endpoints – [Telemetry 3/4] Add health check endpoints for telemetry #4374 (
telemetry-3)- Adds lightweight HTTP
/health,/ready, and/liveendpoints. - Readiness reflects both sync status and peer connectivity.
- Adds lightweight HTTP
-
Docs & monitoring assets – [Telemetry 4/4] Add telemetry documentation and monitoring configs #4375 (
telemetry-4)- Comprehensive README with install/config/metric reference.
- Prometheus scrape snippet and alert rules.
- Grafana dashboard JSON for node operators.
PRs should be merged in order 1 → 4. After that, the telemetry branch can be fast‑forwarded into the target release branch.
Usage
Enable the plugin via Plugins/Telemetry/config.json, then configure Prometheus to scrape:
scrape_configs:
- job_name: neo-node
static_configs:
- targets: ['<host>:9100']Health checks:
GET /healthGET /readyGET /live
Validation
- The plugin builds cleanly at each PR boundary.
- Collectors are guarded so telemetry cannot destabilize the node unless explicitly configured with
ExceptionPolicy = StopNode.
Follow‑ups
- Add unit/integration tests for collectors once a Telemetry test harness is agreed.
- Consider more granular per‑peer metrics and optional OpenTelemetry traces.