feat(cli): support dynamic port assignment for metrics server #3240
base: main
Conversation
Change `init_prometheus_server` to manually bind and serve metrics, allowing port 0 to be used for OS-assigned port allocation. The actual bound address is now returned and logged. This follows the same pattern used by reth, where the recorder is built without the built-in HTTP server and metrics are served manually via hyper.

BREAKING CHANGE: `init_metrics()` and `init_prometheus_server()` are now async and must be called from within a tokio runtime.
Pull request overview
This PR adds support for dynamic port assignment (port 0) for the Prometheus metrics server by replacing the built-in HTTP listener from metrics-exporter-prometheus with a manually managed hyper server. This allows the actual bound port to be discovered and returned to callers, which was previously impossible with the library's API.
Key changes:
- Manual metrics serving via hyper instead of the built-in HTTP listener
- Async initialization to support tokio-based TcpListener binding
- Returns the actual bound SocketAddr to support dynamic port assignment
Reviewed changes
Copilot reviewed 5 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| crates/utilities/cli/src/prometheus.rs | Implements manual metrics serving with hyper, replaces sync init with async, adds comprehensive design documentation |
| crates/utilities/cli/src/error.rs | Adds PrometheusError type for better error handling of binding and recorder initialization |
| crates/utilities/cli/src/flags/metrics.rs | Updates MetricsArgs methods to be async and adds init_metrics_with_addr() to return actual bound address |
| crates/utilities/cli/src/lib.rs | Exports PrometheusError for public API |
| crates/utilities/cli/Cargo.toml | Adds hyper, hyper-util, tokio dependencies for manual metrics serving |
| bin/supervisor/src/cli.rs | Moves metrics initialization inside tokio runtime (async block) |
| bin/node/src/flags/metrics.rs | Updates init_unified_metrics to be async |
| bin/node/src/cli.rs | Moves metrics initialization inside tokio runtime for async commands |
| Cargo.lock | Updates dependency graph with new direct dependencies |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
```toml
hyper = { version = "1.6", features = ["server", "http1"] }
hyper-util = { version = "0.1", features = ["tokio"] }
```
Copilot AI · Jan 9, 2026
Consider adding hyper and hyper-util to the workspace dependencies in the root Cargo.toml. This would ensure consistent versions across the workspace and make it easier to manage updates. Currently, these dependencies are specified with explicit versions only in this crate, while hyper is already transitively used in the project (version 1.8.1). Using workspace dependencies would follow the pattern used for other shared dependencies like tokio, http, and http-body-util.
Suggested change:

```diff
- hyper = { version = "1.6", features = ["server", "http1"] }
- hyper-util = { version = "0.1", features = ["tokio"] }
+ hyper = { workspace = true, features = ["server", "http1"] }
+ hyper-util = { workspace = true, features = ["tokio"] }
```
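For illustration, the workspace side of this suggestion would look something like the following in the root `Cargo.toml` (versions here are assumptions taken from the crate-level manifest, not confirmed against the repository):

```toml
# Root Cargo.toml: declare versions once for the whole workspace.
[workspace.dependencies]
hyper = "1.6"
hyper-util = "0.1"
```

Member crates then opt in with `workspace = true` and their own feature lists, as the suggested change shows.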
```rust
let response = Response::builder()
    .header(CONTENT_TYPE, "text/plain; charset=utf-8")
    .body(Full::new(Bytes::from(metrics)))
    .unwrap();
```
Copilot AI · Jan 9, 2026
The `Response::builder().body().unwrap()` call should handle errors properly instead of using `unwrap()`. While it's unlikely to fail in this specific case (the inputs are valid), `unwrap()` can cause the entire task to panic if something unexpected happens. Consider using `expect()` with a descriptive message or properly handling the error case.
Suggested change:

```diff
-    .unwrap();
+    .expect("failed to build HTTP response for Prometheus metrics");
```
Closes #2987
Codecov Report: ❌ Patch coverage is
☔ View full report in Codecov by Sentry.
Problem

When `KONA_METRICS_PORT=0` was set, the OS would assign an available port, but there was no way to discover which port had actually been assigned. The log message would show `:0` instead of the actual port, making it impossible to connect to the metrics endpoint.

Solution

Replaced the `metrics-exporter-prometheus` built-in HTTP listener with a manually managed server. The implementation:

- binds a `TcpListener` directly (supporting port 0)
- retrieves the actual bound address via `local_addr()`
- builds the recorder with `PrometheusBuilder::build_recorder()`, without the built-in HTTP listener
- serves metrics manually via `hyper`
- returns the actual `SocketAddr` to callers

Design Rationale

The `metrics-exporter-prometheus` crate's `with_http_listener().install()` API doesn't expose the actual bound address: it only accepts a `SocketAddr` and returns `Result<(), BuildError>`, so there is no way to query which port the OS assigned.

We considered two approaches and chose manual serving, which is the same pattern used by reth's metrics infrastructure.

Breaking Changes

- `init_prometheus_server()` is now `async` and takes a `SocketAddr` instead of `(IpAddr, u16)`
- `MetricsArgs::init_metrics()` is now `async`

Call sites in `kona-node` and `kona-supervisor` have been updated to initialize metrics inside their async blocks.

Usage