feat(cli): support dynamic port assignment for metrics server #3240

teddyknox · 2026-01-09T21:28:02Z

Problem

When KONA_METRICS_PORT=0 was set, the OS would assign an available port, but there was no way to discover what port was actually assigned. The log message would show :0 instead of the actual port, making it impossible to connect to the metrics endpoint.

Solution

Replaced the metrics-exporter-prometheus built-in HTTP listener with a manually managed server. The implementation:

Binds a TcpListener directly (supporting port 0)
Retrieves the actual bound address via local_addr()
Uses PrometheusBuilder::build_recorder() without the HTTP listener
Serves metrics manually via hyper
Returns the actual SocketAddr to callers

Design Rationale

The metrics-exporter-prometheus crate's with_http_listener().install() API doesn't expose the actual bound address—it only accepts a SocketAddr and returns Result<(), BuildError>. There's no way to query what port the OS assigned.

We considered two approaches:

Pre-bind and hope: Bind a listener to get a port, drop it, pass that port to the builder. This has a race condition where another process could grab the port.
Manual serving: Build the recorder without HTTP, bind our own listener, serve manually. No race condition.

We chose manual serving, which is the same pattern used by reth's metrics infrastructure. This approach:

Eliminates the race condition entirely (same listener is used throughout)
Gives full control over the HTTP server
Allows returning the actual bound address

Breaking Changes

init_prometheus_server() is now async and takes SocketAddr instead of (IpAddr, u16)
MetricsArgs::init_metrics() is now async
Metrics initialization must now happen inside a tokio runtime

Call sites in kona-node and kona-supervisor have been updated to initialize metrics inside their async blocks.

Usage

// Get the actual address (useful when port is 0)
if let Some(addr) = args.init_metrics_with_addr().await? {
    println!("Metrics available at http://{}", addr);
}

// Or just initialize without needing the address
args.init_metrics().await?;

Change `init_prometheus_server` to manually bind and serve metrics, allowing port 0 to be used for OS-assigned port allocation. The actual bound address is now returned and logged. This follows the same pattern used by reth, where the recorder is built without the built-in HTTP server and metrics are served manually via hyper. BREAKING CHANGE: `init_metrics()` and `init_prometheus_server()` are now async and must be called from within a tokio runtime.

wiz-b4c72f16a4 · 2026-01-09T21:28:29Z

Wiz Scan Summary

Scanner	Findings
Vulnerabilities	4
Sensitive Data	-
Secrets	-
IaC Misconfigurations	-
SAST Findings	-
Software Supply Chain Findings	-

Total	4

View scan details in Wiz

To detect these findings earlier in the dev lifecycle, try using Wiz Code VS Code Extension.

Copilot

Pull request overview

This PR adds support for dynamic port assignment (port 0) for the Prometheus metrics server by replacing the built-in HTTP listener from metrics-exporter-prometheus with a manually managed hyper server. This allows the actual bound port to be discovered and returned to callers, which was previously impossible with the library's API.

Key changes:

Manual metrics serving via hyper instead of the built-in HTTP listener
Async initialization to support tokio-based TcpListener binding
Returns the actual bound SocketAddr to support dynamic port assignment

Reviewed changes

Copilot reviewed 5 out of 9 changed files in this pull request and generated 2 comments.

Show a summary per file

File	Description
crates/utilities/cli/src/prometheus.rs	Implements manual metrics serving with hyper, replaces sync init with async, adds comprehensive design documentation
crates/utilities/cli/src/error.rs	Adds PrometheusError type for better error handling of binding and recorder initialization
crates/utilities/cli/src/flags/metrics.rs	Updates MetricsArgs methods to be async and adds init_metrics_with_addr() to return actual bound address
crates/utilities/cli/src/lib.rs	Exports PrometheusError for public API
crates/utilities/cli/Cargo.toml	Adds hyper, hyper-util, tokio dependencies for manual metrics serving
bin/supervisor/src/cli.rs	Moves metrics initialization inside tokio runtime (async block)
bin/node/src/flags/metrics.rs	Updates init_unified_metrics to be async
bin/node/src/cli.rs	Moves metrics initialization inside tokio runtime for async commands
Cargo.lock	Updates dependency graph with new direct dependencies

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-01-09T21:33:11Z

crates/utilities/cli/Cargo.toml

+hyper = { version = "1.6", features = ["server", "http1"] }
+hyper-util = { version = "0.1", features = ["tokio"] }


Consider adding hyper and hyper-util to the workspace dependencies in the root Cargo.toml. This would ensure consistent versions across the workspace and make it easier to manage updates. Currently, these dependencies are specified with explicit versions only in this crate, while hyper is already transitively used in the project (version 1.8.1). Using workspace dependencies would follow the pattern used for other shared dependencies like tokio, http, and http-body-util.

Suggested change

hyper = { version = "1.6", features = ["server", "http1"] }

hyper-util = { version = "0.1", features = ["tokio"] }

hyper = { workspace = true, features = ["server", "http1"] }

hyper-util = { workspace = true, features = ["tokio"] }

Copilot · 2026-01-09T21:33:11Z

crates/utilities/cli/src/prometheus.rs

+                    let response = Response::builder()
+                        .header(CONTENT_TYPE, "text/plain; charset=utf-8")
+                        .body(Full::new(Bytes::from(metrics)))
+                        .unwrap();


The Response::builder().body().unwrap() call should handle errors properly instead of using unwrap(). While it's unlikely to fail in this specific case (the inputs are valid), using unwrap() can cause the entire task to panic if something unexpected happens. Consider using expect() with a descriptive message or properly handling the error case.

Suggested change

.unwrap();

.expect("failed to build HTTP response for Prometheus metrics");

op-will · 2026-01-09T21:35:43Z

Closes #2987

codecov · 2026-01-09T21:54:41Z

Codecov Report

❌ Patch coverage is 0% with 46 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.9%. Comparing base (5a4457f) to head (d5f92da).
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
crates/utilities/cli/src/prometheus.rs	0.0%	37 Missing ⚠️
crates/utilities/cli/src/flags/metrics.rs	0.0%	7 Missing ⚠️
bin/supervisor/src/cli.rs	0.0%	2 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (5a4457f) and HEAD (d5f92da). Click for more details.

HEAD has 20 uploads less than BASE

Flag BASE (5a4457f) HEAD (d5f92da)

proof 7 0

unit 2 1

e2e 12 0

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…s-server

Copilot AI review requested due to automatic review settings January 9, 2026 21:28

teddyknox requested review from BioMark3r, einar-oplabs, emhane, op-will and theochap as code owners January 9, 2026 21:28

Copilot started reviewing on behalf of teddyknox January 9, 2026 21:28 View session

Copilot AI reviewed Jan 9, 2026

View reviewed changes

Merge branch 'main' into teddyknox/add-dynamic-port-support-to-metric…

d5f92da

…s-server

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(cli): support dynamic port assignment for metrics server #3240

feat(cli): support dynamic port assignment for metrics server #3240

Uh oh!

teddyknox commented Jan 9, 2026

Uh oh!

wiz-b4c72f16a4 bot commented Jan 9, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Jan 9, 2026

Uh oh!

Copilot AI Jan 9, 2026

Uh oh!

op-will commented Jan 9, 2026

Uh oh!

codecov bot commented Jan 9, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		hyper = { version = "1.6", features = ["server", "http1"] }
		hyper-util = { version = "0.1", features = ["tokio"] }

	.unwrap();
	.expect("failed to build HTTP response for Prometheus metrics");

feat(cli): support dynamic port assignment for metrics server #3240

Are you sure you want to change the base?

feat(cli): support dynamic port assignment for metrics server #3240

Uh oh!

Conversation

teddyknox commented Jan 9, 2026

Problem

Solution

Design Rationale

Breaking Changes

Usage

Uh oh!

wiz-b4c72f16a4 bot commented Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Wiz Scan Summary

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Jan 9, 2026

Choose a reason for hiding this comment

Uh oh!

op-will commented Jan 9, 2026

Uh oh!

codecov bot commented Jan 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

wiz-b4c72f16a4 bot commented Jan 9, 2026 •

edited

Loading

codecov bot commented Jan 9, 2026 •

edited

Loading