Estimate timing overhead, allow failing if it is too high #3209
Conversation
Move the linux_perf_event counters (_instructions_retired_counter and _cpu_cycles_retired_counter) from performance_test into the time_measurement singleton object. This simplifies the interface by making counter management an internal concern of time_measurement. Add enable_counters() and disable_counters() methods to time_measurement to control the hardware performance counters, and update start_run() to no longer take counter pointers as parameters.
Move the time_measurement class from the header to perf_tests.cc, making it an internal implementation detail. The extern global is replaced with a file-static instance. Add start_run() and stop_run() protected member functions to performance_test that handle enabling/disabling counters and delegating to the time_measurement singleton. This reduces the public API surface in the header. The only functions still exposed in the header are time_measurement_start_iteration/stop_iteration, which are needed by the public start_measuring_time/stop_measuring_time API.
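Taken together, the refactor might look roughly like the sketch below. This is assembled from the commit messages above, not the actual Seastar code; member and method bodies are omitted and everything not named in the commits is illustrative.

```c++
// Sketch of the refactored shape described above; names come from the
// commit messages, everything else is illustrative.
class time_measurement {
    // Counters moved in from performance_test: now an internal concern.
    linux_perf_event _instructions_retired_counter;
    linux_perf_event _cpu_cycles_retired_counter;
public:
    void enable_counters();   // turn the hardware perf counters on
    void disable_counters();  // turn them off again
    void start_run();         // no longer takes counter pointers
    // ...
};

// File-static instance in perf_tests.cc, replacing the old extern global.
static time_measurement measure_time;

class performance_test {
protected:
    void start_run();  // enables counters, delegates to measure_time
    void stop_run();   // disables counters, delegates to measure_time
    // ...
};
```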
These attributes don't make sense now that the methods are in the .cc file rather than the header.
Add tracking of start_measuring_time()/stop_measuring_time() call overhead:
- Calibrate start/stop overhead at startup by timing 1000 iterations
- Track start_stop_count in time_measurement to count timing calls per run
- Calculate overhead percentage: (start_stop_count * overhead_ns) / runtime
- Add "overhead" column to text and JSON output
- Print calibrated overhead in configuration header
- Warn to stderr when overhead exceeds threshold (default 10%)
- Add --overhead-threshold option to configure warning threshold
- Add --fail-on-high-overhead option to fail test run on high overhead

Also adds a test case that deliberately has high overhead to verify the feature works correctly.
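A minimal sketch of the calibration step described in the first bullet, assuming a hypothetical helper name and assuming the public start/stop API is what gets timed (the real implementation lives in tests/perf/perf_tests.cc and may differ):

```c++
#include <chrono>
#include <seastar/testing/perf_tests.hh>

// Hypothetical helper: time 1000 empty start/stop pairs at startup and
// derive the average per-pair cost in nanoseconds.
static double calibrate_start_stop_overhead_ns() {
    constexpr int iterations = 1000;
    auto begin = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        perf_tests::start_measuring_time();
        perf_tests::stop_measuring_time();
    }
    auto end = std::chrono::steady_clock::now();
    double total_ns = std::chrono::duration<double, std::nano>(end - begin).count();
    return total_ns / iterations; // average cost of one start/stop pair
}
```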
Force-pushed from 96b07c4 to 96ce1ce.
Pull request overview
This PR adds measurement overhead estimation to the Redpanda microbenchmark framework to identify tests where timing instrumentation consumes a significant portion of measured time, making results unreliable.
Changes:
- Refactored `time_measurement` class from header to implementation file, removing unnecessary exposure of performance counters
- Added overhead calibration at startup and tracking of `start_measuring_time()`/`stop_measuring_time()` call counts
- Introduced `overhead` column in results, plus `--overhead-threshold` and `--fail-on-high-overhead` command-line options
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| tests/perf/perf_tests_perf.cc | Adds test case to verify overhead measurement functionality |
| tests/perf/perf_tests.cc | Implements overhead calibration, tracking, and reporting with configurable thresholds |
| tests/perf/perf-tests.md | Documents overhead column, warnings, and new command-line options |
| include/seastar/testing/perf_tests.hh | Moves time_measurement class to implementation and updates API signatures |
```c++
// Calculate overhead percentage from start/stop timing calls
double overhead_ns = rr.start_stop_count * measure_time.start_stop_overhead_ns();
double overhead_pct = ns > 0 ? (overhead_ns / ns) * 100.0 : 0.0;
r.overhead.add(overhead_pct, 1.0); // already per-run, not per-iteration
```
Copilot AI commented on Jan 20, 2026:
The second argument 1.0 to add() is unclear without context. Consider adding a comment explaining what this weight parameter represents, or use a named constant like WEIGHT_PER_RUN to improve readability.
Suggested change:

```diff
- r.overhead.add(overhead_pct, 1.0); // already per-run, not per-iteration
+ constexpr double overhead_weight_per_run = 1.0; // already per-run, not per-iteration
+ r.overhead.add(overhead_pct, overhead_weight_per_run);
```
```c++
// Check if overhead exceeds threshold and warn
double median_overhead = r.overhead.stats().med;
if (median_overhead > conf.overhead_threshold) {
    fmt::print("WARNING: test '{}' has high measurement overhead: {:.1f}% (threshold: {:.1f}%)\n",
```
Copilot AI commented on Jan 20, 2026:
The warning is printed to stdout, which may be missed in automated test runs or get lost in output. Consider using a logging framework (e.g., perf_logger.warn()) or printing to stderr to ensure visibility of this important diagnostic message.
Suggested change:

```diff
- fmt::print("WARNING: test '{}' has high measurement overhead: {:.1f}% (threshold: {:.1f}%)\n",
+ fmt::print(stderr,
+     "WARNING: test '{}' has high measurement overhead: {:.1f}% (threshold: {:.1f}%)\n",
```
The rest of the test output goes to stdout; we want this to go to the same place. stderr is already heavily used by seastar loggers, so it is common to run benchmarks with 2>/dev/null or similar redirection.
```diff
 #### Measurement overhead
-The cost of starting and stopping the timers is substantial, about 0.5μs for a start/stop pair, due to the overhead of reading the performance counters in the kernel. In the case you use manual time measurement with start/stop_measuring_time you should ensure that the timed region is substantially longer than time in order to reduce the error imposed by this overhead: for example by using a loop inside the timed region and returning the number of iterations from the test method.
+The cost of starting and stopping the timers is substantial, about 1μs for a start/stop pair, due to the overhead of reading the performance counters in the kernel. When using manual time measurement with `start_measuring_time()`/`stop_measuring_time()`, you should ensure that the timed region is substantially longer than this overhead to reduce measurement error. A common approach is to use a loop inside the timed region and return the number of iterations from the test method.
```
Copilot AI commented on Jan 20, 2026:
The documentation states the overhead is 'about 1μs' but the PR description mentions it was previously documented as '0.5μs'. Since the actual overhead is now calibrated at runtime and displayed in the output, consider rephrasing to indicate this is approximate and varies by system, e.g., 'typically around 1μs' or 'approximately 1μs on most systems'.
is "about" not good enough?
A recent "audit" of Redpanda microbenchmarks showed that many were returning essentially useless results as the overhead of reading the performance counters was much higher than the signal they are trying to measure: in several cases more than 99% of the test was spent just reading the counters.
This is a problem that can't be fought with documentation alone (though that helps), so instead we estimate the overhead and warn when it exceeds a threshold (10% by default; in practice you should aim for under 1%).
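To put illustrative numbers on it: at the ~1μs per-pair cost mentioned in the docs, a run measured at 1ms that makes 500 start/stop calls accumulates roughly 0.5ms of pure timing overhead, i.e. about 50%, five times the default threshold.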
The first commits refactor the somewhat odd way we were passing the perf counters around and including them in the header for no good reason.
Updates the docs.
Example output after this change, showing 99.2% overhead in this artificial test: