Fix/streaming timestamps by mcurrier2 · Pull Request #105 · redhat-performance/rusty-comms

mcurrier2 · 2026-03-11T13:52:40Z

Description

Brief description of changes

Type of Change

[x] Bug fix
[ ] New feature
[ ] Breaking change
[ ] Documentation update

Testing

[x] Tests pass locally
[x] Added tests for new functionality
[x] Updated documentation

Checklist

[x] Code follows style guidelines
[x] Self-review completed
[ ] Comments added for complex code
[x] Documentation updated
[ ] No breaking changes (or marked as breaking)

PR: Fix Streaming Output Timestamps Across All Code Paths

Summary

Streaming output timestamps (timestamp_ns in JSON/CSV per-message
records) were inaccurate across multiple code paths. Instead of
recording when each message was actually sent, timestamps were captured
at record-creation time — either post-test during batch iteration or
at file-read time. This caused all timestamps within a test run to
cluster into the same second, making time-series analysis of streaming
data meaningless.

Because this tool's primary purpose is precision measurement, incorrect
timestamps in the streaming output undermine the validity of per-message
latency records for any downstream analysis or visualization.

Branch: fix/streaming-timestamps
Base: main
Files changed: 11 (496 insertions, 60 deletions)

Root Cause

Commit e0ff7fa (Oct 8, 2025, PR #82) refactored the benchmark runner
to collect latencies inside spawned async futures, then batch-create
MessageLatencyRecord objects after the futures completed. This moved
the SystemTime::now() call from inside the hot loop (where it
reflected each message's send time) to the post-test iteration (where
it reflected the moment the record was created — all within the same
second).

The initial fix on this branch (68afe07) changed the
MessageLatencyRecord API to accept send_timestamp_ns as a
parameter, but only the blocking round-trip path was actually capturing
timestamps at send time. Three other code path families remained broken.

Problems and Fixes

1. Async round-trip timestamps captured post-test

Problem: The client future returned Vec<Duration>. After the
future completed, the post-test loop called current_timestamp_ns()
at record-creation time. All records received the same timestamp.

Fix: Changed the future to capture current_timestamp_ns() before
each send() and return Vec<(Duration, u64)>. The post-test loop
uses the captured timestamp.

// Inside the client future, before each send:
let wall_ts = MessageLatencyRecord::current_timestamp_ns();
let send_time = Instant::now();
client_transport.send(&message).await?;
// ...
latencies.push((send_time.elapsed(), wall_ts));

// Post-test loop:
for (i, (latency, wall_ts)) in latencies.iter().enumerate() {
    let record = MessageLatencyRecord::new(
        i as u64, mechanism, msg_size,
        LatencyType::RoundTrip, *latency, *wall_ts,
    );
}

Applies to: Both duration-based and count-based async round-trip
loops in src/benchmark.rs.
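
The difference between the two shapes can be reproduced outside the benchmark. The following is a minimal synchronous sketch, not the project's code: `fake_round_trip`, `stamped_post_test`, `stamped_at_send`, and `span_ns` are hypothetical names, and a short sleep stands in for the real send and reply.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};

/// Wall-clock time in nanoseconds since the Unix epoch.
fn current_timestamp_ns() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default()
        .as_nanos() as u64
}

/// Hypothetical stand-in for a send plus the wait for its reply.
fn fake_round_trip() {
    sleep(Duration::from_millis(2));
}

/// Buggy shape: only durations leave the hot loop, so every wall-clock
/// stamp is taken afterwards, while the records are being built.
fn stamped_post_test(count: usize) -> Vec<(Duration, u64)> {
    let mut latencies = Vec::with_capacity(count);
    for _ in 0..count {
        let start = Instant::now();
        fake_round_trip();
        latencies.push(start.elapsed());
    }
    latencies
        .into_iter()
        .map(|latency| (latency, current_timestamp_ns()))
        .collect()
}

/// Fixed shape: the wall-clock stamp is captured before each send and
/// travels with the measured duration.
fn stamped_at_send(count: usize) -> Vec<(Duration, u64)> {
    let mut out = Vec::with_capacity(count);
    for _ in 0..count {
        let wall_ts = current_timestamp_ns();
        let start = Instant::now();
        fake_round_trip();
        out.push((start.elapsed(), wall_ts));
    }
    out
}

/// Difference between the last and the first wall-clock stamp.
fn span_ns(records: &[(Duration, u64)]) -> u64 {
    match (records.first(), records.last()) {
        (Some(first), Some(last)) => last.1.saturating_sub(first.1),
        _ => 0,
    }
}
```

With five simulated messages, the stamps from `stamped_at_send` should span several milliseconds, while those from `stamped_post_test` collapse into the time it takes to build the records.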

2. Async combined timestamps captured post-test

Problem: Same pattern as round-trip — the combined test (one-way +
round-trip in a single run) collected latency durations inside the
spawned future and created records post-test with stale timestamps.

Fix: Changed one_way_latencies to Vec<(Duration, u64)> to
carry the wall-clock timestamp alongside each measurement. The
timestamp is captured before each send() inside the future.

Applies to: Both duration-based and count-based async combined
loops in src/benchmark.rs.

3. One-way timestamps from file-read time instead of send time

Problem: One-way tests use a server process that measures receive
latency and writes results to a temporary file. The original format
was one latency_ns value per line. The client read this file
post-test and called current_timestamp_ns() at file-read time —
all records received the same timestamp.

Fix (server side): Both async and blocking server loops now write
wall_send_ns,latency_ns per line. The wall-clock send time is
computed as SystemTime::now() - latency_ns, approximating when the
message entered the IPC channel.

// Server, after measuring latency:
let wall_now_ns = SystemTime::now()
    .duration_since(UNIX_EPOCH)
    .unwrap_or_default()
    .as_nanos() as u64;
let wall_send_ns = wall_now_ns.saturating_sub(latency_ns);
writeln!(file, "{},{}", wall_send_ns, latency_ns).ok();

Fix (client side): A new parse_latency_file_line() function
parses the two-field format. Both async and blocking file readers
use the parsed wall_send_ns as send_timestamp_ns.

Applies to: src/main.rs (both server loops), src/benchmark.rs
(async reader), src/benchmark_blocking.rs (blocking reader).
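
A parser for the two-field line format might look roughly like the sketch below. It follows the `splitn(2, ',')` approach quoted later in this thread and includes the `.trim()` added during review, but it returns `Result<_, String>` instead of `anyhow::Result` so that it compiles without external crates. The error messages are illustrative assumptions.

```rust
/// Parse one `wall_send_ns,latency_ns` line into a pair of `u64` values.
/// Sketch only: the project's function returns `anyhow::Result`.
fn parse_latency_file_line(line: &str) -> Result<(u64, u64), String> {
    // Tolerate trailing "\r\n" or stray whitespace.
    let line = line.trim();

    // Split into at most two pieces, so "1,2,3" yields "1" and "2,3"
    // and the second numeric parse rejects it.
    let mut parts = line.splitn(2, ',');
    let wall = parts.next().unwrap_or("");
    let latency = parts
        .next()
        .ok_or_else(|| format!("expected two comma-separated fields, got {line:?}"))?;

    let wall_send_ns = wall
        .parse::<u64>()
        .map_err(|e| format!("invalid wall_send_ns {wall:?}: {e}"))?;
    let latency_ns = latency
        .parse::<u64>()
        .map_err(|e| format!("invalid latency_ns {latency:?}: {e}"))?;

    Ok((wall_send_ns, latency_ns))
}
```

Limiting the split to two pieces keeps the rejection of extra fields inside the numeric parse, so no separate field-count check is needed.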

Known Limitations

Wall-clock / monotonic clock mixing in one-way path

For one-way tests, the server computes the send timestamp by
subtracting the measured monotonic latency from its current wall-clock
time. This mixes two clock domains:

latency_ns   = monotonic_receive - monotonic_send   (monotonic_send from message timestamp)
wall_send_ns = SystemTime::now() - latency_ns

If NTP adjusts the system clock between message send and receive, the
computed wall_send_ns will be slightly off. This is the best
approximation available without clock synchronization between the
client and server processes. The error is bounded by the magnitude
of any NTP adjustment during the test (typically microseconds).
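
The effect of a clock step on the back-computed send time can be worked through with plain arithmetic. This is an illustrative model under stated assumptions, not code from the PR: `approx_wall_send_ns` mirrors the server's `saturating_sub`, while `estimate_error_ns` and its `step_ns` parameter are hypothetical and simply model a wall clock that jumps while the message is in flight.

```rust
/// Back-compute the wall-clock send time from the receive-side wall clock
/// and the monotonic latency, as the one-way server path does.
fn approx_wall_send_ns(wall_now_ns: u64, latency_ns: u64) -> u64 {
    wall_now_ns.saturating_sub(latency_ns)
}

/// Hypothetical model: the wall clock is stepped by `step_ns` between send
/// and receive. Returns (estimated send time - true send time).
/// Assumes the stepped wall clock stays non-negative.
fn estimate_error_ns(true_send_ns: u64, latency_ns: u64, step_ns: i128) -> i128 {
    // Wall clock at receive = true send time + elapsed time + clock step.
    let wall_now = true_send_ns as i128 + latency_ns as i128 + step_ns;
    let estimate = approx_wall_send_ns(wall_now as u64, latency_ns);
    estimate as i128 - true_send_ns as i128
}
```

In this model the error equals the step itself: a +5 µs adjustment in flight shifts the estimate by +5 µs, and with no adjustment the estimate is exact.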

Timestamp capture ordering

In the round-trip and combined futures, current_timestamp_ns() is
captured in the statement immediately before Instant::now(). The
wall-clock timestamp therefore slightly predates the monotonic
measurement start. The gap is the cost of one clock read, typically
tens of nanoseconds, which is orders of magnitude below the
IPC latencies being measured.
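
The size of this gap can be estimated on a given machine by bracketing one wall-clock read with two monotonic reads. The helper names below are hypothetical and the result is platform-dependent, so treat this as a rough measurement sketch rather than a benchmark.

```rust
use std::time::{Duration, Instant, SystemTime};

/// Upper bound on the capture gap: the time between two monotonic reads
/// that bracket a single wall-clock read.
fn capture_gap_upper_bound() -> Duration {
    let before = Instant::now();
    let wall = SystemTime::now();
    let after = Instant::now();
    // Keep the wall-clock read observable to the optimizer.
    std::hint::black_box(wall);
    after.duration_since(before)
}

/// Smallest gap seen over `samples` attempts, which filters out
/// scheduler noise from individual measurements.
fn min_capture_gap(samples: usize) -> Duration {
    (0..samples)
        .map(|_| capture_gap_upper_bound())
        .min()
        .unwrap_or_default()
}
```

Taking the minimum over many samples gives a figure closer to the raw cost of the clock read than any single measurement would.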

Tests Added

Unit tests for `parse_latency_file_line` (7 tests)

Test	Coverage
`test_parse_latency_file_line_valid`	Happy path: `"170...,42000"`
`test_parse_latency_file_line_zeros`	Edge case: `"0,0"`
`test_parse_latency_file_line_missing_comma`	Error: single value
`test_parse_latency_file_line_empty`	Error: empty string
`test_parse_latency_file_line_non_numeric_first`	Error: `"abc,789"`
`test_parse_latency_file_line_non_numeric_second`	Error: `"123,xyz"`
`test_parse_latency_file_line_extra_commas`	Error: `"1,2,3"`

Enhanced end-to-end streaming tests (2 tests)

Both test_one_way_streaming_captures_send_timestamp and
test_round_trip_streaming_captures_send_timestamp now validate:

- All timestamp_ns values fall within the test execution window
  (before_ns <= ts <= after_ns)
- Timestamps are not all identical (which was the original bug)
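
The two checks can be expressed as one small helper. This is a sketch of the validation idea only: `validate_timestamps` is a hypothetical name, and the actual tests may structure their assertions differently.

```rust
/// Check that every timestamp lies in [before_ns, after_ns] and that the
/// timestamps are not all the same value.
fn validate_timestamps(timestamps: &[u64], before_ns: u64, after_ns: u64) -> Result<(), String> {
    if timestamps.is_empty() {
        return Err("no timestamps recorded".to_string());
    }

    // Window check: each stamp must fall inside the test execution window.
    if let Some(ts) = timestamps
        .iter()
        .find(|ts| **ts < before_ns || **ts > after_ns)
    {
        return Err(format!("timestamp {ts} outside [{before_ns}, {after_ns}]"));
    }

    // Regression check: identical stamps are the signature of the original bug.
    let first = timestamps[0];
    if timestamps.len() > 1 && timestamps.iter().all(|ts| *ts == first) {
        return Err("all timestamps are identical".to_string());
    }

    Ok(())
}
```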

Pre-existing timestamp API tests (3 tests in `results.rs`)

test_new_uses_provided_send_timestamp
test_new_combined_uses_provided_send_timestamp
test_current_timestamp_ns_returns_recent_value

Documentation Added

README.md: Streaming Output Columns

Added a new "Streaming Output Columns" section documenting all six
per-message streaming columns with types, descriptions, and nullable
semantics:

Column	Type	Description
`timestamp_ns`	`u64`	Wall-clock time message was sent
`message_id`	`u64`	Zero-based message identifier
`mechanism`	`string`	IPC mechanism name
`message_size`	`u64`	Payload size in bytes
`one_way_latency_ns`	`u64`/`null`	One-way latency
`round_trip_latency_ns`	`u64`/`null`	Round-trip latency

Includes a note on timestamp_ns accuracy for one-way tests
(wall-clock / monotonic clock mixing).
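
As an illustration of the column semantics, the sketch below models one streaming record and renders it as a JSON line by hand. The struct name, the field order, and the `to_json_line` helper are assumptions made for this example; the project's own serialization may differ.

```rust
/// Hypothetical model of one per-message streaming record, with the
/// six columns from the table above.
struct StreamingRecord {
    timestamp_ns: u64,
    message_id: u64,
    mechanism: String,
    message_size: u64,
    one_way_latency_ns: Option<u64>,
    round_trip_latency_ns: Option<u64>,
}

/// Render a nullable column: the number if present, otherwise `null`.
fn json_opt(value: Option<u64>) -> String {
    value.map_or_else(|| "null".to_string(), |n| n.to_string())
}

impl StreamingRecord {
    /// Hand-rolled JSON line. Assumes `mechanism` contains no characters
    /// that need JSON escaping.
    fn to_json_line(&self) -> String {
        format!(
            r#"{{"timestamp_ns":{},"message_id":{},"mechanism":"{}","message_size":{},"one_way_latency_ns":{},"round_trip_latency_ns":{}}}"#,
            self.timestamp_ns,
            self.message_id,
            self.mechanism,
            self.message_size,
            json_opt(self.one_way_latency_ns),
            json_opt(self.round_trip_latency_ns),
        )
    }
}
```

A one-way record would carry `Some(..)` in `one_way_latency_ns` and `None` in `round_trip_latency_ns`, which renders as `null`.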

Validation

Unit and Integration Tests

$ cargo test
test result: ok. 265 passed; 0 failed; 1 ignored
(+ 42 doc tests, 27 integration tests — 334 total)

$ cargo clippy --all-targets -- -D warnings
# zero warnings

$ cargo fmt --check
# clean

$ scripts/msrv-check.sh
[MSRV] Rust 1.70 build/tests passed.

Functional Verification

Ran 3-second duration benchmarks across all mechanisms with streaming
output enabled. Timestamps were validated to span the full test window:

Mechanism	Mode	First Timestamp	Last Timestamp	Delta (s)	Messages
UDS	Round-Trip	13:13:48.101628	13:13:51.101572	3.000	213,424
TCP	Round-Trip	13:13:55.154037	13:13:58.153894	3.000	132,423
SHM	One-Way	13:17:32.079542	13:17:35.079503	3.000	1,317,468
PMQ	Round-Trip	13:14:15.407160	13:14:18.362355	2.955	42,687

All mechanisms show timestamps distributed across the full test
duration. PMQ delta is 45ms short due to backpressure from shallow
system queue depth (typically 10 messages).

Before/After: Timestamp Distribution

Before fix (main branch): All timestamp_ns values in streaming
output were within the same second, regardless of test duration. A
3-second test with 200,000+ messages would show all timestamps
clustering around a single epoch second.

After fix: Timestamps span the full test duration. Each message's
timestamp_ns reflects the approximate wall-clock time it was sent,
enabling meaningful time-series analysis of per-message latency data.
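
A quick way to tell the two distributions apart is to compute the span of the timestamps and count how many distinct epoch seconds they cover. The helpers below are a hypothetical sketch for analyzing streaming output and are not part of the tool.

```rust
const NS_PER_SEC: u64 = 1_000_000_000;

/// Span between the earliest and latest timestamp, in seconds.
fn span_secs(timestamps: &[u64]) -> f64 {
    match (timestamps.iter().min(), timestamps.iter().max()) {
        (Some(lo), Some(hi)) => (hi - lo) as f64 / NS_PER_SEC as f64,
        _ => 0.0,
    }
}

/// Number of distinct whole seconds covered by the timestamps.
/// A multi-second run that reports 1 here shows the original symptom.
fn distinct_epoch_seconds(timestamps: &[u64]) -> usize {
    let mut secs: Vec<u64> = timestamps.iter().map(|ts| ts / NS_PER_SEC).collect();
    secs.sort_unstable();
    secs.dedup();
    secs.len()
}
```

Using the UDS row above as an example (seconds within the minute, 48.101628 and 51.101572), the span works out to 2.999944 s, which the table rounds to 3.000.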

Files Changed

File	Change
`src/benchmark.rs`	Capture `wall_ts` inside async round-trip and combined futures before each `send()`; update one-way file reader to parse new `wall_send_ns,latency_ns` format; add `parse_latency_file_line()` with 7 unit tests; enhance 2 end-to-end streaming tests with timestamp validation
`src/benchmark_blocking.rs`	Update blocking one-way file reader to parse new format using `parse_latency_file_line()`
`src/main.rs`	Both async and blocking server loops write `wall_send_ns,latency_ns` per line (was `latency_ns` only)
`README.md`	Add "Streaming Output Columns" section with column definitions and accuracy note
`src/results.rs`	`MessageLatencyRecord::new()` and `new_combined()` accept `send_timestamp_ns` parameter; add `current_timestamp_ns()` helper; 3 API tests
`src/ipc/tcp_socket.rs`	Refactor `is_some()/unwrap()` to `if let` (clippy fix)
`src/ipc/unix_domain_socket.rs`	Same clippy fix
`Cargo.toml`	Pin uuid to `<1.21` for MSRV compatibility
`.cargo/audit.toml`	Ignore known MSRV-pinned dependency advisories
`CONFIG.md`	Document sequential test execution
`utils/dashboard/README.md`	Note sequential test execution

Risk Assessment

- Low risk. The core latency measurement logic (monotonic clock,
  get_monotonic_time_ns()) is untouched. Only the wall-clock
  metadata timestamp in streaming records is changed.
- Backward compatible. No CLI changes. Streaming output JSON/CSV
  schema is unchanged (same columns, same types). The internal
  server-to-client latency file format changed from latency_ns to
  wall_send_ns,latency_ns, but this file is ephemeral (created and
  deleted within a single benchmark run) and never exposed to users.
- All 334 tests pass with zero clippy warnings.
- MSRV 1.70 verified via containerized pre-commit check.

dustinblack

I closed #92 because fresh testing showed the problem was no longer reproducible. Issue #106 was then created the following day as the basis for this PR, but it contains no evidence of an actual problem — no reproduction steps, no sample output showing incorrect timestamps, and no before/after data.

I'd suggest closing this PR and #106 unless concrete evidence can be provided in #106 demonstrating that the specific code paths being fixed here actually produce incorrect timestamps on the current main branch — similar to what I provided in #92.

mcurrier2 · 2026-03-24T16:23:59Z

These are valid issues. I added evidence to issue #106 showing the problems, with before/after scenarios.

dustinblack

Thanks for adding the evidence to #106 — the before/after data clearly demonstrates the problem across round-trip, combined, and one-way code paths. The bugs are real and the fixes are technically sound. Withdrawing my earlier suggestion to close.

Technical assessment of the changes:

The fixes are correct. Moving SystemTime::now() capture from record-creation time to send time (via the send_timestamp_ns parameter on new()/new_combined()) is the right approach. The one-way server-side approximation (wall_now - latency) mixes clock domains but is clearly documented and is the best available without clock sync. The parse_latency_file_line() parser is clean, and the test coverage is thorough.

Before this can be approved, it needs:

1. Rebase on main. This branch is based on e826b20, missing PRs #103 and #104. The clippy refactors and .cargo/audit.toml are already on main and will drop out. CI failures should resolve.
2. Coordinate file format with PR #109. This PR changes the server latency file from latency_ns to wall_send_ns,latency_ns. PR #109's write_latency_buffer() writes the old single-field format. Whichever merges second needs to match. Since #109 is closer to approval, consider whether this PR should rebase on top of #109 or vice versa.
3. Remove documentation that overlaps with PR #108. The "Streaming Output Columns" and "Test Execution Order" README sections appear in both PRs. Whichever PR owns those sections, the other should drop them to avoid merge conflicts. Since those sections are most relevant here, I'd suggest keeping them in this PR and removing them from #108.

…Hangs (#108)

* fix: Use fixed 64KB buffer for standalone SHM to enable streaming

Previously, standalone SHM mode calculated buffer size to fit ALL messages (e.g., 6.4MB for 50k messages). This caused the writer to dump everything instantly while the reader slowly drained using pthread_cond_timedwait with 500us polling timeouts, leading to huge accumulated latencies (~489ms for 64B).

The fix uses a fixed 64KB buffer (or 2x message size if larger), matching container behavior. This enables proper streaming where the writer blocks when the buffer is full.

Before: 489ms mean latency for 64B/50k messages
After: 1.95ms mean latency for 64B/10k messages (blocking)
       15.85ms mean latency for 64B/10k messages (async)

Also updates test_transport_config_buffer_size_logic to expect the new fixed-buffer behavior for SHM while keeping the message-count sizing for TCP/UDS.

Cherry-picked from container-to-container-ipc branch (3b49877).

AI-assisted-by: Claude claude-4.6-opus-high-thinking (Anthropic)
Made-with: Cursor

* test/docs: Add buffer sizing and SHM backpressure tests, update documentation

- Add 9 new tests covering SHM buffer sizing, backpressure, and condvar timed-wait behavior:
  - test_shm_large_message_buffer_sizing: verifies 2x msg size path when messages exceed 32KB (async)
  - test_shm_duration_mode_uses_fixed_buffer: verifies SHM gets 64KB in duration mode, not 1GB TCP/UDS default (async)
  - test_blocking_transport_config_buffer_size_logic: full buffer sizing test for blocking mode (SHM, TCP, PMQ, duration)
  - test_blocking_shm_duration_mode_uses_fixed_buffer: SHM 64KB in blocking duration mode
  - test_blocking_shm_large_message_buffer_sizing: 2x path (blocking)
  - test_backpressure_with_small_buffer: exercises timed condvar wait with 1KB buffer and 20 messages
  - test_payload_integrity_under_backpressure: byte-level payload verification through backpressure-induced blocking writes
  - test_ring_buffer_wrap_around_under_backpressure: write_pos wraps the circular buffer multiple times under backpressure
  - test_shutdown_detected_during_blocked_write: server closes while client is blocked waiting for buffer space
- Update README.md Buffer Size Configuration with per-mechanism auto-sizing table and updated error prevention guidance
- Update CONFIG.md SHM defaults from 8192 to 64KB (auto) and add automatic buffer sizing explanation
- All tests passing, clippy clean

AI-assisted-by: Claude claude-4.6-opus-high-thinking (Anthropic)
Made-with: Cursor

* test/docs: Add coverage tests and fix stale buffer sizing documentation

- Add test_user_buffer_size_overrides_shm_default (async + blocking): verifies user-provided --buffer-size overrides SHM's 64KB default
- Add test_shm_buffer_sizing_at_32kb_boundary (async + blocking): tests exact transition where 2*(msg_size+64) crosses 64KB threshold
- Add test_high_volume_condvar_stress: 100 messages through 512-byte buffer to stress pthread_cond_timedwait retry loop
- Update create_transport_config_internal doc comment in benchmark.rs to describe per-mechanism buffer sizing (SHM, PMQ, TCP/UDS)
- Update Adaptive Buffer Sizing doc in benchmark_blocking.rs to describe per-mechanism behavior instead of vague description
- Fix README.md example output: SharedMemory buffer size 10240000 -> 65536 to reflect new fixed 64KB auto-sizing
- All 340 tests passing, zero clippy warnings

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor

* fix(msrv): Pin transitive dependencies to maintain Rust 1.70 compatibility

The CI MSRV job removes Cargo.lock and resolves fresh dependencies. Several transitive dependencies recently bumped their MSRV above 1.70:

- uuid 1.21+ requires Rust 1.85 → pinned to <1.21
- tempfile 3.25+ pulls getrandom >=0.3,<0.5 which resolves to 0.4.x (edition 2024, unparseable by Rust 1.70's cargo) → pinned to <3.25
- zmij 1.0.20+ requires Rust 1.71 → pinned to =1.0.19
- quote 1.0.45+ requires Rust 1.71 → pinned to =1.0.44
- syn 2.0.115+ requires Rust 1.71 → pinned to =2.0.114
- unicode-ident 1.0.23+ requires Rust 1.71 → pinned to =1.0.22

Verified: MSRV builds and tests pass both with and without Cargo.lock in a Rust 1.70 container. Local clippy, fmt, and tests all clean.

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor

* ci: trigger CI rebuild with updated MSRV dependency pins

No code changes. Forces new CI run to pick up dependency pins from commit cd28295 (uuid <1.21, tempfile <3.25, zmij =1.0.19, quote =1.0.44, syn =2.0.114, unicode-ident =1.0.22).

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor

* fix: Remove out-of-scope condvar/polling code per review

Remove container-IPC code that was out of scope for issue #107 (buffer sizing fix). This scopes the PR to items 4-5 only.

Removed:
- write_data_polling() and read_data_polling() fallback functions
- pthread_cond_timedwait (reverted to pthread_cond_wait)
- Broken-condvar detection (100-iteration/10ms heuristic)
- Mutex-lock-failure fallbacks to polling paths
- 30-second timeout counters (wait_count > 60000)
- test_high_volume_condvar_stress test

Restored from main (PR #104):
- write_data_blocking() signature with timestamp_offset parameter so latency measurement excludes backpressure wait time
- read_data_blocking() with clean pthread_cond_wait
- Proper pthread_cond_signal in both write and read paths

The two functional regressions cited in review are resolved:
1. Timestamp regression: write_data_polling() lacked timestamp_offset, but that function no longer exists. The only write path now refreshes the timestamp after the condvar wait.
2. Missing condvar signal: write_data_polling() never called pthread_cond_signal(&data_ready), but that function no longer exists. The only write path signals after every write.

All tests passing (42/42). No clippy warnings.

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor

* docs: Remove out-of-scope README sections per review

Remove sections that belong in the streaming-timestamps PR (#105), not this buffer-sizing PR:
- "Streaming Output Columns" table and timestamp_ns accuracy note
- "Test Execution Order" section
- Streaming JSON/CSV description rewording
- Round-trip CLI example comment expansion

README diff now only contains buffer-sizing documentation changes: auto-sizing table, error prevention updates, and example buffer size correction.

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor

* fix: Unify message overhead constant across buffer sizing paths

Replace hardcoded 64 and 32 values with the MESSAGE_OVERHEAD constant in both benchmark.rs and benchmark_blocking.rs:
- TCP/UDS msg-count sizing: was hardcoded 64, now MESSAGE_OVERHEAD
- SHM logging/validation: was hardcoded 32, now MESSAGE_OVERHEAD
- Add comment explaining what MESSAGE_OVERHEAD covers: 8 (id) + 8 (timestamp) + 8 (bincode vec length) + 1 (message type) + 4 (ring buffer length prefix) = 29 bytes, rounded to 64

Addresses review feedback about inconsistent overhead values.

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor
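
The SHM sizing rule described in these commit messages (a fixed 64KB buffer, or twice the per-message footprint if that is larger, with a user-provided size taking precedence) can be sketched as follows. The function name and signature are assumptions made for illustration. Only the constants and the `2*(msg_size+64)` expression come from the text above.

```rust
/// Per-message framing, as itemized in the commit message:
/// id + timestamp + bincode vec length + message type + ring buffer length prefix.
const RAW_OVERHEAD: usize = 8 + 8 + 8 + 1 + 4; // 29 bytes

/// The commit message rounds the 29 bytes up to 64.
const MESSAGE_OVERHEAD: usize = 64;

/// Fixed default for shared memory (64KB).
const SHM_DEFAULT_BUFFER: usize = 64 * 1024;

// Compile-time sanity check that the rounded constant covers the raw sum.
const _: () = assert!(RAW_OVERHEAD <= MESSAGE_OVERHEAD);

/// Hypothetical helper: pick the SHM buffer size for a given message size.
/// A user-provided size always wins; otherwise use the larger of the 64KB
/// default and room for two full messages.
fn shm_buffer_size(message_size: usize, user_override: Option<usize>) -> usize {
    if let Some(size) = user_override {
        return size;
    }
    SHM_DEFAULT_BUFFER.max(2 * (message_size + MESSAGE_OVERHEAD))
}
```

Under this rule the default holds until `message_size + 64` exceeds 32KB, which matches the boundary the `test_shm_buffer_sizing_at_32kb_boundary` description refers to.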

…ages

Pass send timestamp to MessageLatencyRecord instead of capturing it when the record is created. This makes the timestamp represent when the message was sent, so gaps between timestamps now match the actual latency.

Changes:
- MessageLatencyRecord::new() and new_combined() now require send_timestamp_ns parameter instead of capturing SystemTime::now() internally
- Added MessageLatencyRecord::current_timestamp_ns() helper for call sites that need to capture the current wall-clock time
- Updated all streaming record creation sites in benchmark.rs and benchmark_blocking.rs to capture and pass send timestamps

Cherry-picked from container-to-container-ipc branch (6cb8f9b). Host-container-specific changes excluded (host_container.rs not present).

AI-assisted-by: Claude claude-4.6-opus-high-thinking (Anthropic)
Made-with: Cursor

…n documentation

- Add 3 new unit tests for MessageLatencyRecord timestamp handling: test_new_uses_provided_send_timestamp, test_new_combined_uses_provided_send_timestamp, test_current_timestamp_ns_returns_recent_value
- Add 2 end-to-end streaming tests in benchmark.rs: test_one_way_streaming_captures_send_timestamp, test_round_trip_streaming_captures_send_timestamp
- Document sequential one-way/round-trip test execution in README.md (new "Test Execution Order" section), CONFIG.md, dashboard README, and run() doc comments in benchmark.rs and benchmark_blocking.rs
- All 332 tests pass (331 passed, 1 ignored), zero clippy warnings

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor

…aths

The previous commit (68afe07) added a send_timestamp_ns parameter to MessageLatencyRecord but only the blocking round-trip path was actually capturing timestamps at send time. The async round-trip, async combined, and all one-way paths were still using current_timestamp_ns() at record-creation time (post-test), causing all timestamps within a run to cluster into the same second.

Changes:
- Async round-trip: capture wall-clock timestamp inside the spawned client future before each send(), return Vec<(Duration, u64)>
- Async combined: same pattern for one-way latency vector
- One-way (async + blocking): server now writes "wall_send_ns,latency_ns" per line (wall_send_ns = wall_clock_now - latency); client readers parse and use the server-computed send timestamp
- Add parse_latency_file_line() with 7 unit tests covering valid input, missing commas, empty lines, non-numeric values, and extra commas
- Enhance existing end-to-end streaming tests to validate timestamps fall within the test execution window and are not all identical
- Document streaming output column definitions in README.md including timestamp_ns semantics and accuracy note for one-way clock mixing

All 265+ unit tests pass, clippy clean, no scope creep.

AI-assisted-by: Claude claude-4.6-opus-high-thinking (Anthropic)
Made-with: Cursor

github-actions · 2026-04-07T16:07:51Z

⚠️ **ERROR:** Code formatting issues detected. Please run `cargo fmt --all` locally and commit the changes.

mcurrier2 · 2026-04-07T16:44:55Z

Rebase on main

Done. Rebased onto current origin/main which includes PRs #103, #104, #108, and #109. The three commits that were already upstream (clippy refactors, uuid pin, audit.toml) dropped out automatically during rebase. CI failures from those should be resolved. Branch is now 3 commits ahead of main.

Coordinate file format with PR #109 (perf/server-latency-buffering: Buffer latencies in memory instead of per-message file I/O)

Resolved during rebase. PR #109 introduced write_latency_buffer() which buffered latencies in memory as Vec and wrote single-field lines (latency_ns). During rebase I updated the buffer type to Vec<(u64, u64)> storing (wall_send_ns, latency_ns) pairs, and updated write_latency_buffer() to write the two-field "wall_send_ns,latency_ns" format that parse_latency_file_line() expects. The should_buffer_latency() helper is preserved for canary message filtering. All existing latency buffer tests were updated to use the two-field format, including the round-trip parse test which now uses parse_latency_file_line() directly to verify end-to-end compatibility. All 42 tests pass.
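
A writer for the two-field buffer could look roughly like this. The signature is an assumption (the real write_latency_buffer() may take a path or a file handle). The sketch is generic over `Write` so it can be exercised against an in-memory buffer.

```rust
use std::io::{self, Write};

/// Write buffered (wall_send_ns, latency_ns) pairs, one
/// "wall_send_ns,latency_ns" line per entry.
/// Hypothetical signature; only the line format comes from the PR.
fn write_latency_buffer<W: Write>(out: &mut W, buffer: &[(u64, u64)]) -> io::Result<()> {
    for (wall_send_ns, latency_ns) in buffer {
        writeln!(out, "{},{}", wall_send_ns, latency_ns)?;
    }
    out.flush()
}
```

Writing into a `Vec<u8>` and reading the bytes back as a string is enough to confirm that each pair becomes exactly one two-field line.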

Remove documentation that overlaps with PR #108 (Fix/buffer size overloading - Fix SHM Buffer Over-Sizing and Condvar Hangs)

Already handled. The "Streaming Output Columns" and "Test Execution Order" README sections were removed from PR #108 per reviewer feedback before it was merged. Those sections now exist only in this PR, which is where they belong. Verified after rebase — no duplication in the diff against main.

sberg-rh · 2026-04-07T18:31:05Z

A couple of minor observations — not blocking

These are both minor and probably fine as-is, but wanted to flag them in case they're worth a quick look.

1. Wall-clock timestamp captured slightly late on server side (src/main.rs:699-703, 904-908)

wall_now_ns is captured after the should_buffer_latency check rather than immediately after receive_time_ns. Since the send timestamp is back-computed as wall_now - latency, any delay between capturing receive_time_ns and wall_now_ns introduces a small drift. Would it be more precise to capture wall_now_ns right next to receive_time_ns, before the if branch?

Likely negligible for most benchmark scenarios, but curious if this was intentional or just ordering convenience.

2. parse_latency_file_line may not handle Windows line endings (src/benchmark.rs:258)

splitn(2, ',') on a line like "100,200\r" would try to parse "200\r" as a u64, which would fail with a confusing error. Would adding a .trim() on the input line before splitting be worth it for cross-platform robustness? Might not matter if latency files are always generated and consumed on the same system.

- Move SystemTime::now() capture to immediately after get_monotonic_time_ns() in both blocking and async server paths, eliminating minor drift from the should_buffer_latency branch
- Add .trim() to parse_latency_file_line() input for cross-platform robustness against Windows \r\n line endings
- All tests passing

AI-assisted-by: Claude claude-4.6-opus-high-thinking
Made-with: Cursor

github-actions · 2026-04-10T15:12:45Z

📈 Changed lines coverage: 88.89% (56/63)

🚨 Uncovered lines in this PR

src/benchmark.rs: 1549, 1557
src/benchmark_blocking.rs: 1143, 1233-1234, 1254, 1306

📊 Code Coverage Summary

File	Line Coverage	Uncovered Lines
`src/benchmark.rs`	83.64% (506/605)	`75, 78, 89, 93, 102, 105, 107, 124, 422, 427-432, 439-444, 511-514, 619, 703, 709-711, 715-717, 737, 806-808, 813, 834, 839, 857, 963, 967-970, 972, 981-984, 986, 1062, 1093, 1096, 1098-1099, 1108-1109, 1251, 1264, 1281, 1404, 1413, 1415-1416, 1419-1420, 1426-1427, 1432-1433, 1435, 1440-1441, 1445-1447, 1452-1453, 1456-1457, 1489, 1547-1551, 1553-1559, 1561, 1564, 1721, 1736`
`src/benchmark_blocking.rs`	73.50% (319/434)	`97, 111, 127, 263, 369, 375-377, 380-382, 402, 434, 488, 587, 600, 614, 644-647, 732-735, 754, 758, 773, 815-817, 820, 823-825, 827, 830, 832-836, 838-839, 847-851, 853-857, 860-861, 865-866, 901, 950, 1029, 1040, 1070, 1073, 1138-1143, 1145, 1200-1203, 1208, 1221, 1224-1227, 1231, 1233-1236, 1238, 1240-1241, 1243-1244, 1247, 1249-1254, 1256, 1260-1261, 1263, 1265, 1289, 1301-1306, 1308, 1328-1331`
`src/cli.rs`	92.39% (85/92)	`630, 729, 769, 771, 792-794`
`src/execution_mode.rs`	100.00% (14/14)	``
`src/ipc/mod.rs`	65.28% (47/72)	`115, 425, 427-430, 740-741, 756-757, 775-776, 807, 810, 813, 818, 845-846, 860, 862, 882, 884, 1007-1009`
`src/ipc/posix_message_queue.rs`	46.09% (59/128)	`139-140, 213-215, 217, 224, 229, 332-335, 337, 345, 437, 441-442, 446, 449-452, 454-458, 539, 679, 782, 789-790, 807-808, 819-820, 831-832, 849-850, 906, 910-911, 914-919, 921-923, 927, 929-931, 933, 935-937, 941-943, 945-947, 994-995, 1017`
`src/ipc/posix_message_queue_blocking.rs`	81.94% (127/155)	`172, 182, 221, 251-255, 274, 325, 368, 387-390, 416-418, 422-423, 425-426, 436, 455, 457-458, 460-461`
`src/ipc/shared_memory.rs`	69.36% (163/235)	`61, 141, 145, 246-247, 257-258, 262, 390-391, 417-419, 421, 439-441, 443-444, 446-450, 467, 474, 480, 483-484, 488, 492, 496-497, 502-503, 666-667, 670-671, 674, 676, 681-682, 709-710, 713-714, 721-723, 725, 727-732, 734-735, 738-739, 741-745, 752, 782, 784-785, 787, 791`
`src/ipc/shared_memory_blocking.rs`	78.87% (209/265)	`196-198, 200-201, 204-206, 209-210, 212, 217, 219, 223-225, 230, 238-240, 243-245, 248-249, 251, 254, 257-258, 261-262, 266-267, 269, 273-274, 276, 311-312, 378-379, 403-407, 498, 506, 556, 573, 660, 726, 789, 798, 808, 830`
`src/ipc/shared_memory_direct.rs`	83.80% (150/179)	`372-375, 444-451, 455, 482, 506-509, 513-514, 556-557, 569, 598, 605-606, 629-630, 636`
`src/ipc/tcp_socket.rs`	59.43% (63/106)	`31-32, 61, 96, 113-114, 118, 124-125, 129, 136-137, 141, 147-148, 152, 171-172, 175-177, 184-185, 188, 362-363, 366-367, 370-371, 376-377, 422, 429, 447-449, 478, 480-482, 484, 487`
`src/ipc/tcp_socket_blocking.rs`	97.62% (82/84)	`134, 159`
`src/ipc/unix_domain_socket.rs`	59.43% (63/106)	`29-30, 58, 93, 103, 122-123, 127, 133-134, 138, 145-146, 150, 156-157, 161, 180-181, 184-186, 193-194, 197, 346-347, 350-351, 354-355, 360-361, 412-414, 443, 445-447, 449, 452, 468`
`src/ipc/unix_domain_socket_blocking.rs`	94.34% (100/106)	`276-277, 283-285, 287`
`src/logging.rs`	100.00% (13/13)	``
`src/main.rs`	46.11% (166/360)	84-86, 88, 125-126, 136-140, 144-146, 148-149, 151-152, 172-175, 199-203, 211, 217, 220, 225-228, 233-234, 240, 246, 248-250, 252, 258-259, 265, 270, 273-274, 278, 280-281, 285-286, 288, 294, 298-299, 301-306, 308-309, 312, 321, 324-325, 328, 375-378, 385, 387-391, 394-397, 399-400, 402-403, 405, 407-413, 417, 419-422, 425, 429-431, 435, 437, 440, 444, 449-452, 458-459, 465-466, 472, 474-475, 479, 481, 486-488, 492, 495-496, 498-499, 504, 506-508, 512-513, 515, 522, 527-528, 530-535, 537-538, 542, 551, 554-555, 558, 560, 579, 586, 590-592, 594, 624-625, 633, 666, 717, 721, 724-727, 783-786, 823-824, 831-832, 835, 862-863, 866, 918-919, 923-926, 948, 975, 984, 989, 994-995
`src/metrics.rs`	79.79% (150/188)	`455-460, 493-494, 552, 558, 579-582, 732-734, 736, 768, 788, 833, 838, 881, 904, 923-924, 926-927, 930-932, 952, 980, 984, 1005, 1007-1008, 1013`
`src/results.rs`	56.63% (252/445)	726, 735-737, 739-740, 743-744, 747, 769, 772-773, 776, 778, 781, 785-790, 800-801, 804-809, 826, 838-839, 841, 843, 846-847, 849, 853, 880, 904-906, 909-910, 914-916, 919, 945, 950, 955, 961, 980, 982-983, 985, 987-991, 993, 995-996, 1030, 1071-1072, 1075, 1081-1082, 1086, 1090-1092, 1094-1095, 1119-1123, 1126-1129, 1132-1139, 1149-1150, 1169-1170, 1172-1176, 1178, 1195-1196, 1198-1203, 1205, 1223, 1225-1230, 1248, 1251, 1267-1268, 1283-1285, 1287-1289, 1291-1292, 1294-1295, 1297-1298, 1300, 1302-1303, 1305-1308, 1310-1312, 1314-1316, 1319, 1323-1324, 1332-1337, 1339-1340, 1344-1345, 1349-1351, 1353, 1357-1358, 1367-1370, 1374-1376, 1380, 1382, 1385, 1390-1391, 1396, 1403-1407, 1409, 1607-1608, 1828-1829, 1831-1832, 1837
`src/results_blocking.rs`	95.48% (296/310)	`489-490, 492-493, 544, 769, 774, 779, 815, 818-819, 827-828, 886`
`src/utils.rs`	70.73% (29/41)	`71, 143, 147-149, 153, 159, 198-202`
Total	73.46% (2893/3938)

mcurrier2 · 2026-04-10T15:14:02Z

Accepted both suggestions, Shawn.

Fix 1 — Wall-clock timestamp captured immediately (blocking path, same pattern in async):

Before:

    let receive_time_ns = get_monotonic_time_ns();
    let latency_ns = receive_time_ns.saturating_sub(message.timestamp);
    if should_buffer_latency(latency_file_path.is_some(), message.id) {
        let wall_now_ns = SystemTime::now() // <-- captured LATE
            .duration_since(UNIX_EPOCH)
            .unwrap_or_default()
            .as_nanos() as u64;
        let wall_send_ns = wall_now_ns.saturating_sub(latency_ns);
        latency_buffer.push((wall_send_ns, latency_ns));
    }

After:

    let receive_time_ns = get_monotonic_time_ns();
    let wall_now_ns = SystemTime::now() // <-- captured IMMEDIATELY
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default()
        .as_nanos() as u64;
    let latency_ns = receive_time_ns.saturating_sub(message.timestamp);
    if should_buffer_latency(latency_file_path.is_some(), message.id) {
        let wall_send_ns = wall_now_ns.saturating_sub(latency_ns);
        latency_buffer.push((wall_send_ns, latency_ns));
    }
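
For readers skimming the diff, the back-dating step can be isolated as a small sketch. The helper names `wall_clock_now_ns` and `estimate_wall_send_ns` are hypothetical and do not appear in the PR; the sketch only assumes, as the snippets above do, that the send time is approximated by subtracting the monotonic latency from a wall-clock reading taken at receive time.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Current wall-clock time in nanoseconds since the Unix epoch.
/// Falls back to 0 if the system clock reads earlier than the epoch.
fn wall_clock_now_ns() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap_or_default()
        .as_nanos() as u64
}

/// Hypothetical helper: approximate the wall-clock send time by back-dating
/// the receive-side wall-clock reading by the measured latency.
/// `saturating_sub` clamps to 0 instead of wrapping on underflow.
fn estimate_wall_send_ns(wall_receive_ns: u64, latency_ns: u64) -> u64 {
    wall_receive_ns.saturating_sub(latency_ns)
}
```

The sooner `wall_clock_now_ns()` is read after the monotonic receive timestamp, the less unrelated work (branching, buffering) leaks into the estimate, which is the point of moving the call above the `if`.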

Fix 2 — Trim input for Windows line endings:

Before:

pub fn parse_latency_file_line(line: &str) -> anyhow::Result<(u64, u64)> {
    let parts: Vec<&str> = line.splitn(2, ',').collect();

After:

pub fn parse_latency_file_line(line: &str) -> anyhow::Result<(u64, u64)> {
    let line = line.trim();
    let parts: Vec<&str> = line.splitn(2, ',').collect();
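
To illustrate why the trim matters, here is a minimal std-only sketch of a comparable parser. It is not the PR's `parse_latency_file_line`: the name `parse_latency_line`, the `String` error type, and the assumed `timestamp_ns,latency_ns` field order are illustrative choices.

```rust
/// Hypothetical stand-in for the PR's parser: reads one
/// "timestamp_ns,latency_ns" line and returns both fields as u64.
fn parse_latency_line(line: &str) -> Result<(u64, u64), String> {
    // Trimming first removes a trailing "\r\n" (or "\n"), so the second
    // field does not fail to parse on files with Windows line endings.
    let line = line.trim();
    let (ts, lat) = line
        .split_once(',')
        .ok_or_else(|| format!("expected two comma-separated fields, got {line:?}"))?;
    let ts = ts
        .trim()
        .parse::<u64>()
        .map_err(|e| format!("bad timestamp: {e}"))?;
    let lat = lat
        .trim()
        .parse::<u64>()
        .map_err(|e| format!("bad latency: {e}"))?;
    Ok((ts, lat))
}
```

Without the initial `trim()`, a line such as `"1700000000000000000,4200\r\n"` would hand `"4200\r\n"` to `parse::<u64>()`, which rejects trailing whitespace.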

The remaining diff lines are cargo fmt reformatting (collapsing multi-line expressions that fit on one line).

dustinblack

All previous review items addressed: rebased on main (including merged #108 and #109), file format cleanly integrated with #109's buffering, documentation overlap resolved, and sberg-rh's feedback incorporated.

CI is fully green. Approved.

sberg-rh

LGTM!

* feat(cli): Add standalone --server and --client modes with full reporting

Add the ability to run the benchmark server and client as independent processes, enabling cross-environment IPC testing (e.g., host and container). Relates to #11.

Standalone mode features:
- --server flag starts a server that listens for client connections
- --client flag connects to a running server with retry logic (100ms backoff, 30s timeout)
- Both async (Tokio) and blocking (std) execution modes supported
- Duration (-d) and message-count (-i) modes both supported
- Default transport endpoints work without extra flags
- Endpoint flags (--socket-path, --shared-memory-name, --message-queue-name) promoted to user-facing

Reporting integration:
- Full ResultsManager/MetricsCollector integration for structured output (JSON, streaming CSV, console summary with HDR percentiles)
- Server-side one-way latency measurement using monotonic clock (accurate for same-host and container scenarios)
- Round-trip latency with per-message streaming support

Code quality:
- Shared helpers: dispatch_server_message(), retry constants
- 25 tests covering CLI parsing, transport config, server dispatch, connection retry, shutdown, duration mode, one-way, round-trip
- Explicit MessageType::Shutdown on client disconnect

* feat: Support send_delay and include_first_message in standalone client

- Add --send-delay support: inserts a configurable pause after each message send (blocking uses thread::sleep, async uses tokio::sleep)
- Add --include-first-message support: when false (default), sends a canary message before measurement to warm up the connection, matching the existing BenchmarkRunner behavior
- Applied to both one-way and round-trip tests in both blocking and async client paths

* perf: Eliminate per-message heap allocation in standalone client

Reuse a single Message struct across loop iterations instead of calling Message::new() with payload.clone() on every send. The message id and timestamp are updated in-place before each send.

This removes one Vec<u8> heap allocation per message in the measurement loop, reducing allocation overhead that can skew latency results, especially for small messages.

Applied to both one-way and round-trip tests in both blocking and async client paths.

* feat: Add concurrency support for standalone client/server

Server-side multi-accept:
- TCP and UDS servers now accept multiple concurrent connections, spawning a handler thread per client with its own MetricsCollector
- Grace period after first client prevents premature server exit
- SHM and PMQ fall back to single-client mode with a warning
- Server aggregates one-way latency metrics across all handlers

Client-side multi-threaded execution:
- Blocking client spawns N worker threads, each with its own transport connection, MetricsCollector, and message loop
- Async client uses tokio::task::JoinSet for concurrent workers
- Results aggregated via MetricsCollector::aggregate_worker_metrics()
- Per-message streaming disabled for concurrent mode (aggregated only)

Transport additions:
- BlockingTcpSocket::from_stream() wraps pre-accepted TcpStream
- BlockingUnixDomainSocket::from_stream() wraps pre-accepted UnixStream

Shared helpers:
- handle_client_connection() -- per-client message dispatch and metrics
- aggregate_and_print_server_metrics() -- shared aggregation logic

Tests:
- test_standalone_concurrent_tcp_round_trip (3 concurrent clients)
- test_handle_client_connection_round_trip (dispatch correctness)
- test_handle_client_connection_one_way_metrics (metrics recording)

* test: Add coverage for concurrency, from_stream, and aggregation

- test_standalone_concurrent_tcp_one_way: multi-accept server with 2 concurrent one-way clients, verifying server-side metrics recording
- test_tcp_from_stream_send_receive: BlockingTcpSocket::from_stream() full send/receive round-trip
- test_uds_from_stream_send_receive: BlockingUnixDomainSocket::from_stream() full send/receive round-trip (unix-only)
- test_concurrency_forced_to_one_for_shm: CLI parsing for SHM with concurrency > 1
- test_aggregate_and_print_empty_collectors: empty input edge case
- test_aggregate_and_print_single_collector: single collector with data

Total binary tests: 34.

* fix: Add async multi-accept server and remove unused parameter

- Add multi-accept support for async TCP and UDS servers, matching the blocking server's concurrency support. Uses tokio::net listeners with spawn_blocking for per-client handler threads.
- Remove unused _args parameter from run_standalone_server_async
- Replace inline latency printing in async server with shared print_server_one_way_latency helper

* fix: Use OS-assigned ports in tests and document grace period

- Replace all 12 hardcoded test ports (18301-18314) with OS-assigned ports via get_free_port() helper (binds to port 0, extracts assigned port). Prevents port conflicts in parallel test runs and with other processes.
- Extract 2-second multi-accept grace period into SERVER_ACCEPT_GRACE_PERIOD constant with documentation explaining the behavior and limitation.
- Document the grace period in --server CLI help text so users know concurrent clients should connect within 2 seconds of each other.

* fix: Set streams to blocking mode after tokio into_std conversion

tokio::net::TcpStream::into_std() leaves the stream in non-blocking mode (set by tokio for epoll/kqueue). The blocking transport's read_exact/write_all calls then fail with WouldBlock errors, causing immediate disconnection.

Fix: call set_nonblocking(false) on streams after into_std() in both TCP and UDS async multi-accept servers.

Add test_standalone_async_concurrent_tcp_round_trip to exercise the async multi-accept path (tokio accept + spawn_blocking + from_stream + handle_client_connection).

* test: Strengthen test assertions for real-world correctness

- test_standalone_blocking_tcp_one_way: verify server received exact message count with correct sequential IDs, add shutdown message
- test_standalone_blocking_tcp_duration_round_trip: verify response IDs match requests, assert count > 10 for 200ms test, add shutdown
- test_standalone_blocking_tcp_duration_one_way: verify server received exact count with sequential IDs, assert count > 10 for 200ms test
- test_concurrency_forced_to_one_for_shm: test actual concurrency forcing logic instead of just CLI parsing
- test_standalone_concurrent_tcp_one_way: assert exact message count per handler instead of just "greater than zero"

* fix: Address round 2 review feedback

- Clean up garbled doc comment on async concurrent test (editing artifacts from multiple rewrites)
- Replace silent panic swallowing in async multi-accept servers: try_join_next().transpose() silently dropped JoinErrors from panicked handler tasks. Now logs warnings via warn!().
- Extract effective_concurrency() helper to deduplicate the concurrency-forcing logic (was copied in blocking client, async client, and test). Test now calls the actual helper instead of reimplementing the logic inline.

* test: Add real-world scenario tests for coverage and correctness

- test_standalone_large_payload_integrity: 4KB payloads with recognizable byte pattern, server echoes back, client verifies content byte-for-byte to catch corruption
- test_handle_client_connection_filters_canary: verifies warmup canary messages (id=u64::MAX) are excluded from one-way metrics
- test_handle_client_connection_mixed_message_types: interleaved OneWay and Request messages on a single connection, verifies correct metrics recording and response dispatch
- test_aggregate_and_print_multiple_collectors: aggregation across 2 collectors with different latency distributions
- test_effective_concurrency_all_mechanisms: covers UDS, PMQ, SHM, TCP, and concurrency=1 edge case

Total binary tests: 40.

* fix: Set accepted streams to blocking mode in multi-accept servers

Accepted TCP/UDS streams inherit non-blocking mode from the listener (set for the accept poll loop). The handler threads need blocking mode for the transport's read_exact/write_all operations.

This is the blocking-server equivalent of the async into_std fix in commit 8723429. Without this fix, standalone server handlers immediately disconnect from clients.

Applies to both run_standalone_server_blocking_multi_accept_tcp and _uds.

* fix: Address review feedback - grace period, quiet flag, mutex safety

- Fix grace period bug: reset timer on every new connection, not just the first. Prevents premature server exit between one-way and round-trip test phases when using concurrency > 1. Applied to all four multi-accept servers (blocking TCP/UDS, async TCP/UDS).
- Honor --quiet flag in standalone server and client. When set, suppress all tracing output to stderr.
- Handle poisoned mutex gracefully: use unwrap_or_else(|e| e.into_inner()) instead of unwrap() on mutex locks. If a handler thread panics while holding the lock, other threads can still push their metrics instead of cascade-panicking.
- Add defensive --shm-direct guard in standalone server and client: returns error if --shm-direct is used without --blocking. This is normally enforced by main() but the guard protects against future refactoring that might change the dispatch order.

* fix: Address review feedback - multiple correctness and quality fixes

- Fix grace period bug: reset timer on every new connection, not just the first. Prevents premature server exit between one-way and round-trip test phases when using concurrency > 1. Applied to all four multi-accept servers (blocking TCP/UDS, async TCP/UDS).
- Honor --quiet flag in standalone server and client. When set, suppress all tracing output to stderr.
- Handle poisoned mutex gracefully: use unwrap_or_else(|e| e.into_inner()) instead of unwrap() on mutex locks in handler threads.
- Add defensive --shm-direct guard in standalone server and client.
- Add socket buffer tuning (recv/send buffer sizes) to multi-accept TCP servers to match normal transport behavior.
- Fix integer division remainder: last worker now receives any extra messages when msg_count is not evenly divisible by concurrency.
- Document empty response payloads as intentional design matching existing benchmark runner behavior.

* feat: Add receive_blocking_timed for accurate one-way latency

Add receive_blocking_timed() to the BlockingTransport trait that captures a monotonic timestamp after raw bytes are read but before bincode deserialization. This excludes deserialization overhead from one-way latency measurements.

- Add default implementation on BlockingTransport trait (backward compatible -- captures timestamp after full receive)
- Override in TCP, UDS, and SHM blocking transports to place timestamp between raw I/O read and deserialization
- SHM-direct uses default (no bincode deserialization to exclude)
- Update handle_client_connection and standalone single-client server to use receive_blocking_timed

Impact is most visible with large payloads where deserialization is non-trivial. 64KB one-way TCP test shows min latency dropped from 41us (post-deserialize) to 14us (pre-deserialize), a ~27us improvement representing the bincode deserialization time excluded from measurement. Mean dropped 28% (73us to 52us) and P99 dropped 14% (132us to 113us).

* fix: Update MessageLatencyRecord calls for PR #105 compatibility

PR #105 (Fix/streaming timestamps) added send_timestamp_ns parameter to MessageLatencyRecord::new(). Update all 4 call sites in standalone client to capture wall-clock timestamp at send time and pass it as the 6th argument.

* refactor: Move standalone logic from main.rs to library crate

Move ~3100 lines of standalone client/server code from the binary crate (main.rs) into two new library modules, following the existing flat-file convention (benchmark.rs/benchmark_blocking.rs pattern).

Structure:
- standalone_server.rs (1982 lines): constants, shared helpers, server dispatch, multi-accept TCP/UDS, async server paths
- standalone_client.rs (1146 lines): retry helpers, client dispatch, single/concurrent blocking and async paths
- main.rs reduced from ~4200 to ~1120 lines (thin dispatch layer)

Additional changes:
- Promote logging.rs from binary-private to library-public module
- Move set_affinity() to utils.rs as pub function
- All standalone functions now pub for tarpaulin coverage measurement and integration test access

No behavioral changes. All 374 tests pass. Benchmark comparison across 3 runs confirms no performance regression (mean latencies within 2-5% run-to-run variance).

* fix: Improve server resilience and error visibility

- Socket configuration failures (set_nonblocking, set_nodelay) in multi-accept servers now log a warning and skip the bad connection instead of crashing the entire server with ?
- Thread join panics in blocking multi-accept servers now logged with warn! instead of silently dropped with let _ =
- Streaming latency record failures in client now logged with debug! instead of silently swallowed

* test: Comprehensive coverage improvements for standalone modules

Server tests (standalone_server.rs, 82.7% coverage):
- test_multi_accept_tcp_server_direct: exercises multi-accept TCP directly
- test_single_server_direct: exercises blocking single-client directly
- test_server_blocking_dispatch: exercises dispatch logic
- test_server_blocking_dispatch_uds: exercises UDS dispatch branch
- test_async_multi_accept_tcp_full: exercises async multi-accept TCP
- test_async_single_server_path: exercises async single-client
- test_async_single_server_one_way_metrics: async one-way metrics
- test_async_multi_accept_uds_full: exercises async UDS multi-accept
- test_multi_accept_server_with_delayed_client: slow sender resilience
- test_multi_accept_server_duration_one_way: duration mode with multi-accept
- test_async_multi_accept_server_duration_one_way: async duration mode
- test_handle_client_connection_send_failure: client disconnect error path
- test_single_server_client_disconnect: single server send error path
- test_multi_accept_server_survives_bad_client: garbage input resilience
- test_handle_client_connection_garbage_input: deserialization error path
- test_run_standalone_server_full_dispatch: full entry point dispatch
- test_run_standalone_server_rejects_all_via_dispatch: 'all' validation
- test_run_standalone_server_rejects_shm_direct: shm-direct guard
- test_run_standalone_server_verbose: -vv logging level branches
- test_aggregate_server_metrics_from_handlers: real handler data
- test_print_server_one_way_latency_with_data/zero: print paths

Client tests (standalone_client.rs, 86.3% coverage):
- test_client_blocking_tcp_round_trip/one_way: single client paths
- test_client_blocking_tcp_duration_round_trip/one_way: duration mode
- test_client_blocking_tcp_concurrent_round_trip/one_way: concurrent
- test_client_async_single_round_trip/one_way: async single
- test_client_async_duration_round_trip/one_way: async duration
- test_client_async_concurrent_round_trip/one_way: async concurrent
- test_client_blocking_with_send_delay: send_delay round-trip branch
- test_client_blocking_one_way_with_send_delay: send_delay one-way branch
- test_client_blocking_with_streaming_output: JSON streaming
- test_client_blocking_combined_streaming: combined mode streaming
- test_client_blocking_csv_streaming: CSV streaming
- test_client_blocking_concurrent_duration_one_way: concurrent duration
- test_client_async_concurrent_duration_one_way: async concurrent duration
- test_run_standalone_client_full_dispatch: full entry point dispatch
- test_run_standalone_client_rejects_all_via_dispatch: 'all' validation
- test_run_standalone_client_rejects_shm_direct: shm-direct guard
- test_connect_async_with_retry_succeeds: async retry path

Also: changed tracing .init() to .try_init() with eprintln fallback in both server and client for test compatibility.

Coverage: standalone_server 82.7%, standalone_client 86.3%, combined 84.8%. Total lib tests: 355.

* perf: Reduce multi-accept server polling interval from 50ms to 5ms

Reduce the non-blocking accept loop sleep from 50ms to 5ms in both TCP and UDS multi-accept servers. This cuts connection acceptance latency by 10x with no portability concerns.

Discovered during hands-on validation testing of standalone concurrent mode, where the 50ms polling interval was the primary contributor to elevated tail latency under multi-client workloads.

Improvement with -c 4 concurrent clients:
- RT P95: -46% (65.9us -> 35.5us)
- RT P99: -49% (91.4us -> 46.9us)
- Throughput: +66% (94.9 -> 157.1 MB/s)

Single-client workloads also benefit from faster initial connection acceptance (P99 improved 4-7% across all test modes).

* fix: Address PR #110 review feedback (issues 1-11)

- Add 60s idle timeout to prevent server hang when no client connects
- Log canary/shutdown send errors (warn for canary, debug for shutdown)
- Wire concurrent streaming: workers collect MessageLatencyRecord and stream through ResultsManager after completion
- Reject multiple mechanisms in standalone mode with clear error
- Add debug logging on server receive errors for diagnostics
- Add per-worker warmup in concurrent mode (fixes metadata mismatch)
- Document two-phase reconnection behavior in concurrent mode
- Add SIGINT/SIGTERM signal handler via ctrlc crate for graceful server shutdown and resource cleanup
- Add receive_blocking_timed override for PMQ transport to exclude deserialization from one-way latency measurement
- Add 11 integration tests spawning separate server/client processes
- Fix misleading test comment on garbage input handler

* fix: CI compatibility for MSRV and lint checks

- Drop ctrlc "termination" feature (requires Rust 1.75+ due to nix static zeroed); SIGINT still handled, SIGTERM uses default OS behavior
- Remove unused BlockingTcpSocket import in canary failure test

* fix: Pin ctrlc to 3.4.5 for MSRV 1.70 compatibility

ctrlc 3.5+ uses std::mem::zeroed() in statics which requires Rust 1.75+. Version 3.4.5 does not have this issue and supports Rust 1.69+.

* fix: Use cross-platform mechanisms in multiple-mechanism rejection tests

Tests used "tcp uds" but UDS is not a valid mechanism on Windows, causing clap to reject the args before reaching our validation code. Changed to "tcp shm" which exists on all platforms.

* fix: Use platform-independent temp dir in output file integration test

The test hardcoded /tmp/ which doesn't exist on Windows. Use std::env::temp_dir() for cross-platform compatibility.

* fix: Address PR #110 follow-up review (4 additional findings)

- Add sentinel connection in concurrent client to prevent server from exiting between one-way and round-trip phases (keeps a connection open across both phases so the server's grace period check never triggers prematurely)
- Surface accept-loop errors: upgrade from debug to warn logging and return Err when accept fails before any client connects
- Fix duration precision loss: use as_secs_f64() instead of as_secs() when passing duration to spawned server process
- Propagate latency file write errors instead of silently discarding
- Add integration tests for concurrent both-tests scenario

* style: Fix formatting issues caught by CI

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
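
The poisoned-mutex handling mentioned in the commit message above (`unwrap_or_else(|e| e.into_inner())`) can be sketched in isolation. This is an illustrative example, not code from the repository: `push_even_if_poisoned` and `demo` are hypothetical names, and a plain `Vec<u64>` stands in for the shared metrics.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical helper: append a value even if another thread panicked
/// while holding the lock. `PoisonError::into_inner` returns the guard,
/// so the protected data stays usable.
fn push_even_if_poisoned(shared: &Mutex<Vec<u64>>, value: u64) {
    let mut guard = shared.lock().unwrap_or_else(|e| e.into_inner());
    guard.push(value);
}

/// Poison the mutex on purpose, then show that a later push still works.
fn demo() -> Vec<u64> {
    let shared = Arc::new(Mutex::new(Vec::new()));
    let poisoner = Arc::clone(&shared);
    // The spawned thread panics while holding the lock, poisoning the mutex.
    // `join` returns Err for the panicked thread; it is ignored here.
    let _ = thread::spawn(move || {
        let _guard = poisoner.lock().unwrap();
        panic!("simulated handler panic");
    })
    .join();
    push_even_if_poisoned(&shared, 42);
    let result = shared.lock().unwrap_or_else(|e| e.into_inner()).clone();
    result
}
```

With a plain `unwrap()` on the lock, the push after the simulated panic would itself panic. Recovering the guard is only appropriate when partially updated data is acceptable, as it is for appending independent metrics.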

mcurrier2 requested review from dustinblack and sberg-rh March 11, 2026 13:52

mcurrier2 linked an issue Mar 11, 2026 that may be closed by this pull request

Bug: Timestamps in streaming output are incorrect, showing the same second for all messages #92

Closed

mcurrier2 removed a link to an issue Mar 13, 2026

Bug: Timestamps in streaming output are incorrect, showing the same second for all messages #92

Closed

mcurrier2 linked an issue Mar 13, 2026 that may be closed by this pull request

Fix streaming output timestamps captured at record-creation time instead of message-send time #106

Closed

This was referenced Mar 17, 2026

Fix/buffer size overloading - Fix SHM Buffer Over-Sizing and Condvar Hangs #108

Merged

perf/server-latency-buffering : Buffer latencies in memory instead of per-message file I/O #109

Merged

dustinblack reviewed Mar 23, 2026

dustinblack reviewed Mar 25, 2026

mcurrier2 added 3 commits April 7, 2026 12:04

mcurrier2 force-pushed the fix/streaming-timestamps branch from dfbf5c5 to c3c16a3 Compare April 7, 2026 16:07

mcurrier2 requested review from dustinblack, ewchong and sberg-rh and removed request for sberg-rh April 7, 2026 16:45

dustinblack approved these changes Apr 13, 2026

dustinblack enabled auto-merge (squash) April 13, 2026 11:57

sberg-rh approved these changes Apr 13, 2026

dustinblack merged commit 2bbcd52 into main Apr 13, 2026
12 checks passed

dustinblack deleted the fix/streaming-timestamps branch April 13, 2026 13:20

dustinblack merged 4 commits into main from fix/streaming-timestamps
