feat: venue decoder crates + SPSC benchmark overhaul #28

Merged
Milerius merged 38 commits into main from feat/venue-decoder-crates on Apr 7, 2026
Conversation


@Milerius Milerius commented Apr 7, 2026

Summary

  • Venue decoder crates: Add mantis-binance and mantis-polymarket crates with zero-alloc market data decoders
  • Fixed-point parsing: Add parse_decimal_bytes for zero-alloc hot-path price parsing
  • Transport refactor: Change feed callback from &str to &mut [u8] for in-place parsing
  • SPSC benchmark overhaul: Replace single-thread benchmarks with proper two-thread cross-core latency measurement
  • Queue performance optimizations: Colocated cache lines, bool API, BranchWrap, inline(always)

SPSC Benchmark Results (AMD Ryzen 7 PRO 8700GE, isolated cores)

| Queue | Language | Mean (TSC cycles) | Stability |
|---|---|---|---|
| Mantis SpscRingFast (BranchWrap) | Rust | 472 | ±8 |
| Mantis SpscRing (Pow2Masked) | Rust | 554 | ±14 |
| Mantis SpscRingCopy | Rust | 552 | ±11 |
| rigtorp::SPSCQueue | C++ | 510 best / 1315 median | ±400 |
| rtrb | Rust | 1090 best / 2345 median | ±750 |

At a 3.65 GHz TSC, 472 cycles ≈ 129 ns per cross-core message handoff.

Protocol: two threads on separate pinned cores, isolcpus, rdtsc with lfence, 48-byte messages, 1M operations, chrt -f 99, taskset -c 2,3. See benchmarks/RESULTS.md for full methodology.

Key optimizations that closed the Rust-C++ gap

1. **Colocated cache lines** — head + tail_cached on the same 64 B line (~400 cycles saved)
2. **Bool API** — `push() -> bool` + `pop(*mut T) -> bool` instead of `Result<T>` (~500 cycles vs rtrb)
3. **`&self` shared references** — `push_shared`/`pop_shared` avoid LLVM noalias interference
4. **BranchWrap** — branch-predictor-friendly index wrapping (~85 cycles vs AND mask)
5. **`inline(always)` + LTO** — guaranteed cross-crate inlining of hot paths

Test plan

  • cargo +nightly test — workspace tests pass
  • cargo +nightly clippy --all-targets --features alloc,std -- -D warnings — clean
  • cargo +nightly fmt --all --check — formatted
  • cd benchmarks/rust && cargo +nightly build --release — standalone bench builds
  • Remaining criterion benchmarks (fixed, seqlock, market_state, decode) still run

🤖 Generated with Claude Code

Milerius and others added 10 commits April 6, 2026 17:21
…ion simulator

Replace "BBO cross = fill" with queue-aware fill model:
- SimulatedExchange with FIFO queue tracking per price level
- Latency injection (configurable profiles: optimistic/realistic/pessimistic)
- Partial fills (trade_qty minus queue_ahead)
- Cancel/repost loses queue priority
- Post-only safety check at arrival-time book state
- Markout tracking (adverse selection at 100ms/500ms/1s/5s)
- PnL breakdown: execution vs MTM vs rebate estimate (rebates separate)
- Two modes only: dry_run (realistic simulation) and live

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rsing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benchmarks parse_decimal_bytes against rust_decimal, fixed crate,
fast-float, lexical-core, and stdlib f64 parse. Tests Polymarket
short prices (0.53) and Binance medium prices (67396.70).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Required for simd-json in-place parsing. read_text() becomes
read_bytes() writing into a reusable Vec<u8> (allocated once on IO
thread). Callback receives &mut [u8] — decoders can call
simd_json::from_slice directly without copying.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add workspace dependency entries (serde, serde_json, simd-json, tracing)
and create the mantis-binance crate with zero-allocation BinanceBookTicker
serde schema and spawn stub.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Zero-allocation decoder converting Binance bookTicker JSON into HotEvent
TopOfBook values using FixedI64 parse_decimal_bytes and InstrumentMeta
tick/lot conversion. Supports simd-json feature gate with serde_json
fallback. Includes 4 tests covering happy path, seq increment, malformed
input, and empty input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add mantis-polymarket workspace dependency and crate with zero-allocation
serde schema types for Polymarket book, price_change, and trade messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ice_change

Add registry-backed decoder that converts Polymarket WS messages into
HotEvent values with peek_type() for simd-json-safe type dispatch.
Includes 11 tests covering all message types, unknown tokens, and
malformed input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

This PR removes legacy SPSC Criterion benchmarks and C++/Rust FFI contenders, adds a standalone cross-language benchmarks suite (Rust + C++), introduces new Binance and Polymarket decoder crates, changes transport WebSocket text APIs to a mutable byte-buffer API, adds byte-slice decimal parsing, and supplies many benchmark tooling scripts and CI/workflow adjustments.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **CI / Workflow**<br>`.github/workflows/bench.yml`, `.github/scripts/bench-report.sh` | Removed SPSC job steps and SPSC JSON inputs; bench-report invocation simplified to omit SPSC parameters; report-generation/normalization logic reduced to grouped renderings. |
| **Workspace & top-level config**<br>`Cargo.toml`, `crates/bench/Cargo.toml` | Excluded benchmarks/rust from workspace member resolution; added workspace crates/deps (mantis-binance, mantis-polymarket, serde*, simd-json, tracing, fast-float2, lexical-core); removed SPSC bench feature and replaced with decode. |
| **Removed SPSC infra & FFI**<br>`crates/bench/benches/spsc.rs`, `crates/bench/src/messages.rs`, `crates/bench/src/workloads.rs`, `crates/bench/src/{drogalis_ffi.rs,rigtorp_ffi.rs}`, `crates/bench/cpp/*_ffi.cpp` | Deleted legacy SPSC Criterion bench, message definitions, workloads, and C++/Rust FFI wrappers for rigtorp/drogalis. |
| **New standalone benchmarks (Rust)**<br>`benchmarks/rust/Cargo.toml`, `benchmarks/rust/src/*`, `benchmarks/rust/src/queues/*` | Added Rust SPSC bench crate: harness, Message48, rdtsc, CycleHistogram/stats, queue adapters (mantis-inline/mantis-copy/rtrb), and a CLI runner. |
| **New standalone benchmarks (C++)**<br>`benchmarks/cpp/CMakeLists.txt`, `benchmarks/cpp/src/*`, `benchmarks/cpp/src/queues/*` | Added C++ SPSC bench: CMake target, harness, Message48, rdtsc, CycleHistogram, and adapters for rigtorp/drogalis with CLI output JSON. |
| **Bench tooling & scripts**<br>`benchmarks/README.md`, `benchmarks/.gitignore`, `benchmarks/scripts/*` | Added README, .gitignore, and many scripts (run_bench, setup_machine, prepare_system, check_system, perf_*, compare.py, deploy_and_run, etc.). |
| **Bench decoding & fixed-point benches**<br>`crates/bench/benches/decode.rs`, `crates/bench/benches/fixed.rs`, `crates/bench/build.rs` | Added decode and fixed-point benchmarks; removed build logic for deleted C++ FFI contenders and adjusted features. |
| **Binance decoder crate**<br>`crates/binance/Cargo.toml`, `crates/binance/src/{lib.rs,decoder.rs,schema.rs,spawn.rs}` | New mantis-binance crate with zero-allocation schema and BinanceDecoder (optional simd-json); tests included. |
| **Polymarket decoder crate**<br>`crates/polymarket/Cargo.toml`, `crates/polymarket/src/market/*`, `crates/polymarket/src/lib.rs` | New mantis-polymarket crate with PolymarketMarketDecoder, schema, registry usage, and tests. |
| **Transport API & read path changes**<br>`crates/transport/src/ws.rs`, `crates/transport/src/feed.rs`, `crates/transport/src/binance/reference.rs`, `crates/transport/src/polymarket/market.rs`, `crates/transport/tests/*` | Switched text-based API (`read_text` / `FnMut(&str)`) to buffer-based API (`read_bytes(&mut Vec<u8>) -> usize` / `FnMut(&mut [u8])`); added reusable read buffer and updated tests and live logging. |
| **Decimal parsing**<br>`crates/fixed/src/parse.rs` | Added const `parse_decimal_bytes(&[u8])` and made `from_str_decimal` delegate to it; extended tests. |
| **Transport dependency tweak**<br>`crates/transport/Cargo.toml` | Changed rustls feature from ring to aws_lc_rs. |
| **Tests / logging adjustments**<br>`crates/transport/tests/live_feeds.rs`, `crates/transport/tests/feed_thread.rs` | Updated test closures to accept `&mut [u8]` and improved live-test logging to print up to 120 UTF-8 bytes or `<binary>`. |

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FeedThread
    participant WsConnection
    participant Decoder
    participant Results
    rect rgba(200,200,255,0.5)
    Client->>FeedThread: spawn(on_message: FnMut(&mut [u8]))
    end
    FeedThread->>WsConnection: read_bytes(&mut buffer)
    WsConnection-->>FeedThread: bytes_read (usize)
    alt bytes_read > 0
        FeedThread->>Decoder: pass &mut buffer (in-place parse)
        Decoder-->>FeedThread: events / continue? (bool)
        FeedThread->>Results: record histograms / write JSON
    else bytes_read == 0
        FeedThread-->>FeedThread: no-data / continue loop
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • brohamgoham

Poem

🐰 I hopped through crates and scripts tonight,

moved benches out where Rust and C++ write;
bytes now dance where strings once used to sing,
histograms hum and decoders bring spring.
Hooray — CI’s carrots gleam and CI bell rings!

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'feat: venue decoder crates + SPSC benchmark overhaul' accurately and concisely summarizes the main changes: adding venue decoder crates and overhauling the SPSC benchmark infrastructure. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Milerius and others added 2 commits April 7, 2026 08:44
…module

Add benchmarks/rust/ as a standalone binary crate (excluded from workspace)
with CycleHistogram (4096 direct + 20 log-bucketed overflow), percentile
computation, and serializable BenchResult/LatencyResults/SystemInfo structs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…re suite

Replace all single-threaded push+pop benchmarks with proper two-thread
latency measurement using core pinning and rdtsc timestamping.

- Add benchmarks/rust/: standalone Rust binary with mantis-inline,
  mantis-copy, and rtrb adapters, real histogram percentiles, JSON output
- Add benchmarks/cpp/: standalone C++ binary with native rigtorp and
  drogalis adapters (no FFI), same protocol and JSON schema
- Add benchmarks/scripts/: system validation (check_system, prepare_system),
  perf profiling (perf_profile, perf_cache, perf_branch, perf_flamegraph),
  orchestration (run_bench, setup_machine, deploy_and_run), and compare.py
- Remove crates/bench/benches/spsc.rs and all FFI contender code
- Remove crossbeam (MPMC) from SPSC comparisons

Addresses methodology flaws identified by HFT University article:
single-thread measurement, FFI-handicapped contenders, forced copies,
MPMC-vs-SPSC confusion, and Gaussian percentile approximation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

github-actions Bot commented Apr 7, 2026

Benchmark Report

Commit: 47af874026ddb97aa862ef81bc5eb1a10c2a0974

Sequence Lock (mantis-seqlock)

Linux

CPU: AMD EPYC 7763 64-Core Processor | Arch: x86_64 | Compiler: rustc 1.96.0-nightly (bcded3316 2026-04-06)

seqlock_read_contended

Variant ns/op
u64 38.91
msg64 405.86

seqlock_read_uncontended

Variant ns/op
u64 0.63
msg64 1.42
msg128 2.58

seqlock_write

Variant ns/op
u64 1.56
msg64 1.87
msg128 3.12
macOS

CPU: Apple M1 (Virtual) | Arch: arm64 | Compiler: rustc 1.96.0-nightly (bcded3316 2026-04-06)

seqlock_read_contended

Variant ns/op
u64 77.22
msg64 82.35

seqlock_read_uncontended

Variant ns/op
u64 1.54
msg64 2.43
msg128 5.18

seqlock_write

Variant ns/op
u64 7.77
msg64 8.65
msg128 9.25

Fixed-Point Arithmetic (mantis-fixed)

Linux

CPU: AMD EPYC 7763 64-Core Processor | Arch: x86_64 | Compiler: rustc 1.96.0-nightly (bcded3316 2026-04-06)

fixed_checked_add

Variant ns/op
FixedI64_6_ 1.25
raw_i64 1.25

fixed_checked_div

Variant ns/op
trunc 4.98
round 5.92

fixed_checked_mul_trunc

Variant ns/op
D=4 3.91
D=2 4.14
D=6 4.34
D=8 4.35

fixed_decimal_parse

Variant ns/op
f64_roundtrip_short_0.53 12.93
f64_roundtrip_medium_67396.70 15.29
mantis_bytes_short_0.53 20.65
mantis_str_short_0.53 20.65
mantis_bytes_integer_67396 21.27
mantis_bytes_medium_67396.70 30.74
mantis_str_medium_67396.70 30.75
mantis_bytes_long_0.00012345 35.27

fixed_display

Variant ns/op
FixedI64_6_ 49.43

fixed_mul_round_vs_trunc

Variant ns/op
trunc 4.35
round 13.41

fixed_parse

Variant ns/op
short 18.84
integer_only 24.86
full_precision 34.77

fixed_rescale

Variant ns/op
D2_to_D8_widen 0.63
D6_to_D2_trunc 0.92
macOS

CPU: Apple M1 (Virtual) | Arch: arm64 | Compiler: rustc 1.96.0-nightly (bcded3316 2026-04-06)

fixed_checked_add

Variant ns/op
FixedI64_6_ 1.42
raw_i64 1.48

fixed_checked_div

Variant ns/op
trunc 4
round 4.78

fixed_checked_mul_trunc

Variant ns/op
D=8 2.08
D=6 2.32
D=4 2.5
D=2 3.1

fixed_decimal_parse

Variant ns/op
f64_roundtrip_short_0.53 10.2
f64_roundtrip_medium_67396.70 12.87
mantis_bytes_short_0.53 16
mantis_str_short_0.53 18.17
mantis_bytes_integer_67396 19.83
mantis_bytes_medium_67396.70 22.89
mantis_str_medium_67396.70 28.67
mantis_bytes_long_0.00012345 35.51

fixed_display

Variant ns/op
FixedI64_6_ 42.93

fixed_mul_round_vs_trunc

Variant ns/op
trunc 2.14
round 2.59

fixed_parse

Variant ns/op
short 14.74
integer_only 20.81
full_precision 29.91

fixed_rescale

Variant ns/op
D2_to_D8_widen 0.57
D6_to_D2_trunc 0.6

Market-State Engine (mantis-market-state)

Linux

CPU: AMD EPYC 7763 64-Core Processor | Arch: x86_64 | Compiler: rustc 1.96.0-nightly (bcded3316 2026-04-06)

market_state_array_book

Variant ns/op
best_bid 0.94
apply_delta 1.87

market_state_engine

Variant ns/op
micro_price 3.13
process_delta_mid_batch 4.06
process_delta_batch_end 19.61
book_imbalance_5 162.92
macOS

CPU: Apple M1 (Virtual) | Arch: arm64 | Compiler: rustc 1.96.0-nightly (bcded3316 2026-04-06)

market_state_array_book

Variant ns/op
best_bid 0.8
apply_delta 1.49

market_state_engine

Variant ns/op
micro_price 2.49
process_delta_mid_batch 3.39
process_delta_batch_end 37.69
book_imbalance_5 147.92

@Milerius force-pushed the feat/venue-decoder-crates branch from 3d1fa1c to 37b8138 on April 7, 2026 06:44
Milerius and others added 3 commits April 7, 2026 08:47
These files were committed before the .gitignore rule was added.
Remove them from tracking — they remain on disk but are ignored.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update fastrand 2.4.0 -> 2.4.1 (yanked crate)
- Fix live_feeds test: use str::from_utf8 for &mut [u8] callback Display
- Fix mantis-binance doc: remove intra-doc link to HotEvent::TopOfBook

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov Bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 92.48879% with 67 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.57%. Comparing base (8cf8901) to head (1c472ad).
⚠️ Report is 41 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| crates/polymarket/src/market/decoder.rs | 94.97% | 17 Missing and 6 partials ⚠️ |
| crates/queue/src/handle.rs | 0.00% | 18 Missing ⚠️ |
| crates/queue/src/copy_ring/mod.rs | 0.00% | 6 Missing ⚠️ |
| crates/queue/src/presets.rs | 0.00% | 6 Missing ⚠️ |
| crates/binance/src/decoder.rs | 97.31% | 5 Missing ⚠️ |
| crates/platform/src/pad.rs | 76.47% | 3 Missing and 1 partial ⚠️ |
| crates/transport/src/ws.rs | 70.00% | 3 Missing ⚠️ |
| crates/queue/src/engine.rs | 98.80% | 1 Missing ⚠️ |
| crates/transport/src/polymarket/market.rs | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main      #28      +/-   ##
==========================================
- Coverage   95.84%   95.57%   -0.27%     
==========================================
  Files          75       77       +2     
  Lines        9183    10009     +826     
  Branches     9183    10009     +826     
==========================================
+ Hits         8801     9566     +765     
- Misses        338      393      +55     
- Partials       44       50       +6     
| Files with missing lines | Coverage Δ |
|---|---|
| crates/core/src/lib.rs | 91.54% <100.00%> (+1.71%) ⬆️ |
| crates/fixed/src/parse.rs | 94.22% <100.00%> (+1.24%) ⬆️ |
| crates/queue/src/copy_ring/engine.rs | 99.68% <100.00%> (ø) |
| crates/queue/src/copy_ring/raw/mod.rs | 95.45% <ø> (ø) |
| crates/queue/src/raw/mod.rs | 100.00% <100.00%> (ø) |
| crates/queue/src/raw/slot.rs | 100.00% <100.00%> (ø) |
| crates/queue/src/storage.rs | 89.47% <ø> (ø) |
| crates/transport/src/binance/reference.rs | 91.48% <100.00%> (ø) |
| crates/transport/src/feed.rs | 89.14% <100.00%> (+0.08%) ⬆️ |
| crates/queue/src/engine.rs | 99.45% <98.80%> (-0.55%) ⬇️ |

... and 8 more

... and 1 file with indirect coverage changes


Milerius and others added 2 commits April 7, 2026 09:01
ring 0.17 fails to compile on nightly + macOS Xcode 16.4. The new
venue decoder crates (binance, polymarket) pull ring into the bench
crate transitively. Switch to aws-lc-rs which compiles cleanly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove dead SPSC normalize/render functions from bench-report.sh
- Update positional args (no longer expects spsc report as $1/$2)
- Fix bench.yml to pass correct args to bench-report.sh
- Remove stale comment in criterion-to-json.sh

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 14

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/transport/src/polymarket/market.rs (1)

3-5: ⚠️ Potential issue | 🟡 Minor

Update docs to match byte-slice callback API.

The docs still describe “text frames,” but the callback now receives &mut [u8]. Please update wording to avoid API confusion.

✏️ Suggested doc fix
-//! forwards raw JSON text frames via the `FeedThread` callback.
+//! forwards raw JSON byte frames via the `FeedThread` callback.
@@
-/// The callback receives each raw JSON text frame (excluding `"PONG"`
+/// The callback receives each raw JSON byte frame (excluding `"PONG"`

Also applies to: 41-43

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/transport/src/polymarket/market.rs` around lines 3 - 5, Update the
module docs in crates/transport/src/polymarket/market.rs to describe the
callback API as receiving raw byte slices rather than “text frames”: replace
references to “text frames” with wording like “raw JSON byte slices (&mut [u8])”
and mention that the FeedThread callback is invoked with a mutable byte slice
containing the JSON payload; also apply the same wording change to the other doc
occurrences around lines 41-43 to avoid API confusion.
🧹 Nitpick comments (10)
crates/fixed/src/parse.rs (1)

93-135: Orphaned documentation block.

Lines 93-112 contain documentation that was originally for from_str_decimal, but this doc block is now orphaned — it precedes the new parse_decimal_bytes documentation (lines 113-135) and doesn't attach to any function. The from_str_decimal function at line 270 has its own documentation.

Consider removing lines 93-112 to avoid documentation duplication and confusion.

♻️ Proposed fix
 #[expect(
     clippy::cast_possible_truncation,
     reason = "i128-to-i64 cast is guarded by range check"
 )]
 impl<const D: u8> FixedI64<D> {
-    /// Parse a decimal string like `"1.23"` or `"-0.5"` into `FixedI64<D>`.
-    ///
-    /// Accepts:
-    /// - Optional sign prefix (`-` or `+`)
-    /// - Integer-only: `"123"`
-    /// - Decimal: `"1.23"`, `".5"`, `"0.5"`
-    /// - Leading zeros: `"007.50"`
-    ///
-    /// Rejects:
-    /// - Empty strings
-    /// - Exponent notation (`"1e5"`)
-    /// - Double dots (`"1..2"`)
-    /// - Non-digit characters
-    /// - Fractional digits exceeding D
-    ///
-    /// # Errors
-    ///
-    /// Returns [`ParseFixedError::InvalidFormat`] for malformed input,
-    /// [`ParseFixedError::Overflow`] if the value exceeds `i64` range,
-    /// or [`ParseFixedError::ExcessPrecision`] if fractional digits exceed D.
     /// Parse a decimal byte slice like `b"1.23"` or `b"-0.5"` into `FixedI64<D>`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/fixed/src/parse.rs` around lines 93 - 135, Remove the
duplicated/orphaned doc block that describes from_str_decimal which currently
sits immediately before the parse_decimal_bytes docs (the earlier block that
starts the duplicate description); delete that orphaned documentation so only
the parse_decimal_bytes doc (and the existing from_str_decimal doc attached to
the from_str_decimal function) remain, ensuring no duplicate or unattached docs
are left around parse_decimal_bytes or from_str_decimal.
benchmarks/scripts/prepare_system.sh (2)

87-88: Avoid variables in printf format string.

Using variables directly in the printf format string can cause issues if they contain % characters. Use %b to interpret escape sequences safely.

🧹 Proposed fix
-printf "\n${GRN}System prepared for benchmarking.${RST}\n"
+printf "\n%bSystem prepared for benchmarking.%b\n" "$GRN" "$RST"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/scripts/prepare_system.sh` around lines 87 - 88, The printf calls
printing colored status messages use variables directly in the format string
which can break if they contain % sequences; update the two printf invocations
(the lines that currently call printf "\n${GRN}System prepared for
benchmarking.${RST}\n" and printf "Run ./check_system.sh to verify.\n") to use a
safe format specifier (e.g. "%b\n" or "%s\n") and pass the composed string or
variables as arguments so escape sequences and % chars are interpreted safely
rather than treated as format directives.

18-21: Remove unused RED variable.

The RED variable is defined but never used in the script. This is flagged by shellcheck (SC2034).

🧹 Proposed fix
-RED='\033[0;31m'
 GRN='\033[0;32m'
 BLD='\033[1m'
 RST='\033[0m'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/scripts/prepare_system.sh` around lines 18 - 21, Remove the unused
RED variable definition: delete the RED='\033[0;31m' line (symbol RED) from the
script so only the used color variables (GRN, BLD, RST) remain; confirm there
are no other references to RED elsewhere in the script and run shellcheck to
verify SC2034 is resolved.
benchmarks/rust/src/queues/mantis_inline.rs (1)

69-74: Match the empty-pop error explicitly.

Err(_) => false treats every try_pop failure as "queue empty". If mantis_queue already has, or later adds, a non-empty error case, the harness will spin forever instead of surfacing it.

As per coding guidelines: "No wildcard matches — use explicit destructuring in pattern matching".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/rust/src/queues/mantis_inline.rs` around lines 69 - 74, The match
on self.0.try_pop() uses a wildcard Err(_) which hides non-empty errors; change
it to explicitly match the queue-empty variant (e.g., Err(<EmptyVariant>) =>
false) and handle any other Err(e) separately (for example by panicking or
returning/propagating the error) so unexpected errors are surfaced. Update the
match arms around the call to self.0.try_pop() to use the concrete empty variant
from the mantis_queue error enum and a separate Err(e) arm that
logs/panics/returns the error.
crates/polymarket/src/market/schema.rs (1)

1-1: Doc comment slightly overstates "zero-allocation".

The Vec<BookLevel<'a>> fields in PolymarketBookMsg allocate heap memory for the level vectors. Consider updating to "low-allocation" or "zero-copy string" to be more precise, since the borrowed &'a str fields do avoid per-field string allocations.

📝 Suggested doc update
-//! Zero-allocation serde structs for Polymarket WebSocket messages.
+//! Low-allocation serde structs for Polymarket WebSocket messages.
+//!
+//! String fields borrow from the input buffer; only `Vec` fields allocate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/schema.rs` at line 1, The crate doc comment
claims "Zero-allocation serde structs", but PolymarketBookMsg contains
Vec<BookLevel<'a>> which allocates heap memory; change the module-level doc to a
more accurate phrase like "low-allocation" or "zero-copy string serde structs"
and mention that fields use borrowed & 'a str to avoid per-field string
allocations; update references to PolymarketBookMsg and BookLevel<'a> in the
comment to reflect this more precise wording.
benchmarks/cpp/src/harness.hpp (1)

75-106: consumer_done atomic is set but never read.

The consumer_done flag is stored on line 106 but never checked. The consumer_thread.join() on line 130 already ensures the consumer completes. Consider removing consumer_done to simplify the code.

🧹 Suggested cleanup
     CycleHistogram result_hist;
-    std::atomic<bool> consumer_done{false};

     // Spawn consumer thread
     std::thread consumer_thread([&]() {
         // ... existing code ...

         result_hist = histogram;
-        consumer_done.store(true, std::memory_order_release);
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/cpp/src/harness.hpp` around lines 75 - 106, The consumer_done
atomic is written by the consumer thread but never read anywhere; remove the
unused variable and its stores to simplify the logic: delete the
std::atomic<bool> consumer_done declaration and remove the
consumer_done.store(true, std::memory_order_release) call in the consumer thread
body (leave consumer_thread.join() and existing result_hist assignment intact),
ensuring no other code references consumer_done before committing.
benchmarks/cpp/src/main.cpp (2)

56-60: atoi/atoll silently return 0 on invalid input.

Invalid numeric arguments (e.g., --messages abc) will be silently accepted as 0, which could lead to confusing benchmark behavior. Consider using strtoul/strtoull with error checking, or at minimum validate that parsed values are positive where expected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/cpp/src/main.cpp` around lines 56 - 60, The parsing of numeric CLI
args (when match(...) and next() set args.producer_core, args.consumer_core,
args.messages, args.warmup, args.runs) uses atoi/atoll which silently yields 0
on invalid input; replace these with strtoul/strtoull calls, capture the endptr
and errno, validate that endptr consumed the whole token and the result is in
the expected range (>0 for counts/cores), and on error emit a clear message and
exit or set a safe default. Ensure you cast the validated unsigned long/unsigned
long long to the target types (unsigned / uint64_t) only after checking for
overflow/underflow.

91-97: Missing p99.99 percentile in output.

The summary prints p999 (99.9th percentile) but the comment mentions "percentile computation via histograms" with p99.99. Consider adding the p99.99 percentile for consistency with typical latency reporting, or clarify if this is intentional.

📊 Suggested enhancement
-    fprintf(stderr, "  run %u/%u: p50=%lu p99=%lu p999=%lu max=%lu mean=%.1f cycles/op\n",
+    fprintf(stderr, "  run %u/%u: p50=%lu p99=%lu p999=%lu p9999=%lu max=%lu mean=%.1f cycles/op\n",
             run, args.runs,
             static_cast<unsigned long>(hist.percentile(50.0)),
             static_cast<unsigned long>(hist.percentile(99.0)),
             static_cast<unsigned long>(hist.percentile(99.9)),
+            static_cast<unsigned long>(hist.percentile(99.99)),
             static_cast<unsigned long>(hist.max()),
             hist.mean());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/cpp/src/main.cpp` around lines 91 - 97, The fprintf summary
currently prints p50/p99/p999/max/mean using the fprintf call that references
hist.percentile(50.0), hist.percentile(99.0), hist.percentile(99.9), hist.max(),
and hist.mean(); update that fprintf invocation to include the 99.99th
percentile by calling hist.percentile(99.99) and adding a corresponding format
token and label (e.g., add "p99.99=%lu" or rename labels to clarify p99.9 vs
p99.99) so the arguments list includes static_cast<unsigned
long>(hist.percentile(99.99)) alongside the existing percentile casts (still
using run and args.runs, hist.max(), hist.mean()).
benchmarks/scripts/run_bench.sh (1)

29-39: Argument parsing can fail silently on missing values at end of args.

If the last argument is a flag like --messages without a value, shift 2 will fail due to set -u, but the error message from next() won't be shown since this is bash, not the C++ parser. This is acceptable given set -euo pipefail, but consider adding validation after parsing to ensure numeric args are valid.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/scripts/run_bench.sh` around lines 29 - 39, The argument parsing
loop can leave flags like --messages without values (causing silent failures
under set -euo pipefail); after the while/case block, add explicit validation
for the parsed variables (PRODUCER_CORE, CONSUMER_CORE, MESSAGES, WARMUP, RUNS)
to ensure they are set and contain valid integers (e.g., non-empty and matching
/^[0-9]+$/), and call usage (the existing usage function) with an error if any
check fails; also consider validating that no trailing single flag remained by
checking "$#" or the raw args before shifting to surface a clear error message
for missing values.
crates/bench/benches/decode.rs (1)

52-65: Consider adding throughput measurement for better cross-hardware comparison.

Per coding guidelines, benchmarks should export JSON for cross-hardware comparison. While Criterion handles JSON export, adding throughput annotations would make comparisons more meaningful across different hardware.

📈 Suggested enhancement
 fn bench_binance_decode(c: &mut Criterion) {
     let mut group = c.benchmark_group("decode/binance_book_ticker");
+    let json_size = binance_json().len();
+    group.throughput(criterion::Throughput::Bytes(json_size as u64));

     group.bench_function("full_pipeline", |b| {

Based on learnings: "All benchmarks must export JSON for cross-hardware comparison"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/bench/benches/decode.rs` around lines 52 - 65, The benchmark lacks a
throughput annotation for cross-hardware comparison; update bench_binance_decode
to call group.throughput(...) before group.bench_function, using an appropriate
Throughput unit (e.g., Throughput::Bytes with the JSON buffer size or
Throughput::Elements for per-item work) computed from binance_json() (or a
constant representing its length), and ensure the Throughput type is imported so
Criterion will include throughput in its JSON export when running the
full_pipeline benchmark.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benchmarks/cpp/src/stats.hpp`:
- Around line 22-39: The overflow bucket upper bound currently returns the
exclusive top of the bucket (and overshoots by one), so update
overflow_upper_bound to return the inclusive upper bound (i.e. (DIRECT_BUCKETS
<< (i+1)) - 1) for i < LOG_BUCKETS-1 and still return
numeric_limits<uint64_t>::max() for the final bucket; keep using DIRECT_BUCKETS
and LOG_BUCKETS to locate the calculation. Also review log_bucket usage to
ensure it is only called for values >= 4096 (shifted != 0) and leave its
bit-computation logic as-is; this guarantees percentiles coming from overflow_
report the bucket’s true inclusive top rather than an off-by-one/exclusive
value.
- Around line 3-10: Add the missing header for std::min: include <algorithm> at
the top of the file (stats.hpp) so std::min is available; update the existing
include block that currently has <array>, <cmath>, <cstdint>, <cstdio>,
<fstream>, <limits>, <sstream>, <string> to also include <algorithm> to avoid
relying on transitive includes.

In `@benchmarks/README.md`:
- Around line 97-103: The README example for cache profiling is missing the
required benchmark command argument for scripts/perf_cache.sh; update the second
code block so the perf_cache.sh invocation includes the benchmark command
placeholder (e.g., append " -- <bench-command...>" or a concrete example like "
-- ./scripts/bench.sh" after the flags) so that scripts/perf_cache.sh receives
the expected <bench-command...>; reference the scripts/perf_cache.sh invocation
in the README and mirror the same structure used in the perf_profile.sh example.

In `@benchmarks/rust/src/harness.rs`:
- Around line 13-20: The pin_to_core function should be rewritten to return
Result<(), String> and use let...else (or ok_or/ok_or_else) to early-return on
failures when calling core_affinity::get_core_ids and finding the target core,
and must check the bool result of core_affinity::set_for_current and return an
Err if it fails; update callers (where consumer and producer threads call
pin_to_core) to propagate or handle the Result (do not silently ignore
failures), and add a validation check where producer_core and consumer_core are
parsed/used to ensure they are distinct and return an error if they are equal so
cross-core measurements are not run with the same core IDs.

In `@benchmarks/rust/src/rdtsc.rs`:
- Around line 7-10: The unsafe block calling core::arch::x86_64::_mm_lfence()
and core::arch::x86_64::_rdtsc() lacks the required safety rationale; add a
single-line comment starting with "// SAFETY:" immediately above the unsafe
block that documents the invariants (this code relies on x86_64 architecture and
availability of RDTSC/LFENCE, requires caller to run on that ISA, and that these
intrinsics do not violate memory or UB assumptions), what guarantee is being
upheld (ordering and timestamp read semantics), and the failure mode if the
invariant is not met (undefined behavior or incorrect timing on non-x86_64).
Ensure the comment references the intrinsics _mm_lfence and _rdtsc so reviewers
can map the rationale to the unsafe calls.
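A sketch of the requested SAFETY rationale; the function name is illustrative and the non-x86_64 fallback exists only so the sketch builds everywhere (the real harness is x86_64-only):

```rust
#[cfg(target_arch = "x86_64")]
fn rdtsc_serialized() -> u64 {
    // SAFETY: x86_64-only code path — _mm_lfence and _rdtsc exist on every
    // x86_64 CPU this harness targets; lfence orders prior loads before the
    // timestamp is read, and neither intrinsic reads or writes memory, so no
    // aliasing or UB invariants apply. Running this on a non-x86_64 ISA would
    // be undefined behavior, which the cfg gate prevents.
    unsafe {
        core::arch::x86_64::_mm_lfence();
        core::arch::x86_64::_rdtsc()
    }
}

// Deterministic stand-in so the sketch compiles and runs on non-x86_64 hosts.
#[cfg(not(target_arch = "x86_64"))]
fn rdtsc_serialized() -> u64 {
    use std::sync::atomic::{AtomicU64, Ordering};
    static FAKE: AtomicU64 = AtomicU64::new(1);
    FAKE.fetch_add(1, Ordering::Relaxed)
}

fn main() {
    let t1 = rdtsc_serialized();
    let t2 = rdtsc_serialized();
    assert!(t2 >= t1, "timestamps should be monotonic on one thread");
}
```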

In `@benchmarks/rust/src/stats.rs`:
- Around line 56-83: The percentile() routine currently returns the
exclusive/log-bucket upper bound (via overflow_upper_bound) which overstates
values (e.g. 5000→8192) and is off-by-one for documented buckets; update
percentile() so that when walking overflow (the overflow array / LOG_BUCKETS) it
returns the inclusive upper bound (overflow_upper_bound(i) - 1) and clamp that
value by self.max (use saturating_sub/min to avoid underflow), and similarly
apply the same inclusive-bound logic to the other percentile code path noted
(the duplicate at lines ~115-122) so exported tail latencies reflect the real
bucket value range (e.g., bucket 0 -> 4096..8191).
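The inclusive-bound-plus-clamp logic can be sketched as follows; the constants and helper names are assumptions mirroring the review, not the actual stats.rs fields:

```rust
const DIRECT_BUCKETS: u64 = 4096; // assumed value for illustration
const LOG_BUCKETS: usize = 16;    // assumed value for illustration

/// Exclusive upper boundary of overflow bucket `i` (u64::MAX for the last).
fn overflow_upper_bound(i: usize) -> u64 {
    if i >= LOG_BUCKETS - 1 {
        u64::MAX
    } else {
        DIRECT_BUCKETS << (i + 1)
    }
}

/// Inclusive top of overflow bucket `i`, clamped to the observed max.
fn bucket_value(i: usize, observed_max: u64) -> u64 {
    let ub = overflow_upper_bound(i);
    // Final bucket keeps u64::MAX; others drop to the inclusive top.
    let inclusive = if ub == u64::MAX { ub } else { ub - 1 };
    inclusive.min(observed_max)
}

fn main() {
    // Bucket 0 covers 4096..8191; a recorded 5000 must not report as 8192.
    assert_eq!(bucket_value(0, 5000), 5000);
    assert_eq!(bucket_value(0, u64::MAX), 8191);
}
```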

In `@benchmarks/scripts/compare.py`:
- Around line 28-33: group_by_implementation currently groups solely by
r["implementation"] which can mix runs with different metadata; update it to
either validate that key config fields (e.g., r["language"], r["capacity"],
r["message_size"], r["pinning"], r["host_cpu"] or similar keys present in
results) are identical for all entries in a group and raise an error if they
mismatch, or include those fields in the grouping key so runs are only collapsed
when all relevant config fields match; also apply the same validation/adjustment
where print_system_info or any other grouping logic aggregates by implementation
to ensure the “Runs per implementation” and reported CPU/core/message metadata
correspond to homogeneous benchmark configs.
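A sketch of the composite-grouping-key variant; the config field names are assumptions about the result JSON schema, not verified keys:

```python
from collections import defaultdict

# Hypothetical config fields that must match for runs to be aggregated.
CONFIG_FIELDS = ("language", "capacity", "message_size", "pinning")

def group_by_implementation(results):
    """Group runs only when implementation AND config fields all match."""
    groups = defaultdict(list)
    for r in results:
        key = (r["implementation"],) + tuple(r.get(f) for f in CONFIG_FIELDS)
        groups[key].append(r)
    return groups

runs = [
    {"implementation": "mantis", "language": "rust", "capacity": 1024,
     "message_size": 48, "pinning": "2,3"},
    {"implementation": "mantis", "language": "rust", "capacity": 4096,
     "message_size": 48, "pinning": "2,3"},
]
# Same implementation but different capacity -> two distinct groups.
assert len(group_by_implementation(runs)) == 2
```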

In `@benchmarks/scripts/deploy_and_run.sh`:
- Around line 70-73: The current rsync pulls all remote results into a single
directory and causes overwrites; update the target path to include a
host-specific namespace by creating and syncing into a host subdirectory (use
BENCH_DIR, REMOTE_HOST and REMOTE_DIR from the script) — e.g., ensure mkdir -p
"$BENCH_DIR/results/$REMOTE_HOST" and change the rsync destination to
"$BENCH_DIR/results/$REMOTE_HOST/" (alternatively, prepend REMOTE_HOST to
artifact filenames on the remote side before pulling) so each host’s results are
isolated.
- Around line 63-65: The remote invocation currently collapses BENCH_ARGS into a
single string causing re-splitting and injection; change the call so you do not
expand BENCH_ARGS into the remote shell string but instead stream/exec the
run_bench.sh on the remote host and pass BENCH_ARGS as separate positional
parameters so they arrive unchanged. Concretely, stop using "${BENCH_ARGS[*]}"
in the ssh command that references REMOTE_HOST and REMOTE_DIR and instead invoke
the remote bash with the script fed via stdin (or executed remotely) while
supplying BENCH_ARGS as separate arguments so run_bench.sh receives them as
"$@"; ensure the remote invocation uses the standard "--" convention to
terminate options and still performs the cd/chmod steps before running
run_bench.sh.
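A local simulation of the "feed the script via stdin, pass BENCH_ARGS as positional parameters" suggestion; `bash -s` stands in for the ssh invocation so the argv behavior is observable without a remote host:

```shell
run_bench_stdin() {
  # Stand-in for: ssh "$REMOTE_HOST" bash -s -- "$@" < run_bench.sh
  bash -s -- "$@" <<'EOF'
printf 'argc=%s\n' "$#"
printf '[%s]\n' "$@"
EOF
}

BENCH_ARGS=(--messages 1000000 --label "two words")
out="$(run_bench_stdin "${BENCH_ARGS[@]}")"
printf '%s\n' "$out"
```

Each array element arrives as one positional parameter, so "two words" stays a single argument instead of being re-split (over real ssh, additional quoting such as `printf %q` would still be needed — that caveat goes beyond the review text).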

In `@benchmarks/scripts/run_bench.sh`:
- Around line 117-120: Add an existence check for the compare.py script before
invoking it: test for the file at "$SCRIPT_DIR/compare.py" (using -f or -x as
appropriate) and if missing print a clear error to STDERR and exit non‑zero;
otherwise run python3 "$SCRIPT_DIR/compare.py" "$RESULTS_DIR" | tee
"$RESULTS_DIR/comparison.md". Reference the existing variables SCRIPT_DIR and
RESULTS_DIR and the compare.py filename so the check is colocated with the
current Step 7 invocation.
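A sketch of the guard, with mktemp directories standing in for the real SCRIPT_DIR/RESULTS_DIR so the missing-file branch is demonstrable:

```shell
run_comparison() {
  local script_dir="$1" results_dir="$2"
  if [ ! -f "$script_dir/compare.py" ]; then
    echo "error: $script_dir/compare.py not found" >&2
    return 1
  fi
  python3 "$script_dir/compare.py" "$results_dir" | tee "$results_dir/comparison.md"
}

SCRIPT_DIR="$(mktemp -d)"; RESULTS_DIR="$(mktemp -d)"  # stand-ins for the script's dirs
run_comparison "$SCRIPT_DIR" "$RESULTS_DIR" 2>/dev/null && status=ran || status=guarded
echo "$status"
```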

In `@benchmarks/scripts/setup_machine.sh`:
- Around line 11-25: The install_packages function currently omits git and curl
so subsequent curl and git clone calls will fail; update install_packages to
include git and curl in the package lists for each package manager branch
(apt-get: add git and curl to the sudo apt-get install line and linux-tools
list, dnf: add git and curl to sudo dnf install, pacman: add git and curl to
sudo pacman -Sy), and also update the fallback warning to mention git and curl
as required tools so a manual install reminder is accurate.

In `@crates/polymarket/src/market/decoder.rs`:
- Around line 61-65: The dispatcher currently treats "tick_size_change" as
unsupported (falls to _ => 0) so tick-size updates never reach the decoder; add
a branch for "tick_size_change" that calls a new handler (e.g.
self.decode_tick_size_change(buf, recv_ts, out)) in the same match where
decode_price_change/decode_trade/decode_book are called, and mirror that change
in the other dispatcher instance referenced around lines 516-521. Implement
decode_tick_size_change to parse the tick-size payload and update the instrument
metadata/state used by price_to_ticks/qty_to_lots (or notify the instrument
registry), ensuring subsequent conversions use the new tick size. Use the
existing function names (decode_price_change, decode_trade, decode_book) as
references when adding the "tick_size_change" branch and implement a
decode_tick_size_change method that updates the shared instrument map/state.
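A minimal sketch of the dispatcher branch; the handler signature and the metadata-update shape are assumptions, not the real decoder API:

```rust
use std::collections::HashMap;

struct Decoder {
    // Hypothetical shared metadata: asset_id -> tick size, consulted by
    // price_to_ticks-style conversions.
    tick_sizes: HashMap<String, f64>,
}

impl Decoder {
    fn dispatch(&mut self, msg_type: &str, asset: &str, payload: &str) -> usize {
        match msg_type {
            "price_change" | "trade" | "book" => 1, // existing handlers elided
            "tick_size_change" => self.decode_tick_size_change(asset, payload),
            _ => 0,
        }
    }

    fn decode_tick_size_change(&mut self, asset: &str, new_tick: &str) -> usize {
        // Update shared instrument state so later conversions use the new tick.
        if let Ok(t) = new_tick.parse::<f64>() {
            self.tick_sizes.insert(asset.to_string(), t);
        }
        0 // metadata update only; no hot events emitted
    }
}

fn main() {
    let mut d = Decoder { tick_sizes: HashMap::new() };
    let _ = d.dispatch("tick_size_change", "0xabc", "0.001");
    assert_eq!(d.tick_sizes.get("0xabc"), Some(&0.001));
}
```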
- Around line 170-245: The code currently computes `total` from raw JSON levels
and sets EventFlags::LAST_IN_BATCH inside the `for` loops using that raw count,
which misses cases where `parse_price_qty::<D>` continues or emission stops at
64; instead, stop setting LAST_IN_BATCH when constructing each
`HotEvent::book_delta` (only use EventFlags::IS_SNAPSHOT there), and after both
loops, if `count > 0` set the LAST_IN_BATCH flag on the last emitted event (e.g.
mutate or rebuild `out[count-1]` to OR in EventFlags::LAST_IN_BATCH); refer to
`parse_price_qty::<D>`, `HotEvent::book_delta`, `EventFlags::LAST_IN_BATCH`,
`out`, and `count` to locate and implement this change.

---

Outside diff comments:
In `@crates/transport/src/polymarket/market.rs`:
- Around line 3-5: Update the module docs in
crates/transport/src/polymarket/market.rs to describe the callback API as
receiving raw byte slices rather than “text frames”: replace references to “text
frames” with wording like “raw JSON byte slices (&mut [u8])” and mention that
the FeedThread callback is invoked with a mutable byte slice containing the JSON
payload; also apply the same wording change to the other doc occurrences around
lines 41-43 to avoid API confusion.

---

Nitpick comments:
In `@benchmarks/cpp/src/harness.hpp`:
- Around line 75-106: The consumer_done atomic is written by the consumer thread
but never read anywhere; remove the unused variable and its stores to simplify
the logic: delete the std::atomic<bool> consumer_done declaration and remove the
consumer_done.store(true, std::memory_order_release) call in the consumer thread
body (leave consumer_thread.join() and existing result_hist assignment intact),
ensuring no other code references consumer_done before committing.

In `@benchmarks/cpp/src/main.cpp`:
- Around line 56-60: The parsing of numeric CLI args (when match(...) and next()
set args.producer_core, args.consumer_core, args.messages, args.warmup,
args.runs) uses atoi/atoll which silently yields 0 on invalid input; replace
these with strtoul/strtoull calls, capture the endptr and errno, validate that
endptr consumed the whole token and the result is in the expected range (>0 for
counts/cores), and on error emit a clear message and exit or set a safe default.
Ensure you cast the validated unsigned long/unsigned long long to the target
types (unsigned / uint64_t) only after checking for overflow/underflow.
- Around line 91-97: The fprintf summary currently prints p50/p99/p999/max/mean
using the fprintf call that references hist.percentile(50.0),
hist.percentile(99.0), hist.percentile(99.9), hist.max(), and hist.mean();
update that fprintf invocation to include the 99.99th percentile by calling
hist.percentile(99.99) and adding a corresponding format token and label (e.g.,
add "p99.99=%lu" or rename labels to clarify p99.9 vs p99.99) so the arguments
list includes static_cast<unsigned long>(hist.percentile(99.99)) alongside the
existing percentile casts (still using run and args.runs, hist.max(),
hist.mean()).

In `@benchmarks/rust/src/queues/mantis_inline.rs`:
- Around line 69-74: The match on self.0.try_pop() uses a wildcard Err(_) which
hides non-empty errors; change it to explicitly match the queue-empty variant
(e.g., Err(<EmptyVariant>) => false) and handle any other Err(e) separately (for
example by panicking or returning/propagating the error) so unexpected errors
are surfaced. Update the match arms around the call to self.0.try_pop() to use
the concrete empty variant from the mantis_queue error enum and a separate
Err(e) arm that logs/panics/returns the error.
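A sketch of the explicit-variant match; `PopError::Empty` is a stand-in for whatever empty variant mantis_queue's try_pop actually exposes:

```rust
#[derive(Debug)]
enum PopError {
    Empty,
    Poisoned, // hypothetical non-empty failure for illustration
}

fn try_pop() -> Result<u64, PopError> {
    Err(PopError::Empty) // stand-in for self.0.try_pop()
}

fn pop_one() -> bool {
    match try_pop() {
        Ok(_) => true,
        Err(PopError::Empty) => false, // expected fast-path miss
        // Any other error is surfaced instead of silently swallowed.
        Err(e) => panic!("unexpected pop error: {e:?}"),
    }
}

fn main() {
    assert!(!pop_one());
}
```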

In `@benchmarks/scripts/prepare_system.sh`:
- Around line 87-88: The printf calls printing colored status messages use
variables directly in the format string which can break if they contain %
sequences; update the two printf invocations (the lines that currently call
printf "\n${GRN}System prepared for benchmarking.${RST}\n" and printf "Run
./check_system.sh to verify.\n") to use a safe format specifier (e.g. "%b\n" or
"%s\n") and pass the composed string or variables as arguments so escape
sequences and % chars are interpreted safely rather than treated as format
directives.
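The safe-format-specifier pattern in one line, using a message containing a literal `%` to show why variables must not land in the format string:

```shell
GRN='\033[0;32m'; RST='\033[0m'
msg="100% done"   # a '%' here would be misparsed as a format directive
# Unsafe: printf "${GRN}${msg}${RST}\n"
out="$(printf '%b%s%b\n' "$GRN" "$msg" "$RST")"   # %b expands escapes, %s is literal
printf '%s\n' "$out"
```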
- Around line 18-21: Remove the unused RED variable definition: delete the
RED='\033[0;31m' line (symbol RED) from the script so only the used color
variables (GRN, BLD, RST) remain; confirm there are no other references to RED
elsewhere in the script and run shellcheck to verify SC2034 is resolved.

In `@benchmarks/scripts/run_bench.sh`:
- Around line 29-39: The argument parsing loop can leave flags like --messages
without values (causing silent failures under set -euo pipefail); after the
while/case block, add explicit validation for the parsed variables
(PRODUCER_CORE, CONSUMER_CORE, MESSAGES, WARMUP, RUNS) to ensure they are set
and contain valid integers (e.g., non-empty and matching /^[0-9]+$/), and call
usage (the existing usage function) with an error if any check fails; also
consider validating that no trailing single flag remained by checking "$#" or
the raw args before shifting to surface a clear error message for missing
values.

In `@crates/bench/benches/decode.rs`:
- Around line 52-65: The benchmark lacks a throughput annotation for
cross-hardware comparison; update bench_binance_decode to call
group.throughput(...) before group.bench_function, using an appropriate
Throughput unit (e.g., Throughput::Bytes with the JSON buffer size or
Throughput::Elements for per-item work) computed from binance_json() (or a
constant representing its length), and ensure the Throughput type is imported so
Criterion will include throughput in its JSON export when running the
full_pipeline benchmark.

In `@crates/fixed/src/parse.rs`:
- Around line 93-135: Remove the duplicated/orphaned doc block that describes
from_str_decimal which currently sits immediately before the parse_decimal_bytes
docs (the earlier block that starts the duplicate description); delete that
orphaned documentation so only the parse_decimal_bytes doc (and the existing
from_str_decimal doc attached to the from_str_decimal function) remain, ensuring
no duplicate or unattached docs are left around parse_decimal_bytes or
from_str_decimal.

In `@crates/polymarket/src/market/schema.rs`:
- Line 1: The crate doc comment claims "Zero-allocation serde structs", but
PolymarketBookMsg contains Vec<BookLevel<'a>> which allocates heap memory;
change the module-level doc to a more accurate phrase like "low-allocation" or
"zero-copy string serde structs" and mention that fields use borrowed & 'a str
to avoid per-field string allocations; update references to PolymarketBookMsg
and BookLevel<'a> in the comment to reflect this more precise wording.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c5611b1f-af32-4fe6-ba47-6f3f631d6f80

📥 Commits

Reviewing files that changed from the base of the PR and between d48029a and 991982e.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • benchmarks/rust/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (62)
  • .github/workflows/bench.yml
  • Cargo.toml
  • benchmarks/.gitignore
  • benchmarks/README.md
  • benchmarks/cpp/CMakeLists.txt
  • benchmarks/cpp/src/harness.hpp
  • benchmarks/cpp/src/main.cpp
  • benchmarks/cpp/src/message.hpp
  • benchmarks/cpp/src/queues/drogalis.hpp
  • benchmarks/cpp/src/queues/rigtorp.hpp
  • benchmarks/cpp/src/rdtsc.hpp
  • benchmarks/cpp/src/stats.hpp
  • benchmarks/results/.gitkeep
  • benchmarks/rust/Cargo.toml
  • benchmarks/rust/src/harness.rs
  • benchmarks/rust/src/main.rs
  • benchmarks/rust/src/message.rs
  • benchmarks/rust/src/queues/mantis_copy.rs
  • benchmarks/rust/src/queues/mantis_inline.rs
  • benchmarks/rust/src/queues/mod.rs
  • benchmarks/rust/src/queues/rtrb_queue.rs
  • benchmarks/rust/src/rdtsc.rs
  • benchmarks/rust/src/stats.rs
  • benchmarks/scripts/check_system.sh
  • benchmarks/scripts/compare.py
  • benchmarks/scripts/deploy_and_run.sh
  • benchmarks/scripts/perf_branch.sh
  • benchmarks/scripts/perf_cache.sh
  • benchmarks/scripts/perf_flamegraph.sh
  • benchmarks/scripts/perf_profile.sh
  • benchmarks/scripts/prepare_system.sh
  • benchmarks/scripts/run_bench.sh
  • benchmarks/scripts/setup_machine.sh
  • crates/bench/Cargo.toml
  • crates/bench/benches/decode.rs
  • crates/bench/benches/fixed.rs
  • crates/bench/benches/spsc.rs
  • crates/bench/build.rs
  • crates/bench/cpp/drogalis_ffi.cpp
  • crates/bench/cpp/rigtorp_ffi.cpp
  • crates/bench/src/drogalis_ffi.rs
  • crates/bench/src/lib.rs
  • crates/bench/src/messages.rs
  • crates/bench/src/rigtorp_ffi.rs
  • crates/bench/src/workloads.rs
  • crates/binance/Cargo.toml
  • crates/binance/src/decoder.rs
  • crates/binance/src/lib.rs
  • crates/binance/src/schema.rs
  • crates/binance/src/spawn.rs
  • crates/fixed/src/parse.rs
  • crates/polymarket/Cargo.toml
  • crates/polymarket/src/lib.rs
  • crates/polymarket/src/market/decoder.rs
  • crates/polymarket/src/market/mod.rs
  • crates/polymarket/src/market/schema.rs
  • crates/polymarket/src/market/spawn.rs
  • crates/transport/src/binance/reference.rs
  • crates/transport/src/feed.rs
  • crates/transport/src/polymarket/market.rs
  • crates/transport/src/ws.rs
  • crates/transport/tests/feed_thread.rs
💤 Files with no reviewable changes (9)
  • crates/bench/build.rs
  • crates/bench/src/lib.rs
  • crates/bench/src/messages.rs
  • crates/bench/src/rigtorp_ffi.rs
  • crates/bench/src/drogalis_ffi.rs
  • crates/bench/cpp/rigtorp_ffi.cpp
  • crates/bench/src/workloads.rs
  • crates/bench/benches/spsc.rs
  • crates/bench/cpp/drogalis_ffi.cpp

Comment thread benchmarks/cpp/src/stats.hpp
Comment thread benchmarks/cpp/src/stats.hpp
Comment thread benchmarks/README.md
Comment thread benchmarks/rust/src/harness.rs Outdated
Comment on lines +13 to +20
fn pin_to_core(core_id: usize) {
let core_ids = core_affinity::get_core_ids().expect("failed to get core ids");
let target = core_ids
.into_iter()
.find(|c| c.id == core_id)
.unwrap_or_else(|| panic!("core {core_id} not found"));
core_affinity::set_for_current(target);
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n benchmarks/rust/src/harness.rs | head -100

Repository: Milerius/Mantis

Length of output: 3634


🏁 Script executed:

rg "core_affinity" -A 3 -B 3 benchmarks/rust/

Repository: Milerius/Mantis

Length of output: 2074


🏁 Script executed:

# Check Cargo.toml to see the core_affinity version
find benchmarks/rust -name "Cargo.toml" -exec cat {} \;

Repository: Milerius/Mantis

Length of output: 433


🌐 Web query:

core_affinity 0.8 set_for_current return type rust

💡 Result:

In core_affinity 0.8.x, set_for_current returns a bool:

pub fn set_for_current(core_id: CoreId) -> bool

It returns false if setting the current thread’s CPU affinity fails. [1]

Source: [1]


Use let...else for early returns and handle set_for_current failure.

Line 19 ignores the bool return value of core_affinity::set_for_current, which fails silently. If affinity setup fails in the consumer thread (line 50), the thread is not actually pinned, but continues to signal readiness at line 58, causing the producer to proceed with unreliable cross-core latency measurements. Additionally, there is no validation that producer_core and consumer_core are distinct, so identical core IDs can also produce misleading results. Rewrite pin_to_core using let...else for early returns per Rust guidelines:

Suggested pattern
fn pin_to_core(core_id: usize) -> Result<(), String> {
    let core_ids = core_affinity::get_core_ids()
        .ok_or("failed to get core ids")?;
    let target = core_ids
        .into_iter()
        .find(|c| c.id == core_id)
        .ok_or_else(|| format!("core {core_id} not found"))?;
    if !core_affinity::set_for_current(target) {
        return Err("failed to set core affinity".to_string());
    }
    Ok(())
}

Then propagate errors in the consumer (line 50) and producer (line 83) call sites, and add validation that producer and consumer cores are distinct (line 27).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/rust/src/harness.rs` around lines 13 - 20, The pin_to_core
function should be rewritten to return Result<(), String> and use let...else (or
ok_or/ok_or_else) to early-return on failures when calling
core_affinity::get_core_ids and finding the target core, and must check the bool
result of core_affinity::set_for_current and return an Err if it fails; update
callers (where consumer and producer threads call pin_to_core) to propagate or
handle the Result (do not silently ignore failures), and add a validation check
where producer_core and consumer_core are parsed/used to ensure they are
distinct and return an error if they are equal so cross-core measurements are
not run with the same core IDs.

Comment thread benchmarks/rust/src/rdtsc.rs Outdated
Comment on lines +70 to +73
mkdir -p "$BENCH_DIR/results"
rsync -avz \
"$REMOTE_HOST:$REMOTE_DIR/results/" \
"$BENCH_DIR/results/"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n benchmarks/scripts/deploy_and_run.sh | head -100

Repository: Milerius/Mantis

Length of output: 3324


🏁 Script executed:

cat -n benchmarks/scripts/run_bench.sh | head -80

Repository: Milerius/Mantis

Length of output: 3259


🏁 Script executed:

cat -n benchmarks/scripts/run_bench.sh | tail -50

Repository: Milerius/Mantis

Length of output: 2170


🏁 Script executed:

find benchmarks -name "*.rs" -type f | head -20

Repository: Milerius/Mantis

Length of output: 367


🏁 Script executed:

grep -n "output" benchmarks/rust/src/main.rs | head -20

Repository: Milerius/Mantis

Length of output: 336


🏁 Script executed:

grep -n "output" benchmarks/cpp/src/main.cpp | head -20

Repository: Milerius/Mantis

Length of output: 362


🏁 Script executed:

sed -n '100,120p' benchmarks/cpp/src/main.cpp

Repository: Milerius/Mantis

Length of output: 716


🏁 Script executed:

cat -n benchmarks/scripts/compare.py | head -50

Repository: Milerius/Mantis

Length of output: 1913


Add host-specific namespace to prevent result overwrites across multiple hosts.

Lines 70–73 sync all remote results to the same local benchmarks/results/ directory. Since the Rust and C++ benchmark binaries generate fixed filenames like rust_mantis_copy_run_1.json and cpp_rigtorp_run_1.json (based only on queue/implementation name and run number), running this script sequentially against multiple hosts causes later hosts to overwrite earlier hosts' results. Data loss occurs because filenames contain no host identifier.

Sync into host-specific subdirectories (e.g., results/$REMOTE_HOST/) or include the hostname in artifact filenames before pulling results back.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/scripts/deploy_and_run.sh` around lines 70 - 73, The current rsync
pulls all remote results into a single directory and causes overwrites; update
the target path to include a host-specific namespace by creating and syncing
into a host subdirectory (use BENCH_DIR, REMOTE_HOST and REMOTE_DIR from the
script) — e.g., ensure mkdir -p "$BENCH_DIR/results/$REMOTE_HOST" and change the
rsync destination to "$BENCH_DIR/results/$REMOTE_HOST/" (alternatively, prepend
REMOTE_HOST to artifact filenames on the remote side before pulling) so each
host’s results are isolated.
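A local simulation of the host-namespace fix; `cp` stands in for the rsync pull, and the filename matches the fixed names cited in the comment above:

```shell
BENCH_DIR="$(mktemp -d)"   # stand-in for the script's BENCH_DIR
for REMOTE_HOST in hostA hostB; do
  remote_results="$(mktemp -d)"   # stand-in for $REMOTE_DIR/results on each host
  echo "$REMOTE_HOST" > "$remote_results/cpp_rigtorp_run_1.json"
  # Host-specific subdirectory prevents the second host overwriting the first.
  mkdir -p "$BENCH_DIR/results/$REMOTE_HOST"
  cp "$remote_results/cpp_rigtorp_run_1.json" "$BENCH_DIR/results/$REMOTE_HOST/"
done
cat "$BENCH_DIR/results/hostA/cpp_rigtorp_run_1.json"   # hostA's file survives
```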

Comment thread benchmarks/scripts/run_bench.sh
Comment thread benchmarks/scripts/setup_machine.sh
Comment thread crates/polymarket/src/market/decoder.rs
Comment thread crates/polymarket/src/market/decoder.rs Outdated
Milerius and others added 2 commits April 7, 2026 09:10
Binance: 5 new tests covering invalid price/qty parse failures.
Polymarket: 8 new tests covering invalid side, price, size, empty
book levels, skipped invalid levels, and default trade aggressor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/polymarket/src/market/decoder.rs (1)

170-245: ⚠️ Potential issue | 🟠 Major

LAST_IN_BATCH incorrectly computed from raw level count, not emitted count.

When parse_price_qty fails (line 182/216 continue) or the buffer fills at 64, the is_last check at lines 185/219 uses the raw total from JSON, not the actual number of successfully emitted events. This can result in no event having LAST_IN_BATCH set.

Example: bids = [invalid, valid], asks = [] → total = 2, but only 1 event emits. At count = 0, is_last = (0 + 1 == 2) = false, so the only emitted event lacks LAST_IN_BATCH.

Proposed fix: set LAST_IN_BATCH post-loop
         for (depth_idx, level) in msg.bids.iter().enumerate() {
             if count >= 64 {
                 break;
             }
             let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
                 continue;
             };
 
-            let is_last = count + 1 == total;
-            let flags = if is_last {
-                EventFlags::IS_SNAPSHOT | EventFlags::LAST_IN_BATCH
-            } else {
-                EventFlags::IS_SNAPSHOT
-            };
+            let flags = EventFlags::IS_SNAPSHOT;
 
             self.seq += 1;
             out[count] = HotEvent::book_delta(
                 recv_ts,
                 SeqNum::from_raw(self.seq),
                 instrument_id,
                 self.source_id,
                 flags,
                 BookDeltaPayload {
                     price,
                     qty,
                     side: Side::Bid,
                     action: UpdateAction::New,
                     depth: depth_idx as u8,
                     _pad: [0; 5],
                 },
             );
             count += 1;
         }
 
         for (depth_idx, level) in msg.asks.iter().enumerate() {
             if count >= 64 {
                 break;
             }
             let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
                 continue;
             };
 
-            let is_last = count + 1 == total;
-            let flags = if is_last {
-                EventFlags::IS_SNAPSHOT | EventFlags::LAST_IN_BATCH
-            } else {
-                EventFlags::IS_SNAPSHOT
-            };
+            let flags = EventFlags::IS_SNAPSHOT;
 
             self.seq += 1;
             out[count] = HotEvent::book_delta(
                 recv_ts,
                 SeqNum::from_raw(self.seq),
                 instrument_id,
                 self.source_id,
                 flags,
                 BookDeltaPayload {
                     price,
                     qty,
                     side: Side::Ask,
                     action: UpdateAction::New,
                     depth: depth_idx as u8,
                     _pad: [0; 5],
                 },
             );
             count += 1;
         }
 
+        if count > 0 {
+            out[count - 1].header.flags |= EventFlags::LAST_IN_BATCH;
+        }
+
         count
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 170 - 245, The
LAST_IN_BATCH flag is computed against the raw JSON total but some levels are
skipped (parse_price_qty) or truncated by the 64-entry limit, so update logic to
set LAST_IN_BATCH based on the actual emitted count: keep emitting as you do in
the bids/asks loops (functions/variables: parse_price_qty, out, count,
HotEvent::book_delta, self.seq, EventFlags::LAST_IN_BATCH), but remove/ignore
the is_last check inside each loop and after both loops, if count > 0 set the
LAST_IN_BATCH bit on the final emitted event in out[count-1] (preserving any
existing IS_SNAPSHOT flag) so the last successfully emitted event gets
LAST_IN_BATCH.
🧹 Nitpick comments (4)
.github/scripts/bench-report.sh (1)

8-13: Keep the static usage comment in sync with the new argument contract.

Runtime usage here is updated, but the header comment still documents the old 2-arg form. Updating that comment will prevent confusion when invoking the script manually.

Proposed doc-only fix
-# Usage: bench-report.sh <linux-json> <macos-json>
+# Usage: bench-report.sh <linux-seqlock> <macos-seqlock> [<linux-fixed> <macos-fixed>] [<linux-market-state> <macos-market-state>]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/bench-report.sh around lines 8 - 13, Update the script
header usage comment to match the new positional-argument contract used by the
variable assignments (linux_seqlock, macos_seqlock, linux_fixed, macos_fixed,
linux_market_state, macos_market_state): replace the old 2-argument usage line
with a single-line usage that documents required first two args and the optional
third/fourth and fifth/sixth pairs (e.g. bench-report.sh <linux-seqlock>
<macos-seqlock> [<linux-fixed> <macos-fixed>] [<linux-market-state>
<macos-market-state>]) so the top-of-file static comment is consistent with how
the script actually parses arguments.
crates/polymarket/src/market/decoder.rs (3)

291-294: Misleading SAFETY comment on safe code.

Line 292's // SAFETY: comment implies an unsafe block, but core::str::from_utf8 is a safe function. Per coding guidelines, // SAFETY: comments should only accompany unsafe blocks. Consider rewording to a regular comment explaining the ASCII assumption.

Suggested rewording
-    // SAFETY: JSON values in the "type" field are always ASCII
+    // Note: JSON "type" field values are ASCII, so UTF-8 conversion is infallible
     core::str::from_utf8(remaining.get(..end)?).ok()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 291 - 294, The comment
preceding the call to core::str::from_utf8 in decoder.rs incorrectly uses a "//
SAFETY:" marker (implying an unsafe block) even though from_utf8 is safe;
replace or reword that comment to a normal explanatory comment (e.g., "// JSON
values in the 'type' field are always ASCII") adjacent to the
core::str::from_utf8(remaining.get(..end)?).ok() call so it documents the ASCII
assumption without using a SAFETY label.

702-723: Test should verify LAST_IN_BATCH flag on the emitted event.

This test validates that invalid levels are skipped, but doesn't check that the single emitted event has LAST_IN_BATCH set. Adding this assertion would expose the LAST_IN_BATCH computation bug flagged earlier.

Suggested assertion
         // Only the valid level should be emitted
         assert_eq!(n, 1);
+        
+        // The sole emitted event should have LAST_IN_BATCH
+        assert!(out[0].header.flags.contains(EventFlags::LAST_IN_BATCH));
 
         if let EventBody::BookDelta(p) = out[0].body {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 702 - 723, Update the
test decode_book_skips_invalid_price_level to also assert the LAST_IN_BATCH flag
is set on the single emitted event: after decoding and before inspecting
EventBody, check the event's flags (e.g., out[0].flags or out[0].meta.flags)
includes EventFlags::LAST_IN_BATCH (or the equivalent bit/constant) so the test
verifies PolymarketMarketDecoder::decode correctly marks the only emitted
BookDelta as last in batch when make_out produces a single slot.

296-318: Consider using memchr crate for hot-path byte scanning.

Since peek_type is documented as a "hot-path optimization", the linear scans in memchr_find and memchr_byte could benefit from the SIMD-accelerated memchr crate. This is optional since the current implementation is correct.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 296 - 318, Replace the
manual linear scans in memchr_find and memchr_byte with the SIMD-accelerated
routines from the memchr crate: add memchr = "..." to Cargo.toml, change
memchr_byte to call memchr::memchr(byte, haystack) and change memchr_find to use
memchr::memmem::Finder::new(needle).find(haystack) (or memchr::memmem::find) to
locate the subslice; keep the same Option<usize> return semantics so callers
such as peek_type continue to work unchanged.
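For illustration, a minimal sketch of the drop-in wrappers, assuming the `memchr` crate is added to Cargo.toml. The crate calls (`memchr::memchr`, `memchr::memmem::find`) are noted in comments; the bodies below use std-only scans so the snippet is self-contained, but either implementation keeps the `Option<usize>` contract that `peek_type` relies on.

```rust
// Hypothetical wrappers mirroring memchr_find/memchr_byte from decoder.rs.
fn memchr_byte(byte: u8, haystack: &[u8]) -> Option<usize> {
    // With the crate: memchr::memchr(byte, haystack)
    haystack.iter().position(|&b| b == byte)
}

fn memchr_find(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    // With the crate: memchr::memmem::find(haystack, needle)
    if needle.is_empty() {
        return Some(0);
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

fn main() {
    // Locate the "type" key and its value separator in a raw JSON frame.
    let frame = br#"{"type":"book","asset_id":"abc"}"#;
    assert_eq!(memchr_find(frame, b"\"type\""), Some(1));
    assert_eq!(memchr_byte(b':', frame), Some(7));
}
```

Because the signatures are unchanged, swapping the std bodies for the crate calls is a one-line edit per function.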
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/polymarket/src/market/decoder.rs`:
- Around line 690-700: The test function decode_book_empty_levels_returns_zero
contains a formatting issue around the raw byte literal
br#"{"type":"book","asset_id":"abc123","bids":[],"asks":[]}"#.to_vec() which
triggers rustfmt; run cargo +nightly fmt --all to auto-fix or manually reformat
the literal and surrounding lines so they follow rustfmt rules (ensure spacing
and line breaks around the br! literal and the to_vec() call are normalized) and
re-run tests; check the PolymarketMarketDecoder::<6>::new and decoder.decode
usages remain unchanged after formatting.

---

Duplicate comments:
In `@crates/polymarket/src/market/decoder.rs`:
- Around line 170-245: The LAST_IN_BATCH flag is computed against the raw JSON
total but some levels are skipped (parse_price_qty) or truncated by the 64-entry
limit, so update logic to set LAST_IN_BATCH based on the actual emitted count:
keep emitting as you do in the bids/asks loops (functions/variables:
parse_price_qty, out, count, HotEvent::book_delta, self.seq,
EventFlags::LAST_IN_BATCH), but remove/ignore the is_last check inside each loop
and after both loops, if count > 0 set the LAST_IN_BATCH bit on the final
emitted event in out[count-1] (preserving any existing IS_SNAPSHOT flag) so the
last successfully emitted event gets LAST_IN_BATCH.

---

Nitpick comments:
In @.github/scripts/bench-report.sh:
- Around line 8-13: Update the script header usage comment to match the new
positional-argument contract used by the variable assignments (linux_seqlock,
macos_seqlock, linux_fixed, macos_fixed, linux_market_state,
macos_market_state): replace the old 2-argument usage line with a single-line
usage that documents required first two args and the optional third/fourth and
fifth/sixth pairs (e.g. bench-report.sh <linux-seqlock> <macos-seqlock>
[<linux-fixed> <macos-fixed>] [<linux-market-state> <macos-market-state>]) so
the top-of-file static comment is consistent with how the script actually parses
arguments.

In `@crates/polymarket/src/market/decoder.rs`:
- Around line 291-294: The comment preceding the call to core::str::from_utf8 in
decoder.rs incorrectly uses a "// SAFETY:" marker (implying an unsafe block)
even though from_utf8 is safe; replace or reword that comment to a normal
explanatory comment (e.g., "// JSON values in the 'type' field are always
ASCII") adjacent to the core::str::from_utf8(remaining.get(..end)?).ok() call so
it documents the ASCII assumption without using a SAFETY label.
- Around line 702-723: Update the test decode_book_skips_invalid_price_level to
also assert the LAST_IN_BATCH flag is set on the single emitted event: after
decoding and before inspecting EventBody, check the event's flags (e.g.,
out[0].flags or out[0].meta.flags) includes EventFlags::LAST_IN_BATCH (or the
equivalent bit/constant) so the test verifies PolymarketMarketDecoder::decode
correctly marks the only emitted BookDelta as last in batch when make_out
produces a single slot.
- Around line 296-318: Replace the manual linear scans in memchr_find and
memchr_byte with the SIMD-accelerated routines from the memchr crate: add memchr
= "..." to Cargo.toml, change memchr_byte to call memchr::memchr(byte, haystack)
and change memchr_find to use memchr::memmem::Finder::new(needle).find(haystack)
(or memchr::memmem::find) to locate the subslice; keep the same Option<usize>
return semantics so callers such as peek_type continue to work unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0148a23f-2ec0-41c9-bd57-fb841d81e8b1

📥 Commits

Reviewing files that changed from the base of the PR and between 991982e and 9ca2927.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (7)
  • .github/scripts/bench-report.sh
  • .github/scripts/criterion-to-json.sh
  • .github/workflows/bench.yml
  • crates/binance/src/decoder.rs
  • crates/polymarket/src/market/decoder.rs
  • crates/transport/Cargo.toml
  • crates/transport/tests/live_feeds.rs
✅ Files skipped from review due to trivial changes (3)
  • .github/scripts/criterion-to-json.sh
  • .github/workflows/bench.yml
  • crates/transport/Cargo.toml
🚧 Files skipped from review as they are similar to previous changes (1)
  • crates/binance/src/decoder.rs

Comment thread crates/polymarket/src/market/decoder.rs
- Add missing <algorithm> include for std::min in C++ stats.hpp
- Add // SAFETY: comment to rdtsc unsafe block per project policy
- Assert core_affinity::set_for_current return value in harness
- Validate producer and consumer cores are distinct
- Fix README perf script examples (missing benchmark binary path)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
crates/polymarket/src/market/decoder.rs (2)

701-722: Test doesn't assert on depth value.

This test verifies that invalid levels are skipped, but doesn't check the depth field of the emitted event. Adding assert_eq!(p.depth, ...) would clarify the expected semantic (should it be 0 or 1 when the first level is skipped?) and catch regressions if the depth logic changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 701 - 722, In the test
decode_book_skips_invalid_price_level, add an assertion that the emitted
BookDelta's depth is the expected value after the first (invalid) level is
skipped — e.g., assert that p.depth equals Depth::from_raw(0) (or the equivalent
zero-depth value used in your code) right after the existing assertions on
p.price, p.qty, and p.side so the test verifies depth semantics for
PolymarketMarketDecoder::<6>::new and EventBody::BookDelta.

296-318: Consider memchr crate for hot-path byte searches.

These naive O(n) / O(n*m) search implementations are called on every incoming message. The memchr crate provides SIMD-accelerated alternatives (memchr::memmem::find and memchr::memchr) that would improve throughput on this hot path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 296 - 318, The custom
byte/slice search helpers memchr_find and memchr_byte are used on the hot path
and should be replaced with SIMD-accelerated functions from the memchr crate to
improve throughput: swap memchr_find's implementation to call
memchr::memmem::find(haystack, needle) and memchr_byte to call
memchr::memchr(byte, haystack) (both return Option<usize> so adapt directly),
add memchr to Cargo.toml dependencies, and update the use/imports (e.g., use
memchr::memmem and memchr::memchr) so tests and callers of
memchr_find/memchr_byte keep the same signature and behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/polymarket/src/market/decoder.rs`:
- Around line 177-209: The emitted BookDeltaPayload.depth currently uses the
enumerate index (depth_idx) which advances even when parse_price_qty::<D>
returns None, causing holes; fix by introducing a per-side emitted counter
(e.g., let mut bid_depth = 0u8) inside the bids loop and set
BookDeltaPayload.depth = bid_depth, incrementing bid_depth only when you
successfully parse and push an event; apply the same pattern (e.g., ask_depth)
to the asks loop and ensure you stop at count >= 64 as before and still
increment self.seq and count only when emitting an event.

---

Nitpick comments:
In `@crates/polymarket/src/market/decoder.rs`:
- Around line 701-722: In the test decode_book_skips_invalid_price_level, add an
assertion that the emitted BookDelta's depth is the expected value after the
first (invalid) level is skipped — e.g., assert that p.depth equals
Depth::from_raw(0) (or the equivalent zero-depth value used in your code) right
after the existing assertions on p.price, p.qty, and p.side so the test verifies
depth semantics for PolymarketMarketDecoder::<6>::new and EventBody::BookDelta.
- Around line 296-318: The custom byte/slice search helpers memchr_find and
memchr_byte are used on the hot path and should be replaced with
SIMD-accelerated functions from the memchr crate to improve throughput: swap
memchr_find's implementation to call memchr::memmem::find(haystack, needle) and
memchr_byte to call memchr::memchr(byte, haystack) (both return Option<usize> so
adapt directly), add memchr to Cargo.toml dependencies, and update the
use/imports (e.g., use memchr::memmem and memchr::memchr) so tests and callers
of memchr_find/memchr_byte keep the same signature and behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3786a78b-5af8-41bc-a67c-e4e27ae84f31

📥 Commits

Reviewing files that changed from the base of the PR and between 9ca2927 and 4de2cea.

📒 Files selected for processing (1)
  • crates/polymarket/src/market/decoder.rs

Comment on lines +177 to +209
        for (depth_idx, level) in msg.bids.iter().enumerate() {
            if count >= 64 {
                break;
            }
            let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
                continue;
            };

            let is_last = count + 1 == total;
            let flags = if is_last {
                EventFlags::IS_SNAPSHOT | EventFlags::LAST_IN_BATCH
            } else {
                EventFlags::IS_SNAPSHOT
            };

            self.seq += 1;
            out[count] = HotEvent::book_delta(
                recv_ts,
                SeqNum::from_raw(self.seq),
                instrument_id,
                self.source_id,
                flags,
                BookDeltaPayload {
                    price,
                    qty,
                    side: Side::Bid,
                    action: UpdateAction::New,
                    depth: depth_idx as u8,
                    _pad: [0; 5],
                },
            );
            count += 1;
        }

⚠️ Potential issue | 🟡 Minor


depth field represents input array index, not emitted event position within each side.

When parse_price_qty fails, continue is used but the enumerate index still advances. If bids[0] fails and bids[1] succeeds, the emitted event has depth: 1 instead of the correct depth: 0. The same issue applies to asks.

Use a separate counter per side to track emitted position:

Fix
-        for (depth_idx, level) in msg.bids.iter().enumerate() {
+        let mut bid_depth: u8 = 0;
+        for level in &msg.bids {
             if count >= 64 {
                 break;
             }
             let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
                 continue;
             };
             // ... flags logic ...
             self.seq += 1;
             out[count] = HotEvent::book_delta(
                 // ...
                 BookDeltaPayload {
                     price,
                     qty,
                     side: Side::Bid,
                     action: UpdateAction::New,
-                    depth: depth_idx as u8,
+                    depth: bid_depth,
                     _pad: [0; 5],
                 },
             );
+            bid_depth += 1;
             count += 1;
         }

Apply the same pattern to the asks loop (lines 211+).

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-        for (depth_idx, level) in msg.bids.iter().enumerate() {
-            if count >= 64 {
-                break;
-            }
-            let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
-                continue;
-            };
-            let is_last = count + 1 == total;
-            let flags = if is_last {
-                EventFlags::IS_SNAPSHOT | EventFlags::LAST_IN_BATCH
-            } else {
-                EventFlags::IS_SNAPSHOT
-            };
-            self.seq += 1;
-            out[count] = HotEvent::book_delta(
-                recv_ts,
-                SeqNum::from_raw(self.seq),
-                instrument_id,
-                self.source_id,
-                flags,
-                BookDeltaPayload {
-                    price,
-                    qty,
-                    side: Side::Bid,
-                    action: UpdateAction::New,
-                    depth: depth_idx as u8,
-                    _pad: [0; 5],
-                },
-            );
-            count += 1;
-        }
+        let mut bid_depth: u8 = 0;
+        for level in &msg.bids {
+            if count >= 64 {
+                break;
+            }
+            let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
+                continue;
+            };
+            let is_last = count + 1 == total;
+            let flags = if is_last {
+                EventFlags::IS_SNAPSHOT | EventFlags::LAST_IN_BATCH
+            } else {
+                EventFlags::IS_SNAPSHOT
+            };
+            self.seq += 1;
+            out[count] = HotEvent::book_delta(
+                recv_ts,
+                SeqNum::from_raw(self.seq),
+                instrument_id,
+                self.source_id,
+                flags,
+                BookDeltaPayload {
+                    price,
+                    qty,
+                    side: Side::Bid,
+                    action: UpdateAction::New,
+                    depth: bid_depth,
+                    _pad: [0; 5],
+                },
+            );
+            bid_depth += 1;
+            count += 1;
+        }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 177 - 209, The emitted
BookDeltaPayload.depth currently uses the enumerate index (depth_idx) which
advances even when parse_price_qty::<D> returns None, causing holes; fix by
introducing a per-side emitted counter (e.g., let mut bid_depth = 0u8) inside
the bids loop and set BookDeltaPayload.depth = bid_depth, incrementing bid_depth
only when you successfully parse and push an event; apply the same pattern
(e.g., ask_depth) to the asks loop and ensure you stop at count >= 64 as before
and still increment self.seq and count only when emitting an event.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
benchmarks/cpp/src/stats.hpp (1)

33-40: ⚠️ Potential issue | 🟡 Minor

Use the inclusive overflow-bucket ceiling.

percentile() reports overflow_upper_bound() directly, but Line 39 currently returns the exclusive end of the documented range. Bucket 0 is described as 4096..8191, yet this code reports 8192, so any percentile that lands in overflow_ is overstated by one cycle.

🩹 Proposed fix
 inline uint64_t overflow_upper_bound(size_t i) {
     if (i >= LOG_BUCKETS - 1) {
         return std::numeric_limits<uint64_t>::max();
     }
     // bucket i covers [4096 << i, (4096 << (i+1)) - 1]
-    return static_cast<uint64_t>(DIRECT_BUCKETS) << (i + 1);
+    return (static_cast<uint64_t>(DIRECT_BUCKETS) << (i + 1)) - 1;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/cpp/src/stats.hpp` around lines 33 - 40, The function
overflow_upper_bound(size_t i) currently returns the exclusive end of the
documented bucket range (e.g., returns 8192 for bucket 0) which inflates
percentiles; change its non-max branch to return the inclusive upper bound by
subtracting one from the shifted value (i.e., use
(static_cast<uint64_t>(DIRECT_BUCKETS) << (i + 1)) - 1) while keeping the
existing numeric_limits<uint64_t>::max() for the final overflow bucket.
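The boundary behavior can be pinned down with a small sketch in Rust; `DIRECT_BUCKETS = 4096` comes from the review text, the `LOG_BUCKETS` count of 16 is a hypothetical stand-in, and the function implements the inclusive-ceiling variant from the proposed fix.

```rust
// Illustrative sketch of the inclusive overflow-bucket ceiling.
const DIRECT_BUCKETS: u64 = 4096;
const LOG_BUCKETS: usize = 16; // hypothetical bucket count

fn overflow_upper_bound(i: usize) -> u64 {
    if i >= LOG_BUCKETS - 1 {
        return u64::MAX; // final catch-all bucket
    }
    // bucket i covers [4096 << i, (4096 << (i + 1)) - 1] inclusive
    (DIRECT_BUCKETS << (i + 1)) - 1
}

fn main() {
    assert_eq!(overflow_upper_bound(0), 8191); // bucket 0 is 4096..=8191, not ..8192
    assert_eq!(overflow_upper_bound(1), 16383);
    assert_eq!(overflow_upper_bound(LOG_BUCKETS - 1), u64::MAX);
}
```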
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benchmarks/cpp/src/stats.hpp`:
- Around line 314-317: Replace the locale-dependent snprintf usage that formats
hist.mean() (the mean_buf/snprintf sequence and the os << "   
\"cycles_per_op_mean\"" emission) with a locale-independent formatter: include
<iomanip> and <locale>, create a std::ostringstream, imbue it with
std::locale::classic(), use std::fixed and std::setprecision(1) to stream
hist.mean() into the ostringstream, then write the resulting string to os for
the "cycles_per_op_mean" field; this ensures JSON uses a dot decimal separator
regardless of LC_NUMERIC.

In `@benchmarks/README.md`:
- Around line 13-20: The README currently claims using `lfence; rdtsc` and
same-CCD + `isolcpus` is sufficient, but it omits that synchronized/invariant
TSC is a hard prerequisite because `lfence` only serializes `rdtsc` locally and
does not eliminate cross-core TSC skew; update the benchmark docs to explicitly
require invariant/synchronized TSC (or mention using a kernel TSC sync feature)
and warn that producer/consumer timestamp deltas (`producer_rdtsc`,
`consumer_rdtsc`) can wrap or produce huge outliers on machines without it; keep
the existing same-CCD/isolation guidance but add a clear note adjacent to the
`lfence; rdtsc` and same-CCD/isolcpus lines stating this hard requirement and
recommended ways to verify or enable synchronized TSC.

---

Duplicate comments:
In `@benchmarks/cpp/src/stats.hpp`:
- Around line 33-40: The function overflow_upper_bound(size_t i) currently
returns the exclusive end of the documented bucket range (e.g., returns 8192 for
bucket 0) which inflates percentiles; change its non-max branch to return the
inclusive upper bound by subtracting one from the shifted value (i.e., use
(static_cast<uint64_t>(DIRECT_BUCKETS) << (i + 1)) - 1) while keeping the
existing numeric_limits<uint64_t>::max() for the final overflow bucket.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8a13487a-888e-4bfd-af46-b0cf048f0ad3

📥 Commits

Reviewing files that changed from the base of the PR and between 4de2cea and 23500a2.

📒 Files selected for processing (4)
  • benchmarks/README.md
  • benchmarks/cpp/src/stats.hpp
  • benchmarks/rust/src/harness.rs
  • benchmarks/rust/src/rdtsc.rs
✅ Files skipped from review due to trivial changes (1)
  • benchmarks/rust/src/rdtsc.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • benchmarks/rust/src/harness.rs

Comment on lines +314 to +317
    // Format mean with one decimal place
    char mean_buf[64];
    std::snprintf(mean_buf, sizeof(mean_buf), "%.1f", hist.mean());
    os << "    \"cycles_per_op_mean\": " << mean_buf << ",\n";


⚠️ Potential issue | 🟠 Major


Serialize cycles_per_op_mean with a locale-independent decimal point.

Line 316 uses std::snprintf("%.1f", ...), which follows LC_NUMERIC. On hosts with a comma decimal separator, this emits 12,3 and breaks JSON validity for the cross-hardware comparison tooling. Use std::ostringstream with std::locale::classic() to force the classic locale.

Add these includes at the top of the file:

#include <iomanip>
#include <locale>

Then update the code:

🩹 Proposed fix
-    // Format mean with one decimal place
-    char mean_buf[64];
-    std::snprintf(mean_buf, sizeof(mean_buf), "%.1f", hist.mean());
-    os << "    \"cycles_per_op_mean\": " << mean_buf << ",\n";
+    // Format mean with one decimal place using a locale-independent decimal point
+    std::ostringstream mean_os;
+    mean_os.imbue(std::locale::classic());
+    mean_os << std::fixed << std::setprecision(1) << hist.mean();
+    os << "    \"cycles_per_op_mean\": " << mean_os.str() << ",\n";
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/cpp/src/stats.hpp` around lines 314 - 317, Replace the
locale-dependent snprintf usage that formats hist.mean() (the mean_buf/snprintf
sequence and the os << "    \"cycles_per_op_mean\"" emission) with a
locale-independent formatter: include <iomanip> and <locale>, create a
std::ostringstream, imbue it with std::locale::classic(), use std::fixed and
std::setprecision(1) to stream hist.mean() into the ostringstream, then write
the resulting string to os for the "cycles_per_op_mean" field; this ensures JSON
uses a dot decimal separator regardless of LC_NUMERIC.

Comment thread benchmarks/README.md
- Fix isolcpus parsing: use dedicated cmdline param parser instead of
  read_proc_field which expects colon-separated fields
- Fix LAST_IN_BATCH: set flag on last emitted event after both loops,
  not based on raw JSON level count (which ignores skipped/truncated levels)
- Add git/curl to setup_machine.sh bootstrap packages

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
crates/polymarket/src/market/decoder.rs (1)

176-201: ⚠️ Potential issue | 🟠 Major

BookDeltaPayload.depth is incorrect when invalid levels are skipped.

At Line 176 and Line 203, depth is derived from enumerate() index. Because Line 180 and Line 207 can continue, emitted depths can have holes (e.g., first valid bid emitted with depth 1). This breaks the documented “0 = best level” contract.

Suggested fix
-        for (depth_idx, level) in msg.bids.iter().enumerate() {
+        let mut bid_depth: u8 = 0;
+        for level in &msg.bids {
             if count >= 64 {
                 break;
             }
             let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
                 continue;
             };

             self.seq += 1;
             out[count] = HotEvent::book_delta(
                 recv_ts,
                 SeqNum::from_raw(self.seq),
                 instrument_id,
                 self.source_id,
                 EventFlags::IS_SNAPSHOT,
                 BookDeltaPayload {
                     price,
                     qty,
                     side: Side::Bid,
                     action: UpdateAction::New,
-                    depth: depth_idx as u8,
+                    depth: bid_depth,
                     _pad: [0; 5],
                 },
             );
+            bid_depth = bid_depth.saturating_add(1);
             count += 1;
         }

-        for (depth_idx, level) in msg.asks.iter().enumerate() {
+        let mut ask_depth: u8 = 0;
+        for level in &msg.asks {
             if count >= 64 {
                 break;
             }
             let Some((price, qty)) = parse_price_qty::<D>(level.price, level.size, &meta) else {
                 continue;
             };

             self.seq += 1;
             out[count] = HotEvent::book_delta(
                 recv_ts,
                 SeqNum::from_raw(self.seq),
                 instrument_id,
                 self.source_id,
                 EventFlags::IS_SNAPSHOT,
                 BookDeltaPayload {
                     price,
                     qty,
                     side: Side::Ask,
                     action: UpdateAction::New,
-                    depth: depth_idx as u8,
+                    depth: ask_depth,
                     _pad: [0; 5],
                 },
             );
+            ask_depth = ask_depth.saturating_add(1);
             count += 1;
         }

Also applies to: 203-228

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 176 - 201, The emitted
BookDeltaPayload.depth currently uses the enumerate() index (depth_idx) which
becomes incorrect when parse_price_qty(...) returns None and the loop continues;
change each bid and ask loop to maintain a separate depth counter (e.g.,
bid_depth / ask_depth initialized to 0) that is only incremented when you
successfully parse and emit a level so depths are contiguous (0 = best level)
and assign that counter to BookDeltaPayload.depth; apply the same fix to both
the bids loop (for (depth_idx, level) in msg.bids.iter().enumerate()) and the
asks loop so skipped/invalid levels do not create depth holes.
🧹 Nitpick comments (2)
crates/polymarket/src/market/decoder.rs (1)

61-66: Avoid wildcard match arms to comply with repo rule.

Both dispatch matches use _ => .... Please rewrite with explicit branching to satisfy the no-wildcard rule.

As per coding guidelines, No wildcard matches — use explicit destructuring in pattern matching.

Also applies to: 248-252

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/polymarket/src/market/decoder.rs` around lines 61 - 66, Replace the
wildcard arms in the message-type dispatches with explicit branches: in the
match over msg_type (the block that currently has "price_change",
"last_trade_price", "book" and `_ => 0`) change the `_` arm to a named binding
(e.g., other) and handle it explicitly by logging/handling the unrecognized
msg_type and returning 0; do the same for the other dispatch at the same file
(the match around lines 248-252). Specifically update the match in the decoder
(the function that calls self.decode_price_change / self.decode_trade /
self.decode_book) to replace `_ => 0` with an explicit arm like other => { /*
log unrecognized message type using other */ 0 } so there are no wildcard `_`
matches while preserving existing behavior.
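A minimal sketch of the named-binding arm the prompt describes; the message-type strings come from the review, while the numeric return values are stand-ins for the real `decode_*` calls.

```rust
// Sketch: replace `_ => 0` with a named binding so no `_` wildcard remains.
fn dispatch(msg_type: &str) -> usize {
    match msg_type {
        "price_change" => 1,     // stand-in for self.decode_price_change(..)
        "last_trade_price" => 2, // stand-in for self.decode_trade(..)
        "book" => 3,             // stand-in for self.decode_book(..)
        other => {
            // Unrecognized type handled explicitly; `other` is available
            // for logging/metrics instead of being silently discarded.
            let _unrecognized = other;
            0
        }
    }
}

fn main() {
    assert_eq!(dispatch("book"), 3);
    assert_eq!(dispatch("unknown"), 0);
}
```

The `other` binding preserves the existing fall-through behavior (returning 0) while keeping the value in scope for explicit handling.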
benchmarks/rust/src/stats.rs (1)

379-395: Lock the overflow-bucket contract with an exact assertion.

This only proves that p99.99 lands somewhere above the direct range. It still passes if a later cleanup changes the intentional exclusive-boundary behavior and Rust/C++ quietly drift. For this fixture, asserting h.percentile(99.99) == 131072 (or checking overflow_upper_bound() directly) would make that contract explicit.

Based on learnings, overflow_upper_bound(i) intentionally returns the exclusive top of each log bucket rather than an inclusive upper bound.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/rust/src/stats.rs` around lines 379-395: the test's assertion
is too weak; replace the loose range check with an exact contract check so
overflow behavior can't drift. In the histogram_overflow_buckets test, assert
that CycleHistogram::percentile(99.99) equals the expected exclusive overflow
upper bound (use assert_eq!(p9999, 131072)), or alternatively call
h.overflow_upper_bound(...) and assert that it returns 131072. Replace the
current assert!(p9999 > 4095, ...) with an exact equality assertion on
percentile (and/or overflow_upper_bound) to lock the overflow-bucket contract.
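
For concreteness, here is a minimal sketch of the exclusive-upper-bound contract the review is asking to pin down. `LogHistogram`, its bucket layout, and the fixture values are hypothetical stand-ins for the real `CycleHistogram`, not its actual API:

```rust
// Direct buckets give 1-cycle resolution below 4096; above that, log2
// buckets where bucket i covers [4096 << i, 4096 << (i + 1)). A percentile
// that lands in a log bucket reports the bucket's EXCLUSIVE top.
struct LogHistogram {
    direct: Vec<u64>, // one bucket per cycle value in 0..4096
    log: Vec<u64>,    // bucket i covers [4096 << i, 4096 << (i + 1))
    count: u64,
}

impl LogHistogram {
    fn new() -> Self {
        Self { direct: vec![0; 4096], log: vec![0; 20], count: 0 }
    }

    fn record(&mut self, cycles: u64) {
        if cycles < 4096 {
            self.direct[cycles as usize] += 1;
        } else {
            let i = (63 - cycles.leading_zeros() as usize) - 12;
            self.log[i] += 1;
        }
        self.count += 1;
    }

    // Exclusive top of log bucket i — the contract under test.
    fn overflow_upper_bound(&self, i: usize) -> u64 {
        4096u64 << (i + 1)
    }

    fn percentile(&self, p: f64) -> u64 {
        let rank = ((p / 100.0) * self.count as f64).ceil() as u64;
        let mut seen: u64 = 0;
        for (v, n) in self.direct.iter().enumerate() {
            seen += n;
            if seen >= rank { return v as u64; }
        }
        for (i, n) in self.log.iter().enumerate() {
            seen += n;
            if seen >= rank { return self.overflow_upper_bound(i); }
        }
        0
    }
}
```

With 9000 samples of 100 cycles plus one of 100 000 cycles, percentile(99.99) hits the log bucket covering [65536, 131072) and returns exactly 131072, which is what the exact assertion would lock in.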
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@benchmarks/rust/src/stats.rs`:
- Line 181: the serialized `compiler` field is populated by calling
rustc_version() at runtime (the `compiler: rustc_version()` assignment in
stats.rs), which reports the runner's toolchain. Instead, capture the compiler
at build time via a build script and read that value at runtime: add a build.rs
that queries the compiler (e.g., using the rustc_version crate or `rustc
--version`) and emits it with
println!("cargo:rustc-env=BUILD_RUSTC_VERSION={}"), then replace calls to
rustc_version() in stats.rs (and the other spots around the 212-219 block) with
reads of the env var (env!("BUILD_RUSTC_VERSION") or std::env::var at runtime)
so the JSON contains the toolchain used to build the binary rather than the
runner's rustc.
- Around lines 19-20: the mean calculation can be corrupted by wrapped TSC
deltas (near u64::MAX). Modify the stats update path (the method that processes
a new cycles value, e.g., add() or the code that does self.sum += cycles) to
detect and ignore suspiciously large deltas before mutating sum/count (for
example: if cycles > (u64::MAX / 2) or another large threshold, treat it as a
wrapped/invalid sample and return early); min/max/count updates can stay
unchanged if you prefer. Also ensure mean() uses checked arithmetic
(checked_add/checked_div, or u128 accumulation) to avoid panics/wraps when
computing self.sum / self.count from the struct fields sum, min, max, count.
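
The wrap guard described above can be sketched as follows; `CycleStats` and the threshold are hypothetical stand-ins for the real accumulator in stats.rs:

```rust
// Sketch: drop wrapped TSC deltas before they poison the running sum,
// and accumulate in u128 so the mean can never wrap.
#[derive(Default)]
struct CycleStats {
    sum: u128,
    count: u64,
    min: u64,
    max: u64,
}

impl CycleStats {
    // Deltas above this are treated as wrapped/invalid samples.
    const WRAP_THRESHOLD: u64 = u64::MAX / 2;

    fn add(&mut self, cycles: u64) {
        if cycles > Self::WRAP_THRESHOLD {
            return; // ignore a wrapped TSC delta instead of corrupting the mean
        }
        if self.count == 0 {
            self.min = cycles; // first valid sample seeds min
        }
        self.min = self.min.min(cycles);
        self.max = self.max.max(cycles);
        self.sum += u128::from(cycles);
        self.count += 1;
    }

    fn mean(&self) -> u64 {
        if self.count == 0 { 0 } else { (self.sum / u128::from(self.count)) as u64 }
    }
}
```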

In `@crates/polymarket/src/market/decoder.rs`:
- Around line 272-284: peek_type currently only matches the exact bytes
"\"type\":\"" and so misses valid JSON with whitespace; update peek_type to
first locate the key bytes b"\"type\"" (using memchr_find), then advance past
that key, skip any ASCII whitespace bytes, assert/verify the next byte is b':',
skip any ASCII whitespace again, verify the next byte is the opening quote b'"',
and then use memchr_byte to find the closing quote and return the slice between
quotes; refer to peek_type, memchr_find, memchr_byte and the variables
pos/value_start/remaining when making the change.

---

Duplicate comments:
In `@crates/polymarket/src/market/decoder.rs`:
- Around line 176-201: The emitted BookDeltaPayload.depth currently uses the
enumerate() index (depth_idx) which becomes incorrect when parse_price_qty(...)
returns None and the loop continues; change each bid and ask loop to maintain a
separate depth counter (e.g., bid_depth / ask_depth initialized to 0) that is
only incremented when you successfully parse and emit a level so depths are
contiguous (0 = best level) and assign that counter to BookDeltaPayload.depth;
apply the same fix to both the bids loop (for (depth_idx, level) in
msg.bids.iter().enumerate()) and the asks loop so skipped/invalid levels do not
create depth holes.
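
The contiguous-depth fix above can be sketched like this; `Level`, `parse_level`, and `emit_depths` are hypothetical stand-ins for the venue types and parse_price_qty, not the crate's API:

```rust
// A separate depth counter is incremented only on successful parse,
// so skipped levels never leave holes in the emitted depths.
struct Level<'a> { price: &'a str, qty: &'a str }

fn parse_level(l: &Level) -> Option<(i64, i64)> {
    // stand-in for the real fixed-point parser; rejects bad input
    Some((l.price.parse().ok()?, l.qty.parse().ok()?))
}

fn emit_depths(bids: &[Level]) -> Vec<(usize, i64, i64)> {
    let mut out = Vec::new();
    let mut bid_depth = 0usize; // NOT the enumerate() index
    for level in bids {
        let Some((px, qty)) = parse_level(level) else { continue };
        out.push((bid_depth, px, qty)); // depth 0 = best level, contiguous
        bid_depth += 1;
    }
    out
}
```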


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4f70fa48-1426-4937-a132-d0191d424b06

📥 Commits

Reviewing files that changed from the base of the PR and between 23500a2 and 11282f1.

📒 Files selected for processing (3)
  • benchmarks/rust/src/stats.rs
  • benchmarks/scripts/setup_machine.sh
  • crates/polymarket/src/market/decoder.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • benchmarks/scripts/setup_machine.sh

implementation: implementation.to_string(),
language: "Rust".to_string(),
version: env!("CARGO_PKG_VERSION").to_string(),
compiler: rustc_version(),

⚠️ Potential issue | 🟠 Major

Capture the build compiler, not the runner's rustc.

Every JSON result serializes this value, but rustc --version here reports whatever toolchain happens to be installed on the machine executing the benchmark. In the deploy/run workflow that can stamp results with the wrong compiler version or "unknown" even though the binary was built with a known toolchain.

Based on learnings, "All benchmarks must export JSON for cross-hardware comparison".

Also applies to: 212-219
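
A minimal sketch of the build-time capture, assuming the BUILD_RUSTC_VERSION env var name suggested in the review (the helper name is illustrative):

```rust
// Sketch of the core of a build.rs: query the compiler once at build
// time instead of on the benchmark runner. Falls back to "unknown"
// if `rustc` is not on PATH.
use std::process::Command;

fn build_rustc_version() -> String {
    Command::new("rustc")
        .arg("--version")
        .output()
        .ok()
        .and_then(|o| String::from_utf8(o.stdout).ok())
        .map(|s| s.trim().to_string())
        .unwrap_or_else(|| "unknown".to_string())
}

// In build.rs:
//   fn main() {
//       println!("cargo:rustc-env=BUILD_RUSTC_VERSION={}", build_rustc_version());
//   }
// In stats.rs, the serialized field would then become:
//   compiler: env!("BUILD_RUSTC_VERSION").to_string(),
```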


Comment on lines +272 to +284
fn peek_type(buf: &[u8]) -> Option<&str> {
// Search for the byte pattern: "type":"
let needle = b"\"type\":\"";
let pos = memchr_find(buf, needle)?;
let value_start = pos + needle.len();

// Find the closing quote
let remaining = buf.get(value_start..)?;
let end = memchr_byte(b'"', remaining)?;

// SAFETY: JSON values in the "type" field are always ASCII
core::str::from_utf8(remaining.get(..end)?).ok()
}

⚠️ Potential issue | 🟠 Major

peek_type currently rejects valid JSON formatting variants.

Line 274 hardcodes "\"type\":\""; valid JSON like {"type" : "book"} won’t match and gets dropped. This creates avoidable decode misses.

Suggested fix
 fn peek_type(buf: &[u8]) -> Option<&str> {
-    // Search for the byte pattern: "type":"
-    let needle = b"\"type\":\"";
-    let pos = memchr_find(buf, needle)?;
-    let value_start = pos + needle.len();
-
-    // Find the closing quote
-    let remaining = buf.get(value_start..)?;
-    let end = memchr_byte(b'"', remaining)?;
-
-    // SAFETY: JSON values in the "type" field are always ASCII
-    core::str::from_utf8(remaining.get(..end)?).ok()
+    let key = b"\"type\"";
+    let pos = memchr_find(buf, key)?;
+    let mut i = pos + key.len();
+    while matches!(buf.get(i), Some(b' ' | b'\n' | b'\r' | b'\t')) { i += 1; }
+    if *buf.get(i)? != b':' {
+        return None;
+    }
+    i += 1;
+    while matches!(buf.get(i), Some(b' ' | b'\n' | b'\r' | b'\t')) { i += 1; }
+    if *buf.get(i)? != b'"' {
+        return None;
+    }
+    i += 1;
+    let rest = buf.get(i..)?;
+    let end = memchr_byte(b'"', rest)?;
+    core::str::from_utf8(rest.get(..end)?).ok()
 }

Milerius and others added 9 commits April 7, 2026 10:31
core_affinity only sees cores in the process's default affinity mask,
which excludes isolcpus cores. Use sched_setaffinity(2) directly via
libc which can pin to any online core including isolated ones.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pthread_setaffinity_np may silently fail on isolated cores. Use
sched_setaffinity(2) directly with error checking, matching the
Rust harness approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
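The direct-syscall pinning described in these commits can be sketched as below. This is a Linux-only illustration that declares the glibc wrappers by hand instead of pulling in the libc crate; `CpuSet` mirrors glibc's 1024-bit cpu_set_t and error handling is minimal:

```rust
// Pin the calling thread with sched_setaffinity(2). Unlike crates that
// consult the process's default affinity mask, this can target any
// online core, including ones reserved via isolcpus.
#[repr(C)]
#[derive(Clone, Copy)]
struct CpuSet {
    bits: [u64; 16], // 1024 bits, matches glibc cpu_set_t
}

extern "C" {
    // glibc wrappers; pid 0 means the calling thread
    fn sched_setaffinity(pid: i32, cpusetsize: usize, mask: *const CpuSet) -> i32;
    fn sched_getaffinity(pid: i32, cpusetsize: usize, mask: *mut CpuSet) -> i32;
}

fn pin_to_core(core: usize) -> bool {
    let mut set = CpuSet { bits: [0; 16] };
    set.bits[core / 64] |= 1u64 << (core % 64);
    // returns 0 on success; check it rather than failing silently
    unsafe { sched_setaffinity(0, std::mem::size_of::<CpuSet>(), &set) == 0 }
}
```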
Both producer and consumer now run on spawned threads (not main).
Consumer records raw cycle deltas into a pre-allocated Vec/vector,
then builds the histogram after measurement completes. This avoids
cache pollution from histogram writes during the hot loop, which was
inflating C++ numbers by ~25x.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
No traits, no Arc, no Vec, no Result unwrapping. Uses leaked Box
for shared state, usize-cast for Send, inline asm rdtsc, and
sum += delta matching the HFT University run_latency() exactly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two &mut references to the same SpscRing from different threads is UB
per Rust aliasing rules. Use spsc_ring() split handles instead — they
access the engine through Arc<RingEngine> via &self, which is sound.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Inline RawRing with no Arc, no Result, no &mut self, no generics.
push() -> bool, pop(&mut out) -> bool. Stack-local via Box::leak.
Raw pointer field projections avoid Rust noalias interference.
Same algorithm as RingEngine but stripped to bare metal.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…wrap

- Colocate head+tail_cached on same 64B cache line (producer-local)
- Colocate tail+head_cached on same 64B cache line (consumer-local)
- Use 64-byte alignment instead of 128-byte (x86_64 native cache line)
- Branch-based index wrapping (branch predictor > always-execute AND)
- Enable LTO + codegen-units=1 for cross-crate inlining
- pop() takes raw *mut Msg to avoid any &mut reference overhead

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…l API, inline(always)

Benchmark-driven optimizations closing the gap with C++ rigtorp:

- Add CacheLine<T> with native alignment (64B x86, 128B ARM)
- Colocate head+tail_cached on producer CacheLine, tail+head_cached
  on consumer CacheLine — reduces hot-path cache footprint from 4
  lines to 2
- Add push() -> bool and unsafe pop(*mut T) -> bool to RingEngine,
  RawRing, Producer, Consumer — no Result overhead, no return-by-value
- Add BranchWrap index strategy (branch predictor > bitmask AND)
- Upgrade all hot-path functions to #[inline(always)]
- Add read_into slot operation for zero-copy pop into caller buffer

Fully backwards-compatible: existing try_push/try_pop Result API and
Arc-based split handles are unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…iants

- Add push_shared/pop_shared on RawRing taking &self instead of &mut self
  to avoid LLVM noalias interference in two-thread benchmarks
- Raw bench now tests all 3 Rust queues: mantis-inline (push_shared),
  mantis-copy (push/pop &self), rtrb (push/pop Result)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
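The `&self` shared-reference idea and the branch-based wrap can be sketched together in a minimal ring. `MiniSpsc` is a hypothetical stand-in for RawRing's push_shared/pop_shared, stripped of the cached-index and cache-line-colocation tricks the real queue uses:

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicUsize, Ordering};

// Interior mutability via UnsafeCell means neither thread ever holds
// &mut to the ring, sidestepping LLVM noalias assumptions; wrap() is a
// predictable branch instead of an AND mask.
struct MiniSpsc<T, const N: usize> {
    buf: [UnsafeCell<MaybeUninit<T>>; N],
    head: AtomicUsize, // written by the producer
    tail: AtomicUsize, // written by the consumer
}

unsafe impl<T: Send, const N: usize> Sync for MiniSpsc<T, N> {}

impl<T, const N: usize> MiniSpsc<T, N> {
    fn new() -> Self {
        Self {
            buf: std::array::from_fn(|_| UnsafeCell::new(MaybeUninit::uninit())),
            head: AtomicUsize::new(0),
            tail: AtomicUsize::new(0),
        }
    }

    #[inline(always)]
    fn wrap(i: usize) -> usize {
        if i == N { 0 } else { i } // BranchWrap: one well-predicted branch
    }

    #[inline(always)]
    fn push(&self, v: T) -> bool {
        let head = self.head.load(Ordering::Relaxed);
        let next = Self::wrap(head + 1);
        if next == self.tail.load(Ordering::Acquire) {
            return false; // full (one slot is sacrificed)
        }
        unsafe { (*self.buf[head].get()).write(v) };
        self.head.store(next, Ordering::Release);
        true
    }

    #[inline(always)]
    fn pop(&self, out: &mut T) -> bool {
        let tail = self.tail.load(Ordering::Relaxed);
        if tail == self.head.load(Ordering::Acquire) {
            return false; // empty
        }
        *out = unsafe { (*self.buf[tail].get()).assume_init_read() };
        self.tail.store(Self::wrap(tail + 1), Ordering::Release);
        true
    }
}
```

The bool-returning push/pop mirrors the "Bool API" optimization from the PR summary: no `Result` construction or unwrapping on the hot path.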
@Milerius Milerius force-pushed the feat/venue-decoder-crates branch from f4603ab to a765d86 Compare April 7, 2026 13:09
- Add SpscRingFast preset using BranchWrap index strategy
- Upgrade InlineStorage::capacity and slot_ptr to inline(always)
- Raw bench now tests: Pow2Masked, BranchWrap, copy, and rtrb

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Milerius Milerius force-pushed the feat/venue-decoder-crates branch from da6e2bd to c7f2eba Compare April 7, 2026 13:15
Milerius and others added 7 commits April 7, 2026 15:25
- Remove dead benchmark modules (harness, message, queues, rdtsc, stats)
- Simplify main.rs to only use raw_bench protocol
- Fix clippy: backticks in docs, allow inline_always workspace-wide,
  use &raw mut in tests
- Add comprehensive RESULTS.md with methodology, hardware, configuration,
  and final benchmark numbers (472 cycles/op SpscRingFast)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace single-thread SPSC claims (<10ns) with measured cross-core
latency (130ns / 472 TSC cycles on isolated cores). Remove "same
ballpark as Optiver/LMAX" claim. Add link to benchmarks/RESULTS.md.
Update benchmark table, quick start, and highlights to reflect
actual measured performance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…o C++

Rust: consumer thread captures perf_event_open grouped counters
(instructions, branch misses, L1D misses, LLC misses) per-op.
Printed alongside cycles/op when available (Linux only).

C++: add --mode raw flag for sum/count protocol matching HFT
University. Tests both rigtorp AND drogalis in raw mode.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…parison

Identical algorithm to library SpscRingFast but with no Storage trait,
no IndexStrategy, no Instrumentation, no RawRing wrapper — pure
struct with inline push/pop. Shows the true floor of Rust SPSC latency
vs the library abstraction cost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two optimizations identified via assembly analysis:

1. Shadow tail_local/head_local on x86_64: LLVM won't fold a Relaxed
   load after a Release store to the same atomic. Cell shadow breaks
   the dependency chain. Expected: ~30-50 cycles/op improvement.

2. Combine rdtsc inside asm block: "shl rdx,32; or rax,rdx" produces
   2 instructions vs the previous 3 (shl + mov eax,eax + add).
   Expected: ~2-5 cycles/op improvement.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
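The combined-rdtsc idea from optimization 2 can be sketched as below. This is an x86_64-only illustration (with a portable fallback so it compiles elsewhere), not the harness's exact code:

```rust
// Fold the EDX:EAX halves inside the asm block ("shl rdx,32; or rax,rdx")
// so the compiler sees a single u64 result instead of stitching the
// halves together itself; lfence orders the read as in the benchmark.
#[cfg(target_arch = "x86_64")]
#[inline(always)]
fn rdtsc() -> u64 {
    let tsc: u64;
    unsafe {
        core::arch::asm!(
            "lfence",
            "rdtsc",
            "shl rdx, 32",
            "or rax, rdx",
            out("rax") tsc,
            out("rdx") _,
            options(nostack),
        );
    }
    tsc
}

#[cfg(not(target_arch = "x86_64"))]
fn rdtsc() -> u64 {
    // fallback so the sketch compiles off-x86; not a real cycle counter
    use std::time::{SystemTime, UNIX_EPOCH};
    SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_nanos() as u64
}
```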
@Milerius Milerius merged commit 04d93a4 into main Apr 7, 2026
19 of 20 checks passed