audit issue #23: measurement integrity and transparency
Core changes:
- Make verify_fn a hard gate: runners now skip timed measurement when
verification fails, including immediate-mode cases.
- Add raw + cleaned timing stats in TimingStats/BenchmarkStats; expose
raw_mean/median/stddev/cv/sample_count and outliers_removed in
JSON, CSV, and Markdown reports.
- New CLI flags --no-outlier-removal and --include-unstable-in-scores.
- Exclude high-CV results from composite scores by default; note the
count in Markdown reports.
- Raise max_retries default from 0 to 1.
- Add single-thread default warning in both binaries.
- Document vx_perf median caveat as median_is_avg_approximation in JSON.
- Add OpenCV comparison framing note to compareReports output.
- Add scripts/check_report.py and wire it into CI smoke jobs.
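
Illustrative sketch (not the project's code) of how the raw and cleaned
timing stats and the high-CV exclusion could fit together. The IQR
fence, the 0.15 CV threshold, and the helper names summarize,
remove_outliers_iqr, and timing_stats are assumptions made for this
example; the commit does not specify the outlier method or threshold.

```python
import statistics


def summarize(samples):
    """Return mean, median, stddev, cv, and sample_count for timing samples."""
    mean = statistics.fmean(samples)
    stddev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {
        "mean": mean,
        "median": statistics.median(samples),
        "stddev": stddev,
        "cv": stddev / mean if mean > 0 else 0.0,
        "sample_count": len(samples),
    }


def remove_outliers_iqr(samples, k=1.5):
    """Drop samples outside Tukey fences (assumed method, not from the commit)."""
    if len(samples) < 4:
        return list(samples)
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [s for s in samples if low <= s <= high]


def timing_stats(samples, remove_outliers=True, cv_threshold=0.15):
    """Report cleaned stats alongside raw_* stats and an instability flag.

    remove_outliers=False mirrors the idea behind --no-outlier-removal;
    cv_threshold is a hypothetical value chosen for this sketch.
    """
    raw = summarize(samples)
    kept = remove_outliers_iqr(samples) if remove_outliers else list(samples)
    cleaned = summarize(kept)
    return {
        **cleaned,
        **{f"raw_{name}": value for name, value in raw.items()},
        "outliers_removed": len(samples) - len(kept),
        # A result flagged unstable would be left out of composite scores
        # unless the caller opts in (cf. --include-unstable-in-scores).
        "unstable": cleaned["cv"] > cv_threshold,
    }
```

Reporting both sets of numbers lets a reader see how much the cleaning
step changed the result instead of having to trust it.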
Refs: #23