Skip to content

Commit 84faeda

Browse files
committed
docs: session 70 — audit-coverage runbook for external review firms
Adds `docs/audit-coverage-runbook.md` as the entry point for an external review firm that wants to reproduce the Mosaic audit-coverage matrix locally and extend it with their own tests. Complements the existing audit-pack: - `AUDIT.md` — milestone log (per-release entries) - `CHANGELOG.md` — per-session work provenance - `README.md § Status` — at-a-glance coverage table - `docs/threat-model.md` — adversarial input vectors - `docs/lint-policy.md` — clippy suppression registry - **`docs/audit-coverage-runbook.md` (this commit)** — runbook + extension guide Sections 1. Coverage matrix at v0.8.3-shared-primitive-lift - Quick-reference table linking each surface to its session range. 2. How to reproduce locally - Property tests (per-crate + workspace sweep). - BPF CU regression bench (full SBF build + run cycle). - Host criterion bench (both groth16_host + phase3_host). - Fuzz harnesses (with the 23-target inventory laid out by surface dimension). 3. How to extend the coverage - Step-by-step recipes for adding a new property test, fuzz harness, bench, or shared primitive. Each recipe references the canonical session commits as worked examples. 4. What this coverage does NOT pin - Three explicit caveats so the auditor's threat model is calibrated correctly: * scaffold-acceptance fixtures vs real prover output; * "scaffold" Phase-3 reductions vs full cryptographic soundness; * chunked dispatch integration test gap. - Each caveat names the planned-beyond item in CHANGELOG.md that closes the gap. 5. Where to file issues - Issue triage matrix with labels for soundness bugs, property-test false positives, and bench baseline drift. - SECURITY.md cross-link for vulnerability disclosure (DO NOT use GitHub Issues for those). This is a documentation-only commit; no behaviour, ABI, or test changes. Audit firms can find this file via `AUDIT.md` (the runbook will be linked from the v0.8.3 audit log entry in a follow-up).
1 parent f3ec99c commit 84faeda

1 file changed

Lines changed: 209 additions & 0 deletions

File tree

docs/audit-coverage-runbook.md

Lines changed: 209 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,209 @@
1+
# Audit-coverage runbook
2+
3+
This file is the entry point for an external review firm that wants
4+
to reproduce the Mosaic audit-coverage matrix locally and extend it
5+
with their own tests. It complements [`AUDIT.md`](../AUDIT.md)
6+
(milestone log) and the per-session CHANGELOG entries (work
7+
provenance).
8+
9+
## Coverage matrix at v0.8.3-shared-primitive-lift
10+
11+
| Surface | Coverage | Source |
12+
|---|---|---|
13+
| Property tests | 544 lib tests across 12 crates (+147 proptest in audit-coverage sweep) | sessions 36-66 |
14+
| BPF CU bench | 7 systems via `solana-program-test` | sessions 47, 49 |
15+
| Host criterion bench | 5 systems with statistical noise floor | session 51 |
16+
| Fuzz harnesses | 23 targets across 6 production verifiers | sessions 54-59 |
17+
| Shared primitives | 8 audit-grade helpers in `mosaic-zk-primitives` | sessions 21-66 |
18+
| CI activation | All harnesses wired into GitHub Actions | session 61 |
19+
20+
## How to reproduce locally
21+
22+
### Property tests
23+
24+
```bash
25+
# Per-crate (fast, targeted)
26+
cargo test -p mosaic-halo2 --features std --lib
27+
cargo test -p mosaic-hyperplonk --features std --lib
28+
# ... repeat for each crate
29+
30+
# Full workspace sweep (~30 sec on a stock laptop after warm cache)
31+
cargo test --workspace --all-features --lib
32+
```
33+
34+
Expected output: `test result: ok` for every crate. The reported
35+
counts in this runbook (e.g. "75 passed" for mosaic-halo2) are
36+
authoritative as of v0.8.3.
37+
38+
### BPF CU regression bench
39+
40+
```bash
41+
# 1. Build the SBF program (one-time per change to mosaic-program/).
42+
cargo build-sbf --tools-version v1.52 \
43+
--manifest-path crates/mosaic-program/Cargo.toml
44+
45+
# 2. Set the artifact path (cargo-build-sbf default).
46+
export BPF_OUT_DIR="$PWD/target/deploy"
47+
export SBF_OUT_DIR="$PWD/target/deploy"
48+
49+
# 3. Run the bench. Fails on ADR-0005 hard-cap breach (exit 1).
50+
cargo run --release -p mosaic-bench --bin bpf-bench
51+
```
52+
53+
The bench prints a per-system table:
54+
55+
```
56+
SYSTEM MEASURED CAP BASELINE STATUS
57+
groth16_bn254_mul_circuit_1pi 83574 180000 83574 OK
58+
groth16_batch_n5_mul_circuit_1pi 258397 300000 258397 OK
59+
plonk_bn254_mul_circuit_1pi 968457 1100000 968457 OK
60+
hyperplonk_kzg_bn254_scaffold <new> 660000 0 OK
61+
halo2_kzg_bn254_scaffold <new> 760000 0 OK
62+
nova_folding_bn254_scaffold <new> 1150000 0 OK
63+
fri_stark_goldilocks_scaffold <new> 7800000 0 OK
64+
```
65+
66+
Hard caps come from ADR-0005 (Phase-1/2) and verifier
67+
`estimated_compute_units` × 1.30 (Phase-3, sessions 47, 49). New
68+
Phase-3 baselines start at `0` because the WARN-on-drift logic
69+
needs an established baseline; the first successful run records the
70+
baseline for future regression tracking (see
71+
`crates/mosaic-bench/src/bin/bpf_bench.rs`).
72+
73+
### Host criterion bench
74+
75+
```bash
76+
# Two bench files; the matrix runs both.
77+
cargo bench -p mosaic-bench --bench groth16_host
78+
cargo bench -p mosaic-bench --bench phase3_host
79+
```
80+
81+
Criterion writes statistical reports under `target/criterion/`.
82+
A real algorithmic regression surfaces as a measurable shift
83+
distinct from JIT/codegen drift.
84+
85+
### Fuzz harnesses
86+
87+
```bash
88+
# One-time setup
89+
cargo install cargo-fuzz --locked
90+
91+
# Run any of the 23 harnesses (5-min budget, expand for serious
92+
# corpus development).
93+
cargo +nightly fuzz run --fuzz-dir crates/mosaic-fuzz \
94+
fuzz_halo2_combined -- -max_total_time=300
95+
```
96+
97+
Full inventory:
98+
99+
| Surface | Targets |
100+
|---|---|
101+
| Phase-1 Groth16 | `fuzz_groth16_proof_bytes`, `fuzz_vk_bytes`, `fuzz_public_inputs` |
102+
| Phase-2 PLONK | `fuzz_plonk_{proof_bytes, vk_bytes, public_inputs, combined}` |
103+
| Phase-3 HyperPlonk | `fuzz_hyperplonk_{proof_bytes, vk_bytes, public_inputs, combined}` |
104+
| Phase-3 Halo2 | `fuzz_halo2_{proof_bytes, vk_bytes, public_inputs, combined}` |
105+
| Phase-3 Nova | `fuzz_nova_{proof_bytes, vk_bytes, public_inputs, combined}` |
106+
| Phase-3 STARK | `fuzz_stark_{proof_bytes, vk_bytes, public_inputs, combined}` |
107+
108+
The combined-slot fuzzers split the libfuzzer input into three
109+
length-prefixed sub-buffers (vk, proof, public_inputs) and explore
110+
cross-slot interaction surface — bugs that only surface when two
111+
slots lie about the same shape parameter in a coordinated way.
112+
Halo2 and STARK have the richest cross-check fingerprints; PLONK
113+
the narrowest.
114+
115+
## How to extend the coverage
116+
117+
### Add a new property test
118+
119+
1. Pick a soundness-critical invariant (e.g. "every byte flip in
120+
the commit region must reject").
121+
2. Write the test in the relevant crate's `src/<module>.rs`
122+
`#[cfg(test)] mod tests {}` block, prefixed with `proptest_*`.
123+
3. Add a docstring explaining what the property pins and why an
124+
external auditor cares. The session-37-52 commits are the
125+
canonical examples; they document false positives inline.
126+
4. Run locally with `cargo test -p <crate> --features std --lib`.
127+
128+
### Add a new fuzz harness
129+
130+
1. Pick a parser surface not already covered (e.g. a new adapter
131+
format).
132+
2. Write the harness as `crates/mosaic-fuzz/fuzz_targets/<name>.rs`
133+
following the session-54-59 template (single-slot variant) or
134+
session-56's `split_three_slots` (combined variant).
135+
3. Add a `[[bin]]` entry to `crates/mosaic-fuzz/Cargo.toml`.
136+
4. Add the target to both PR-mode and nightly-mode matrices in
137+
`.github/workflows/fuzz.yml`.
138+
139+
### Add a new bench
140+
141+
1. For a host-side bench: add a new `[[bench]]` to
142+
`crates/mosaic-bench/Cargo.toml` and a corresponding
143+
`benches/<name>.rs` following the session-51 `phase3_host`
144+
template.
145+
2. For an on-chain CU bench: add a `SystemTarget` entry to the
146+
`TARGETS` const in `crates/mosaic-bench/src/bin/bpf_bench.rs`,
147+
plus an inline `build_*_scaffold_fixture` builder, plus a
148+
dispatch arm in `main`. Sessions 47 + 49 are the canonical
149+
examples.
150+
3. Add the harness to the matrix in `.github/workflows/bench.yml`.
151+
152+
### Add a new shared primitive
153+
154+
1. Add the function to the relevant `mosaic-zk-primitives::*`
155+
module with a `# Errors` docstring section.
156+
2. Write at least one proptest pinning the soundness invariant
157+
(the canonical example is session 63's
158+
`prop_horner_matches_naive_eval` against a naive
159+
sum-of-products implementation).
160+
3. Migrate the first in-tree consumer in a follow-up commit so
161+
the primitive has a real production user from session-1.
162+
4. Track remaining consumers in the commit message; future
163+
sessions migrate them in commit-sized batches.
164+
165+
## What this coverage does NOT pin
166+
167+
**Important for the auditor's threat model.** The coverage above
168+
exercises every byte-format, Fiat-Shamir, state-machine, and
169+
cross-slot invariant the verifier surface is supposed to enforce.
170+
It does NOT exercise:
171+
172+
1. **Real prover output.** All bench + fuzz fixtures are
173+
scaffold-acceptance bytes (uniformly zero or structurally
174+
valid but cryptographically vacuous). A successful BPF run on
175+
the scaffold fixture means "the verifier completes the
176+
pipeline and accepts" — NOT "the verifier rejects every
177+
forged real proof". Closing this gap is the
178+
"fixture-driven differential testing" item in the planned-beyond
179+
block of CHANGELOG.md.
180+
181+
2. **Full cryptographic soundness of Phase-3 reductions.**
182+
HyperPlonk's multi-point-to-univariate reduction
183+
(Zeromorph / PST / Gemini), Halo2's full lookup argument,
184+
Nova's complete folding consistency, FRI-STARK's per-layer
185+
fold + low-degree test — each is implemented as a "scaffold"
186+
that exercises the full syscall chain but stops short of the
187+
complete soundness check. Per-crate rustdoc spells out the
188+
scaffold caveat in detail.
189+
190+
3. **External `solana-program-test` integration for the chunked
191+
upload protocol.** `mosaic-program::chunked::dispatch` has
192+
property tests on its data shape but no integration test
193+
driving the full PDA + AccountInfo lifecycle. Tracked in the
194+
planned-beyond block.
195+
196+
The auditor should treat the coverage matrix as a guarantee that
197+
the verifier surface fails closed on hostile bytes, not as a
198+
guarantee that every reachable code path is cryptographically
199+
sound. The latter requires fixture-driven differential testing
200+
against external prover implementations.
201+
202+
## Where to file issues
203+
204+
| Type | Channel |
205+
|---|---|
206+
| Soundness bug found by fuzzer | GitHub Issues, label `security` |
207+
| Property-test false positive | GitHub Issues, label `audit-coverage` |
208+
| Bench baseline drift | GitHub Issues, label `cu-regression` |
209+
| Vulnerability disclosure | See [SECURITY.md](../SECURITY.md) — DO NOT use GitHub Issues for unreported vulnerabilities |

0 commit comments

Comments
 (0)