|
| 1 | +# Audit-coverage runbook |
| 2 | + |
| 3 | +This file is the entry point for an external review firm that wants |
| 4 | +to reproduce the Mosaic audit-coverage matrix locally and extend it |
| 5 | +with their own tests. It complements [`AUDIT.md`](../AUDIT.md) |
| 6 | +(milestone log) and the per-session CHANGELOG entries (work |
| 7 | +provenance). |
| 8 | + |
| 9 | +## Coverage matrix at v0.8.3-shared-primitive-lift |
| 10 | + |
| 11 | +| Surface | Coverage | Source | |
| 12 | +|---|---|---| |
| 13 | +| Property tests | 544 lib tests across 12 crates (+147 proptest in audit-coverage sweep) | sessions 36-66 | |
| 14 | +| BPF CU bench | 7 systems via `solana-program-test` | sessions 47, 49 | |
| 15 | +| Host criterion bench | 5 systems with statistical noise floor | session 51 | |
| 16 | +| Fuzz harnesses | 23 targets across 6 production verifiers | sessions 54-59 | |
| 17 | +| Shared primitives | 8 audit-grade helpers in `mosaic-zk-primitives` | sessions 21-66 | |
| 18 | +| CI activation | All harnesses wired into GitHub Actions | session 61 | |
| 19 | + |
| 20 | +## How to reproduce locally |
| 21 | + |
| 22 | +### Property tests |
| 23 | + |
| 24 | +```bash |
| 25 | +# Per-crate (fast, targeted) |
| 26 | +cargo test -p mosaic-halo2 --features std --lib |
| 27 | +cargo test -p mosaic-hyperplonk --features std --lib |
| 28 | +# ... repeat for each crate |
| 29 | + |
| 30 | +# Full workspace sweep (~30 sec on a stock laptop after warm cache) |
| 31 | +cargo test --workspace --all-features --lib |
| 32 | +``` |
| 33 | + |
| 34 | +Expected output: `test result: ok` for every crate. The reported |
| 35 | +counts in this runbook (e.g. "75 passed" for mosaic-halo2) are |
| 36 | +authoritative as of v0.8.3. |
| 37 | + |
| 38 | +### BPF CU regression bench |
| 39 | + |
| 40 | +```bash |
| 41 | +# 1. Build the SBF program (one-time per change to mosaic-program/). |
| 42 | +cargo build-sbf --tools-version v1.52 \ |
| 43 | + --manifest-path crates/mosaic-program/Cargo.toml |
| 44 | + |
| 45 | +# 2. Set the artifact path (cargo-build-sbf default). |
| 46 | +export BPF_OUT_DIR="$PWD/target/deploy" |
| 47 | +export SBF_OUT_DIR="$PWD/target/deploy" |
| 48 | + |
| 49 | +# 3. Run the bench. Fails on ADR-0005 hard-cap breach (exit 1). |
| 50 | +cargo run --release -p mosaic-bench --bin bpf-bench |
| 51 | +``` |
| 52 | + |
| 53 | +The bench prints a per-system table: |
| 54 | + |
| 55 | +``` |
| 56 | +SYSTEM MEASURED CAP BASELINE STATUS |
| 57 | +groth16_bn254_mul_circuit_1pi 83574 180000 83574 OK |
| 58 | +groth16_batch_n5_mul_circuit_1pi 258397 300000 258397 OK |
| 59 | +plonk_bn254_mul_circuit_1pi 968457 1100000 968457 OK |
| 60 | +hyperplonk_kzg_bn254_scaffold <new> 660000 0 OK |
| 61 | +halo2_kzg_bn254_scaffold <new> 760000 0 OK |
| 62 | +nova_folding_bn254_scaffold <new> 1150000 0 OK |
| 63 | +fri_stark_goldilocks_scaffold <new> 7800000 0 OK |
| 64 | +``` |
| 65 | + |
| 66 | +Hard caps come from ADR-0005 (Phase-1/2) and verifier |
| 67 | +`estimated_compute_units` × 1.30 (Phase-3, sessions 47, 49). New |
| 68 | +Phase-3 baselines start at `0` because the WARN-on-drift logic |
| 69 | +needs an established baseline; the first successful run records the |
| 70 | +baseline for future regression tracking (see |
| 71 | +`crates/mosaic-bench/src/bin/bpf_bench.rs`). |
| 72 | + |
| 73 | +### Host criterion bench |
| 74 | + |
| 75 | +```bash |
| 76 | +# Two bench files; the matrix runs both. |
| 77 | +cargo bench -p mosaic-bench --bench groth16_host |
| 78 | +cargo bench -p mosaic-bench --bench phase3_host |
| 79 | +``` |
| 80 | + |
| 81 | +Criterion writes statistical reports under `target/criterion/`. |
| 82 | +A real algorithmic regression surfaces as a measurable shift |
| 83 | +distinct from JIT/codegen drift. |
| 84 | + |
| 85 | +### Fuzz harnesses |
| 86 | + |
| 87 | +```bash |
| 88 | +# One-time setup |
| 89 | +cargo install cargo-fuzz --locked |
| 90 | + |
| 91 | +# Run any of the 23 harnesses (5-min budget, expand for serious |
| 92 | +# corpus development). |
| 93 | +cargo +nightly fuzz run --fuzz-dir crates/mosaic-fuzz \ |
| 94 | + fuzz_halo2_combined -- -max_total_time=300 |
| 95 | +``` |
| 96 | + |
| 97 | +Full inventory: |
| 98 | + |
| 99 | +| Surface | Targets | |
| 100 | +|---|---| |
| 101 | +| Phase-1 Groth16 | `fuzz_groth16_proof_bytes`, `fuzz_vk_bytes`, `fuzz_public_inputs` | |
| 102 | +| Phase-2 PLONK | `fuzz_plonk_{proof_bytes, vk_bytes, public_inputs, combined}` | |
| 103 | +| Phase-3 HyperPlonk | `fuzz_hyperplonk_{proof_bytes, vk_bytes, public_inputs, combined}` | |
| 104 | +| Phase-3 Halo2 | `fuzz_halo2_{proof_bytes, vk_bytes, public_inputs, combined}` | |
| 105 | +| Phase-3 Nova | `fuzz_nova_{proof_bytes, vk_bytes, public_inputs, combined}` | |
| 106 | +| Phase-3 STARK | `fuzz_stark_{proof_bytes, vk_bytes, public_inputs, combined}` | |
| 107 | + |
| 108 | +The combined-slot fuzzers split the libfuzzer input into three |
| 109 | +length-prefixed sub-buffers (vk, proof, public_inputs) and explore |
| 110 | +cross-slot interaction surface — bugs that only surface when two |
| 111 | +slots lie about the same shape parameter in a coordinated way. |
| 112 | +Halo2 and STARK have the richest cross-check fingerprints; PLONK |
| 113 | +the narrowest. |
| 114 | + |
| 115 | +## How to extend the coverage |
| 116 | + |
| 117 | +### Add a new property test |
| 118 | + |
| 119 | +1. Pick a soundness-critical invariant (e.g. "every byte flip in |
| 120 | + the commit region must reject"). |
| 121 | +2. Write the test in the relevant crate's `src/<module>.rs` |
| 122 | + `#[cfg(test)] mod tests {}` block, prefixed with `proptest_*`. |
| 123 | +3. Add a docstring explaining what the property pins and why an |
| 124 | + external auditor cares. The session-37-52 commits are the |
| 125 | + canonical examples; they document false positives inline. |
| 126 | +4. Run locally with `cargo test -p <crate> --features std --lib`. |
| 127 | + |
| 128 | +### Add a new fuzz harness |
| 129 | + |
| 130 | +1. Pick a parser surface not already covered (e.g. a new adapter |
| 131 | + format). |
| 132 | +2. Write the harness as `crates/mosaic-fuzz/fuzz_targets/<name>.rs` |
| 133 | + following the session-54-59 template (single-slot variant) or |
| 134 | + session-56's `split_three_slots` (combined variant). |
| 135 | +3. Add a `[[bin]]` entry to `crates/mosaic-fuzz/Cargo.toml`. |
| 136 | +4. Add the target to both PR-mode and nightly-mode matrices in |
| 137 | + `.github/workflows/fuzz.yml`. |
| 138 | + |
| 139 | +### Add a new bench |
| 140 | + |
| 141 | +1. For a host-side bench: add a new `[[bench]]` to |
| 142 | + `crates/mosaic-bench/Cargo.toml` and a corresponding |
| 143 | + `benches/<name>.rs` following the session-51 `phase3_host` |
| 144 | + template. |
| 145 | +2. For an on-chain CU bench: add a `SystemTarget` entry to the |
| 146 | + `TARGETS` const in `crates/mosaic-bench/src/bin/bpf_bench.rs`, |
| 147 | + plus an inline `build_*_scaffold_fixture` builder, plus a |
| 148 | + dispatch arm in `main`. Sessions 47 + 49 are the canonical |
| 149 | + examples. |
| 150 | +3. Add the harness to the matrix in `.github/workflows/bench.yml`. |
| 151 | + |
| 152 | +### Add a new shared primitive |
| 153 | + |
| 154 | +1. Add the function to the relevant `mosaic-zk-primitives::*` |
| 155 | + module with a `# Errors` docstring section. |
| 156 | +2. Write at least one proptest pinning the soundness invariant |
| 157 | + (the canonical example is session 63's |
| 158 | + `prop_horner_matches_naive_eval` against a naive |
| 159 | + sum-of-products implementation). |
| 160 | +3. Migrate the first in-tree consumer in a follow-up commit so |
| 161 | + the primitive has a real production user from session-1. |
| 162 | +4. Track remaining consumers in the commit message; future |
| 163 | + sessions migrate them in commit-sized batches. |
| 164 | + |
| 165 | +## What this coverage does NOT pin |
| 166 | + |
| 167 | +**Important for the auditor's threat model.** The coverage above |
| 168 | +exercises every byte-format, Fiat-Shamir, state-machine, and |
| 169 | +cross-slot invariant the verifier surface is supposed to enforce. |
| 170 | +It does NOT exercise: |
| 171 | + |
| 172 | +1. **Real prover output.** All bench + fuzz fixtures are |
| 173 | + scaffold-acceptance bytes (uniformly zero or structurally |
| 174 | + valid but cryptographically vacuous). A successful BPF run on |
| 175 | + the scaffold fixture means "the verifier completes the |
| 176 | + pipeline and accepts" — NOT "the verifier rejects every |
| 177 | + forged real proof". Closing this gap is the |
| 178 | + "fixture-driven differential testing" item in the planned-beyond |
| 179 | + block of CHANGELOG.md. |
| 180 | + |
| 181 | +2. **Full cryptographic soundness of Phase-3 reductions.** |
| 182 | + HyperPlonk's multi-point-to-univariate reduction |
| 183 | + (Zeromorph / PST / Gemini), Halo2's full lookup argument, |
| 184 | + Nova's complete folding consistency, FRI-STARK's per-layer |
| 185 | + fold + low-degree test — each is implemented as a "scaffold" |
| 186 | + that exercises the full syscall chain but stops short of the |
| 187 | + complete soundness check. Per-crate rustdoc spells out the |
| 188 | + scaffold caveat in detail. |
| 189 | + |
| 190 | +3. **External `solana-program-test` integration for the chunked |
| 191 | + upload protocol.** `mosaic-program::chunked::dispatch` has |
| 192 | + property tests on its data shape but no integration test |
| 193 | + driving the full PDA + AccountInfo lifecycle. Tracked in the |
| 194 | + planned-beyond block. |
| 195 | + |
| 196 | +The auditor should treat the coverage matrix as a guarantee that |
| 197 | +the verifier surface fails closed on hostile bytes, not as a |
| 198 | +guarantee that every reachable code path is cryptographically |
| 199 | +sound. The latter requires fixture-driven differential testing |
| 200 | +against external prover implementations. |
| 201 | + |
| 202 | +## Where to file issues |
| 203 | + |
| 204 | +| Type | Channel | |
| 205 | +|---|---| |
| 206 | +| Soundness bug found by fuzzer | GitHub Issues, label `security` | |
| 207 | +| Property-test false positive | GitHub Issues, label `audit-coverage` | |
| 208 | +| Bench baseline drift | GitHub Issues, label `cu-regression` | |
| 209 | +| Vulnerability disclosure | See [SECURITY.md](../SECURITY.md) — DO NOT use GitHub Issues for unreported vulnerabilities | |
0 commit comments