Commit 0069991
* feat(security): IS-060 PR-2 — datamarking transform + corpus bed (#90 / Loop 23)
Implements Option C (Microsoft Spotlighting datamarking) from
docs/architecture/SPOTLIGHTING_INDIRECT_INJECTION.md §3.3.
Changes:
- crates/llmtrace-security/src/datamarking.rs (new): pure
DatamarkingTransform + MarkedZone + PUA_RANGE constants. Idempotent,
random-marker-per-request, collision-resampling.
- crates/llmtrace-proxy/src/datamarking_pipeline.rs (new): proxy-side
pipeline. Splices marked content back into the request body and
amends the boundary defense's system reminder with a marker-aware
sentence in active mode only (never in shadow mode, where the content
is not marked and the sentence would be false).
- crates/llmtrace-core/src/lib.rs: DatamarkingConfig + MarkerStrategy
on BoundaryTokenConfig. Default `enabled = false`,
`shadow_mode = true`, `marker_strategy = Randomized`.
- crates/llmtrace-proxy/src/proxy.rs: pipeline runs after boundary
defense per §4.5. Emits a Severity::Info `spotlighting_applied`
finding for audit-trail (action_router.rs:192 filters Info out).
- crates/llmtrace-proxy/src/metrics.rs: four Prometheus counters —
spotlighting_zones_total{kind,shadow}, byte_delta_total{shadow},
marker_collision_total, failures_total{reason}.
- config.example.yaml: documents `boundary_defense.datamarking`.
- benchmarks/benches/datamarking.rs (new): criterion bench — apply on
small/medium/large_16k zones, fixed and randomized strategies.
- benchmarks/attacks/indirect_injection/: 5 new YAMLs (3 BIPIA Email
QA + 2 authored summarization), tagged `is-060-pr-2-bed`. The
authored ones cover the BIPIA snapshot gap (no `summarization`
subcategory in the on-disk dataset).
- scripts/e2e/seed_is_060_pr_2_corpus.py (new): deterministic seeder,
prints only metadata — never echoes BIPIA prompt content to stdout.
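A minimal sketch of the datamarking idea described for datamarking.rs, not the code in this commit: whitespace in an untrusted data zone is replaced with a marker drawn from the Unicode Private Use Area, and the marker is resampled if it already occurs in the content. The names `pick_marker` and `datamark`, the `seed` parameter (standing in for a per-request random value), and the linear resampling order are all assumptions made for illustration; the real `PUA_RANGE` constant and `DatamarkingTransform` API may differ.

```rust
/// Basic Multilingual Plane Private Use Area (U+E000..=U+F8FF).
/// Assumption: the commit's PUA_RANGE constant may cover a different span.
const PUA_START: u32 = 0xE000;
const PUA_END: u32 = 0xF8FF;

/// Hypothetical helper: returns the first candidate marker that does not
/// already occur in `content`, plus the number of collisions skipped.
/// `seed` stands in for a per-request random value.
fn pick_marker(content: &str, seed: u32) -> Option<(char, u32)> {
    let span = PUA_END - PUA_START + 1;
    for attempt in 0..span {
        let code_point = PUA_START + seed.wrapping_add(attempt) % span;
        let marker = char::from_u32(code_point)?;
        if !content.contains(marker) {
            return Some((marker, attempt));
        }
    }
    // Every PUA code point already appears in the content.
    None
}

/// Hypothetical helper: replaces each Unicode whitespace character with the
/// chosen marker. U+200B (zero-width space) is not Unicode whitespace, so
/// `char::is_whitespace` leaves it untouched.
fn datamark(content: &str, seed: u32) -> Option<String> {
    let (marker, _collisions) = pick_marker(content, seed)?;
    Some(
        content
            .chars()
            .map(|c| if c.is_whitespace() { marker } else { c })
            .collect(),
    )
}
```

Under these assumptions a second pass over already-marked text is a no-op: the old marker collides, a new one is picked, and no whitespace is left to substitute.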
Test coverage:
- 14 unit tests in datamarking.rs (marker substitution on every
whitespace class, ZWSP-not-substituted by design, idempotence,
collision resampling, PUA randomness, reminder addendum,
multi-zone order).
- 5 unit tests in core DatamarkingConfig (defaults, serde round-trip,
partial-override shadow default = true).
- 8 unit tests in datamarking_pipeline (disabled/shadow/active modes,
idempotence, instruction passthrough, collision counting).
- 5 integration tests in tests/integration_test.rs (proxy-level
disabled passthrough, shadow forwards original, active substitutes
in data zone only, composes after boundary defense per §4.5,
spotlighting_applied Info finding emitted).
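The disabled/shadow/active behaviour exercised by these tests can be sketched as below. This is an illustration under stated assumptions, not the pipeline from this commit: `ConfigSketch`, `Mode`, `Outcome`, and `run_pipeline` are hypothetical names, only the two flags and their defaults (`enabled = false`, `shadow_mode = true`) come from the commit message, and recording a byte delta in shadow mode is inferred from the `shadow` label on the metrics.

```rust
/// Stand-in for the two DatamarkingConfig flags named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ConfigSketch {
    enabled: bool,
    shadow_mode: bool,
}

impl Default for ConfigSketch {
    /// Defaults as stated in the commit: disabled, shadow mode on.
    fn default() -> Self {
        Self { enabled: false, shadow_mode: true }
    }
}

#[derive(Debug, PartialEq, Eq)]
enum Mode {
    Disabled,
    Shadow,
    Active,
}

impl ConfigSketch {
    fn mode(&self) -> Mode {
        match (self.enabled, self.shadow_mode) {
            (false, _) => Mode::Disabled,
            (true, true) => Mode::Shadow,
            (true, false) => Mode::Active,
        }
    }
}

/// Hypothetical result type: what is sent upstream and what is observed.
struct Outcome {
    forwarded: String,
    byte_delta: Option<i64>,
    reminder_amended: bool,
}

/// Marks the data zone, then decides per mode what is actually forwarded.
fn run_pipeline(cfg: ConfigSketch, data_zone: &str, marker: char) -> Outcome {
    let marked: String = data_zone
        .chars()
        .map(|c| if c.is_whitespace() { marker } else { c })
        .collect();
    let delta = marked.len() as i64 - data_zone.len() as i64;
    match cfg.mode() {
        // Disabled: passthrough, nothing measured.
        Mode::Disabled => Outcome {
            forwarded: data_zone.to_string(),
            byte_delta: None,
            reminder_amended: false,
        },
        // Shadow: measure the transform but forward the original bytes and
        // leave the system reminder alone.
        Mode::Shadow => Outcome {
            forwarded: data_zone.to_string(),
            byte_delta: Some(delta),
            reminder_amended: false,
        },
        // Active: forward the marked content and amend the reminder.
        Mode::Active => Outcome {
            forwarded: marked,
            byte_delta: Some(delta),
            reminder_amended: true,
        },
    }
}
```

Keeping the reminder untouched in shadow mode matches the rationale in the change list: the upstream model only hears about markers when the forwarded content actually contains them.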
* fix(corpus): drop 5 redundant scenarios — PR-3 covers these BIPIA rows
After merging PR-3 (#213, BIPIA corpus expansion across 5 tasks) into
main, the 5 corpus YAMLs that PR-2 bundled as a datamarking test bed
are now redundant:
- bipia-bipia-attack-email-00000-013.yaml — PR-3 imported row 00000 as 011
- bipia-bipia-attack-email-00001-014.yaml — PR-3 imported row 00001 as 012
- bipia-bipia-attack-email-00002-015.yaml — PR-3 imported row 00002 as 013
- is-060-pr2-authored-summarization-001-011.yaml — PR-3 synthesised 7 summarisation scenarios (#99001–99007)
- is-060-pr2-authored-summarization-002-012.yaml — same as above
PR-3's coverage is more diverse (9 email + 7 table + 3 code + 7 web +
7 summarisation, 33 total) and authoritative as the project's BIPIA
corpus. PR-2's value is the datamarking transform + tests + metrics
+ config. The transform doesn't depend on these YAMLs existing —
the new corpus from PR-3 is the measurement bed.
Validator: 91/91 scenarios valid (58 prior + 33 from PR-3).
1 parent 43272cf commit 0069991
12 files changed
Lines changed: 1963 additions & 3 deletions
File tree
- benchmarks
  - benches
- crates
  - llmtrace-core/src
  - llmtrace-proxy
    - src
    - tests
  - llmtrace-security/src
- scripts/e2e