Commit d1f922b

anhmtk and cursoragent committed
docs: add StillMe Lite integration spec and case study assets
Define a lightweight integration contract and repeatable before/after evaluation template so contributors can measure hallucination-risk reduction with auditable metrics.

Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent cb6704d commit d1f922b

5 files changed

Lines changed: 515 additions & 0 deletions

README.md

Lines changed: 4 additions & 0 deletions
@@ -27,6 +27,8 @@ It continuously learns, cites sources, and refuses to fabricate when evidence is
 - ▶️ **Start here**: [`docs/SUMMARY.md`](docs/SUMMARY.md)
 - 🧭 **Constitution**: [`docs/CONSTITUTION.md`](docs/CONSTITUTION.md)
 - 🧱 **Architecture**: [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
+- 🧩 **Lite Integration Spec**: [`docs/STILLME_LITE_SPEC.md`](docs/STILLME_LITE_SPEC.md)
+- 📈 **Case Study Template**: [`docs/CASE_STUDY_TEMPLATE.md`](docs/CASE_STUDY_TEMPLATE.md)
 - 🧪 **API Docs**: `http://localhost:8000/docs`
 - 🤝 **Contribute**: [`CONTRIBUTING.md`](CONTRIBUTING.md)

@@ -668,6 +670,8 @@ We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed se
 - [`docs/VALIDATION_CHAIN_SPEC.md`](docs/VALIDATION_CHAIN_SPEC.md) - Must-pass vs warning validators
 - [`docs/NO_SOURCE_POLICY.md`](docs/NO_SOURCE_POLICY.md) - Refusal rules when sources are missing
 - [`docs/AUDIT_GUIDE.md`](docs/AUDIT_GUIDE.md) - Contributor audit checklist
+- [`docs/STILLME_LITE_SPEC.md`](docs/STILLME_LITE_SPEC.md) - Lightweight integration contract (monitor/warn/enforce)
+- [`docs/CASE_STUDY_TEMPLATE.md`](docs/CASE_STUDY_TEMPLATE.md) - Before/after validation report template and metrics
 - [`docs/API_DOCUMENTATION.md`](docs/API_DOCUMENTATION.md) - Complete API reference
 - [`docs/DEPLOYMENT_GUIDE.md`](docs/DEPLOYMENT_GUIDE.md) - Deployment instructions
 - [`docs/PAPER_TABLES_FIGURES.md`](docs/PAPER_TABLES_FIGURES.md) - Tables and figures for the paper

docs/CASE_STUDY_TEMPLATE.md

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
# StillMe Case Study Template (Before vs After Validation)

Use this template to produce audit-grade evidence for StillMe Lite performance.

Goal: show measurable improvement after enabling verification behavior.

---

## 1) Case Study Metadata

- Case ID:
- Date:
- Owner:
- Domain (research/support/policy/other):
- Language:
- LLM model:
- Retrieval setup:
- StillMe Lite mode (`monitor|warn|enforce`):
- Policy file version:

---

## 2) Test Scope

- Total prompts:
- Prompt set source:
- Prompt categories:
  - Factual query
  - Source-required query
  - Ambiguous query
  - Adversarial/prompt-injection

Rules:
- Keep prompt set fixed between before/after runs.
- Use same model and retrieval settings for fair comparison.
- Store raw runs in JSONL for reproducibility.
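
The JSONL rule above can be sketched as follows. This is a minimal illustration, not the project's actual logging code: the record fields (`prompt_id`, `category`, `decision`) and the helper names are assumptions made for the example, since the template does not define a record schema.

```python
import json

def dumps_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "".join(json.dumps(rec, ensure_ascii=False) + "\n" for rec in records)

def loads_jsonl(text):
    """Parse JSONL text back into a list of dicts, skipping blank lines."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]

# Hypothetical run records; the field names are illustrative only.
run = [
    {"prompt_id": "p001", "category": "factual", "decision": "answer"},
    {"prompt_id": "p002", "category": "adversarial", "decision": "refuse"},
]
text = dumps_jsonl(run)
```

Writing `text` to a file such as `data/before_<case_id>.jsonl` keeps each run replayable line by line, and `loads_jsonl` recovers the same records.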

---

## 3) Experimental Setup

### 3.1 Before (Baseline)
- Validation disabled or bypassed.
- Record model outputs and available context.

### 3.2 After (StillMe Enabled)
- Validation enabled with selected mode.
- Record decision, reason codes, and safe response.

### 3.3 Data Files
- `data/before_<case_id>.jsonl`
- `data/after_<case_id>.jsonl`
- `reports/<case_id>_summary.md`

---

## 4) Mandatory Metrics

### 4.1 Hallucination Escape Rate

Definition:
`unsupported_factual_answers_passed / total_high_risk_factual_prompts`

Interpretation:
- Lower is better.
- This is the primary risk metric.
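
As a rough sketch of the definition above, the rate could be computed from per-prompt records like this. The fields `high_risk_factual`, `decision`, and `supported` are hypothetical labels assumed for the example (for instance, from manual annotation); the template does not prescribe them.

```python
def hallucination_escape_rate(records):
    """unsupported_factual_answers_passed / total_high_risk_factual_prompts.

    Returns None when there are no high-risk factual prompts, because the
    ratio is undefined in that case.
    """
    high_risk = [r for r in records if r["high_risk_factual"]]
    if not high_risk:
        return None
    # An "escape" is an answer that was passed through without support.
    escaped = sum(
        1 for r in high_risk if r["decision"] == "answer" and not r["supported"]
    )
    return escaped / len(high_risk)

# Hypothetical annotated records (assumed schema).
sample = [
    {"high_risk_factual": True, "decision": "answer", "supported": False},
    {"high_risk_factual": True, "decision": "answer", "supported": True},
    {"high_risk_factual": True, "decision": "refuse", "supported": False},
    {"high_risk_factual": True, "decision": "answer", "supported": True},
    {"high_risk_factual": False, "decision": "answer", "supported": False},
]
```

With these five records, four are high-risk and one of them escaped, so the rate is 1/4.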

### 4.2 Refusal Precision

Definition:
`correct_refusals / total_refusals`

Interpretation:
- Higher is better.
- Measures whether refusals are justified instead of over-blocking.
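
A minimal sketch of this ratio, assuming each record carries a `decision` field and a reviewer-assigned `refusal_justified` flag. Both names are assumptions for illustration, not part of the template.

```python
def refusal_precision(records):
    """correct_refusals / total_refusals, or None if nothing was refused."""
    refusals = [r for r in records if r["decision"] == "refuse"]
    if not refusals:
        return None
    correct = sum(1 for r in refusals if r["refusal_justified"])
    return correct / len(refusals)

# Hypothetical records: four refusals, three judged justified by a reviewer.
sample = [
    {"decision": "refuse", "refusal_justified": True},
    {"decision": "refuse", "refusal_justified": True},
    {"decision": "refuse", "refusal_justified": False},
    {"decision": "refuse", "refusal_justified": True},
    {"decision": "answer", "refusal_justified": False},
]
```

Returning `None` rather than 0 for a run with no refusals avoids reporting a misleading precision when the metric is simply undefined.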

### 4.3 Source Coverage

Definition:
`factual_answers_with_valid_citation / total_factual_answers`

Interpretation:
- Higher is better.
- Measures evidence grounding quality.
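
One possible reading of this definition in code, assuming records with a `category` field, a `decision` field, and a `citation_valid` flag set during review. All three names are hypothetical, and what counts as a "valid" citation is left to the reviewer.

```python
def source_coverage(records):
    """factual_answers_with_valid_citation / total_factual_answers."""
    factual_answers = [
        r for r in records
        if r["category"] == "factual" and r["decision"] == "answer"
    ]
    if not factual_answers:
        return None  # undefined when no factual prompt was answered
    cited = sum(1 for r in factual_answers if r["citation_valid"])
    return cited / len(factual_answers)

# Hypothetical records: four factual answers, two with a valid citation.
sample = [
    {"category": "factual", "decision": "answer", "citation_valid": True},
    {"category": "factual", "decision": "answer", "citation_valid": False},
    {"category": "factual", "decision": "answer", "citation_valid": True},
    {"category": "factual", "decision": "answer", "citation_valid": False},
    {"category": "factual", "decision": "refuse", "citation_valid": False},
    {"category": "ambiguous", "decision": "answer", "citation_valid": False},
]
```

Refusals and non-factual prompts are excluded from the denominator here, which is an interpretation of "total factual answers" rather than something the template states explicitly.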

---

## 5) Optional Supporting Metrics

- False refusal rate
- Clarification usefulness rate
- Mean validator confidence band distribution
- Decision latency delta (before vs after)

---

## 6) Output Tables

### 6.1 Core Metrics Table

| Metric | Before | After | Delta | Target |
|---|---:|---:|---:|---:|
| Hallucination escape rate | | | | lower |
| Refusal precision | | | | >= 0.85 |
| Source coverage | | | | >= 0.80 |
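
The table rows above could be filled in programmatically. The helper below is a sketch that assumes Delta means `after - before` and that two decimal places are enough; neither convention is stated in the template, and the function name is made up for the example.

```python
def metric_row(name, before, after, target):
    """Format one row of the core metrics table (Delta = after - before)."""
    delta = after - before
    return f"| {name} | {before:.2f} | {after:.2f} | {delta:+.2f} | {target} |"

# Hypothetical numbers, for illustration only.
rows = [
    metric_row("Hallucination escape rate", 0.40, 0.10, "lower"),
    metric_row("Refusal precision", 0.70, 0.90, ">= 0.85"),
]
```

The explicit sign on the delta makes the direction of change visible at a glance, which matters because lower is better for one metric and higher is better for the other two.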

### 6.2 Decision Distribution

| Decision | Before | After |
|---|---:|---:|
| answer | | |
| refuse | | |
| ask_clarify | | |
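
The counts for this table can be tallied directly from the raw records. This sketch assumes each record has a `decision` field holding one of the three values in the table; the field name is an assumption.

```python
from collections import Counter

DECISIONS = ("answer", "refuse", "ask_clarify")

def decision_distribution(records):
    """Count records per decision, reporting 0 for decisions never taken."""
    counts = Counter(r["decision"] for r in records)
    return {d: counts.get(d, 0) for d in DECISIONS}

# Hypothetical after-run decisions.
after_run = [
    {"decision": "answer"},
    {"decision": "refuse"},
    {"decision": "answer"},
    {"decision": "ask_clarify"},
    {"decision": "answer"},
]
```

Running the same function over the before and after files gives the two columns of the table.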

---

## 7) Error Analysis (Required)

Provide 5-10 representative examples:

1. **Escaped hallucination (before)**
   - Prompt:
   - Baseline answer:
   - Why unsafe:

2. **Correct refusal (after)**
   - Prompt:
   - StillMe decision/reason:
   - Why correct:

3. **False refusal (after, if any)**
   - Prompt:
   - Reason code:
   - Fix candidate:

4. **Citation quality improvement example**
   - Prompt:
   - Before citation state:
   - After citation state:

---

## 8) Go / No-Go Rule

Recommended internal gate for moving from `monitor` to `warn`:
- Hallucination escape rate reduced by at least 50% vs before
- Refusal precision >= 0.85
- Source coverage >= 0.80 on factual subset
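
The three conditions above can be expressed as a single check. This sketch reads "reduced by at least 50%" as a relative reduction, `(before - after) / before >= 0.5`, and treats a baseline escape rate of zero as a failed gate because the reduction is undefined. Both readings are assumptions, as is the function name.

```python
def monitor_to_warn_gate(escape_before, escape_after, refusal_precision, source_coverage):
    """Return True only if all three monitor-to-warn conditions hold."""
    if escape_before <= 0:
        # Relative reduction is undefined without a non-zero baseline (assumption).
        return False
    reduction = (escape_before - escape_after) / escape_before
    return (
        reduction >= 0.50
        and refusal_precision >= 0.85
        and source_coverage >= 0.80
    )
```

For example, a drop from 0.40 to 0.15 is a 62.5% relative reduction and passes the first condition, while a drop from 0.40 to 0.25 (37.5%) does not.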

Recommended gate for moving from `warn` to `enforce`:
- Metrics stable for 2 consecutive runs
- No critical incident in sampled production traffic
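
The template does not define "stable", so the sketch below picks one possible interpretation: every metric moves by no more than an absolute tolerance between the two most recent runs. The tolerance value, the metric keys, and the `critical_incidents` count are all assumptions for illustration.

```python
def warn_to_enforce_gate(previous_run, latest_run, critical_incidents, tolerance=0.02):
    """Check stability across two consecutive runs and the absence of incidents.

    previous_run / latest_run: dicts mapping metric name to value.
    tolerance: assumed maximum absolute change that still counts as stable.
    """
    stable = all(
        abs(latest_run[name] - previous_run[name]) <= tolerance
        for name in previous_run
    )
    return stable and critical_incidents == 0

# Hypothetical metric snapshots from two consecutive runs.
run_a = {"escape_rate": 0.10, "refusal_precision": 0.90, "source_coverage": 0.85}
run_b = {"escape_rate": 0.11, "refusal_precision": 0.90, "source_coverage": 0.85}
run_c = {"escape_rate": 0.20, "refusal_precision": 0.90, "source_coverage": 0.85}
```

A team adopting this gate would likely want to agree on the tolerance up front and record it alongside the policy file version.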

---

## 9) Risk and Limitation Notes

- Known blind spots:
- Data quality issues:
- Retrieval mismatch observations:
- Policy thresholds that need tuning:

Always include limitations. Do not claim universal safety guarantees.

---

## 10) Short Public Summary (Optional)

Use this 4-line format for release notes:

1. What was tested (scope and sample size)
2. What improved (with numbers)
3. What did not improve yet
4. Next action for the next iteration
