Commit dac8e9b

dshkol and claude committed
Add: Provenance-based fabrication prevention checklist
Research finding: Traditional forensic statistics (Benford's Law, terminal digit analysis) don't work for StatCan data due to rounding, suppression, and seasonal adjustment.

New approach emphasizes:
1. Provenance - every number must trace to JSON source
2. Arithmetic verification - recalculate YoY/MoM, verify sums
3. Period matching - article period <= JSON period
4. Variation checks - component values must differ across months

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 151a680 commit dac8e9b

1 file changed

Lines changed: 34 additions & 17 deletions

.claude/skills/the-daily-generator/SKILL.md

Lines changed: 34 additions & 17 deletions
@@ -32,28 +32,45 @@ Every number in the article MUST come from the JSON file:
 
 **If a value isn't in the JSON, don't include it in the article.**
 
-## Red Flags: Fabrication Detection
+## Fabrication Prevention Checklist
 
-Before publishing ANY article, verify these checks pass:
+Before finalizing ANY article, verify all checks pass.
 
-### 1. Subseries/Component Data
-- [ ] Does `subseries[]` array exist in JSON? If empty/missing → omit breakdown from article
-- [ ] Can you cite the exact JSON path for EACH component value?
-- [ ] If multiple months generated: are component values DIFFERENT across months? (Identical = fabricated)
+### 1. Provenance (every number has a source)
 
-### 2. Provincial Data
-- [ ] Does `provincial[]` array exist in JSON? If empty/missing → omit provincial table
-- [ ] Can you cite the exact JSON path for EACH provincial value?
-- [ ] If multiple months generated: are provincial values DIFFERENT across months? (Identical = fabricated)
+**Principle:** Don't ask "does this look like real data?" — ask "can I trace every number to a verified source?"
 
-### 3. YoY Calculations
-- [ ] Cross-validate: Calculate YoY manually from time_series values
-- [ ] Formula: (current_value - year_ago_value) / year_ago_value × 100
-- [ ] If calculated YoY differs from claimed YoY by >0.1pp → STOP and investigate
+- [ ] Headline figure: cite JSON path (e.g., `latest.yoy_pct_change = 2.2`)
+- [ ] Each chart data point: from `time_series[N].value`
+- [ ] Each table cell: from `subseries[N]` or `provincial[N]`
+- [ ] **If you cannot cite the source → DO NOT include the number**
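
The provenance rule can be checked mechanically. The following is a minimal sketch, assuming the JSON has already been loaded into a Python dict and that paths are written in the `latest.yoy_pct_change` / `time_series[N].value` style used in the checklist. The helper names `resolve_path` and `has_provenance` and the sample values are hypothetical and not part of the skill.

```python
import re


def resolve_path(data, path):
    """Follow a path such as 'time_series[1].value' through nested dicts and lists."""
    node = data
    # Each match is either a dict key or a bracketed list index.
    for key, index in re.findall(r"([^.\[\]]+)|\[(\d+)\]", path):
        node = node[key] if key else node[int(index)]
    return node


def has_provenance(data, path, claimed):
    """True only if the path exists in the JSON and holds exactly the claimed value."""
    try:
        return resolve_path(data, path) == claimed
    except (KeyError, IndexError, TypeError):
        # Missing key, index out of range, or indexing into a scalar: no source.
        return False


# Illustrative data only (assumed shape, made-up values).
sample = {
    "latest": {"yoy_pct_change": 2.2},
    "time_series": [{"value": 160.1}, {"value": 163.6}],
}
```

With this sample, `has_provenance(sample, "latest.yoy_pct_change", 2.2)` holds, while any path into a missing `provincial` array fails, which corresponds to the "DO NOT include the number" case.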
 
-### 4. Data-Article Period Match
-- [ ] Does JSON `metadata.reference_period` match the article's reference period?
-- [ ] If generating historical article: verify subseries/provincial data matches that period (not latest)
+### 2. Arithmetic Verification (math must be exact)
+
+- [ ] Recalculate YoY from time_series: `(current - year_ago) / year_ago × 100`
+- [ ] Recalculate MoM from time_series: `(current - previous) / previous × 100`
+- [ ] If trade data: verify `balance = exports - imports` exactly
+- [ ] If components shown: verify they sum to total correctly
+- [ ] **If calculated value differs from claimed by >0.1pp → STOP and investigate**
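
The arithmetic checks reduce to a percentage-change formula, a 0.1 percentage point tolerance, and a sum comparison. The sketch below illustrates them under stated assumptions: the function names are hypothetical, the numbers in the usage note are made up for illustration, and `abs_tol` is an assumed knob that may need loosening where published totals are rounded.

```python
import math

TOLERANCE_PP = 0.1  # threshold from the checklist, in percentage points


def pct_change(current, base):
    """Percentage change from base to current (used for both YoY and MoM)."""
    return (current - base) / base * 100


def matches_claim(calculated, claimed, tolerance_pp=TOLERANCE_PP):
    """False means the checklist says STOP and investigate."""
    return abs(calculated - claimed) <= tolerance_pp


def components_sum_to_total(components, total, abs_tol=1e-9):
    """Check that the listed components add up to the stated total."""
    return math.isclose(sum(components), total, abs_tol=abs_tol)
```

For example, with a year-ago value of 160.0 and a current value of 163.52, `pct_change(163.52, 160.0)` is about 2.2, so a claimed YoY of 2.2 passes and a claimed 2.5 does not. A trade balance can be verified the same way by comparing `exports - imports` against the published balance.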
+
+### 3. Period Match (critical)
+
+- [ ] Article reference period ≤ JSON `metadata.reference_period`
+- [ ] **If article period > JSON period → STOP, data doesn't exist yet**
+
+**Example of failure:** Trade October 2025 article generated from September 2025 JSON → all figures fabricated.
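
The period comparison can be sketched as follows. The format of `metadata.reference_period` is not shown in the diff, so this example assumes a `YYYY-MM` string; the function names are hypothetical.

```python
def parse_period(period):
    """Parse an assumed 'YYYY-MM' string into a comparable (year, month) tuple."""
    year, month = period.split("-")
    return int(year), int(month)


def period_is_available(article_period, json_period):
    """True if the article's period is not later than the JSON reference period."""
    return parse_period(article_period) <= parse_period(json_period)
```

Under that assumption, the failure case above corresponds to `period_is_available("2025-10", "2025-09")`, which is false, so generation should stop.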
+
+### 4. Variation Check (for batch generation)
+
+- [ ] Component values DIFFER across months generated
+- [ ] Provincial values DIFFER across months generated
+- [ ] **If values identical across months → you're copying stale data**
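
One way to automate the variation check during batch generation is to look for months whose breakdown values are exactly the same. This is a sketch only: it assumes the component (or provincial) values for each generated month have been collected into a dict keyed by month, and both the function name and the sample values are hypothetical.

```python
def find_identical_months(values_by_month):
    """Return (earlier_month, later_month) pairs whose value lists are identical."""
    seen = {}
    duplicates = []
    for month, values in values_by_month.items():
        key = tuple(values)
        if key in seen:
            duplicates.append((seen[key], month))
        else:
            seen[key] = month
    return duplicates


# Illustrative batch (made-up values): August repeats July exactly.
batch = {
    "2025-07": [1.2, 3.4],
    "2025-08": [1.2, 3.4],
    "2025-09": [1.3, 3.1],
}
```

Any non-empty result, such as the July/August pair in this batch, suggests stale data was copied forward and is worth investigating before publishing.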
+
+### 5. Subseries/Provincial Data Exists
+
+- [ ] Does `subseries[]` array exist and have entries? If empty → omit breakdown
+- [ ] Does `provincial[]` array exist and have entries? If empty → omit provincial table
+- [ ] Can you cite exact JSON path for EACH breakdown value?
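
The existence check is small enough to express as a single guard. A minimal sketch, assuming the JSON is loaded as a dict with optional top-level `subseries` and `provincial` arrays; `has_entries` is a hypothetical helper name.

```python
def has_entries(data, key):
    """True only if data[key] exists and is a non-empty list."""
    value = data.get(key)
    return isinstance(value, list) and len(value) > 0
```

A generator could call `has_entries(data, "subseries")` before writing a breakdown and `has_entries(data, "provincial")` before writing the provincial table, omitting the section whenever the result is false.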
 
 ## Workflow
 