
Commit 151a680

dshkol and claude committed
Docs: Add 2 new failure modes to data-workflow.md
- Article Generated for Unreleased Period: Trade Oct 2025 case where JSON had Sept data but article claimed Oct (fabricated figures)
- Percentage Fabrication from Dollar Values: Building permits case where LLM invented % when only $ change was available

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 640141c commit 151a680

1 file changed

Lines changed: 48 additions & 0 deletions

.claude/skills/the-daily-generator/references/data-workflow.md

@@ -284,3 +284,51 @@ Never substitute synthetic data.
2. If time_series shows Oct 2024 = X and Oct 2025 = Y, verify (Y-X)/X × 100 matches claimed YoY
3. Be especially careful with small percentage changes (<1%)
4. Double-check decimal places: 0.04% ≠ 0.4% ≠ 4%
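
The YoY check in the steps above can be sketched as follows. This is a minimal illustration, not part of the skill itself: the function names and the `decimals` parameter are hypothetical, and the sample values in the usage note are invented to show the decimal-place problem.

```python
def yoy_percent(previous: float, current: float) -> float:
    """Year-over-year change in percent: (Y - X) / X * 100."""
    if previous == 0:
        raise ValueError("previous value is zero; YoY percentage is undefined")
    return (current - previous) / previous * 100


def matches_claim(previous: float, current: float, claimed_pct: float,
                  decimals: int = 1) -> bool:
    """True if the claim equals the computed YoY change once both are
    rounded to the precision the article reports (assumed: `decimals`)."""
    computed = round(yoy_percent(previous, current), decimals)
    return computed == round(claimed_pct, decimals)
```

With illustrative values X = 2500.0 and Y = 2501.0, the computed change is 0.04%, so `matches_claim(2500.0, 2501.0, 0.04, decimals=2)` passes while claims of 0.4 or 4.0 fail.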

### Article Generated for Unreleased Period (Jan 2026)

**What happened**: International trade article claimed to cover October 2025, but the JSON only contained September 2025 data. October data wasn't released until January 8, 2026.

**The evidence**:
- JSON `reference_period`: "2025-09"
- JSON `end_period`: "2025-09"
- JSON `fetched_at`: "2025-12-23"
- Article claimed: October 2025
- Official October release: 2026-01-08

**Result**: LLM fabricated internally consistent but completely wrong October figures:
- Claimed: exports flat, imports +4.2%, deficit $2.6B
- Actual (released Jan 8): exports +2.1%, imports +3.4%, deficit $583M

**Root cause**: Article was requested for a period beyond what existed in the JSON. Without real data, the LLM invented plausible-looking values that were self-consistent but externally wrong.

**Detection**:
- Internally consistent ≠ externally accurate
- Compare JSON `reference_period` against article's claimed reference period
- If article period > JSON period → data was fabricated

**Prevention**:
1. **NEVER generate articles for periods beyond `metadata.reference_period`**
2. Before generating, verify: `article_period <= JSON.metadata.reference_period`
3. If user requests an unreleased period, STOP and report: "Data not yet available"
4. Check StatCan release schedule before attempting to generate
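
The period check in step 2 can be sketched as a guard that runs before generation. This assumes the JSON exposes `metadata.reference_period` as a `"YYYY-MM"` string, as in the evidence above; the function names `parse_period` and `check_period_available` are hypothetical.

```python
from datetime import date


def parse_period(period: str) -> date:
    """Parse a 'YYYY-MM' string into the first day of that month."""
    year, month = period.split("-")
    return date(int(year), int(month), 1)


def check_period_available(article_period: str, data: dict) -> None:
    """Raise if the requested period is later than the data's reference period.

    Assumes `data` has the shape {"metadata": {"reference_period": "YYYY-MM"}}.
    """
    reference = data["metadata"]["reference_period"]
    if parse_period(article_period) > parse_period(reference):
        raise ValueError(
            f"Data not yet available: requested {article_period}, "
            f"JSON covers through {reference}"
        )
```

For the trade case, calling `check_period_available("2025-10", data)` with a `reference_period` of `"2025-09"` raises, which stops generation before any figures can be invented.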

### Percentage Fabrication from Dollar Values (Jan 2026)

**What happened**: Building permits article showed industrial component +12.5%, but the source only gave a dollar change ("edged down $3.9 million").

**The evidence**:
- Source text: "industrial component edged down $3.9 million"
- Article claimed: Industrial +12.5%
- No base value was available to calculate percentage

**Root cause**: LLM invented a percentage when only the absolute change was provided. Without the denominator, a percentage cannot be calculated.

**Detection**:
- If source shows "$X change" but article shows "Y% change" → verify base value exists
- Percentage requires: (new - old) / old × 100. If "old" is unknown, percentage is fabricated.
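
One way to automate the first detection step is to flag any percentage in the article that never appears in the source text. This is a rough heuristic under stated assumptions, not the skill's actual check: the function name is hypothetical, it only matches plain figures such as `12.5%`, and a flagged figure may still be legitimate if it was computed from base values present in the data, so flags call for review rather than automatic rejection.

```python
import re

# Captures the numeric part of figures such as "12.5%" or "4 %".
PERCENT_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*%")


def unsupported_percentages(article_text: str, source_text: str) -> list[str]:
    """Return percentage figures in the article that do not appear in the source."""
    source_figures = set(PERCENT_PATTERN.findall(source_text))
    return [
        f"{number}%"
        for number in PERCENT_PATTERN.findall(article_text)
        if number not in source_figures
    ]
```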

**Prevention**:
1. Only show percentages when the base value is available (BOTH the before and after values, or the change plus the prior value)
2. If source only provides dollar change, report dollar change (not invented %)
3. Ask: "Can I calculate this percentage from values I have?" If no → don't show %
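
The prevention rules can be sketched as a formatter that adds a percentage only when the denominator is known. The function name, the millions unit, and the base value used in the usage note are assumptions for illustration; the source reports only the $3.9 million change.

```python
from typing import Optional


def describe_change(change_millions: float,
                    old_millions: Optional[float] = None) -> str:
    """Describe a change in dollars, adding a percentage only if the base is known."""
    sign = "-" if change_millions < 0 else "+"
    text = f"{sign}${abs(change_millions):.1f} million"
    if old_millions is None or old_millions == 0:
        # No usable denominator: report the dollar change only.
        return text
    pct = change_millions / old_millions * 100
    return f"{text} ({pct:+.1f}%)"
```

For the building permits case, `describe_change(-3.9)` returns `"-$3.9 million"` with no percentage. Only if a prior value were actually present in the data (say a hypothetical 390.0) would `describe_change(-3.9, 390.0)` return `"-$3.9 million (-1.0%)"`.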
