- Article Generated for Unreleased Period: Trade Oct 2025 case where
JSON had Sept data but article claimed Oct (fabricated figures)
- Percentage Fabrication from Dollar Values: Building permits case
where LLM invented % when only $ change was available
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
.claude/skills/the-daily-generator/references/data-workflow.md
48 additions & 0 deletions
@@ -284,3 +284,51 @@ Never substitute synthetic data.
2. If time_series shows Oct 2024 = X and Oct 2025 = Y, verify (Y-X)/X × 100 matches claimed YoY
3. Be especially careful with small percentage changes (<1%)
4. Double-check decimal places: 0.04% ≠ 0.4% ≠ 4%
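The year-over-year check in the steps above can be sketched in a few lines. This is a minimal illustration, not part of the skill itself: the function names `yoy_percent` and `claim_matches` and the `tolerance` default are assumptions chosen for the example.

```python
def yoy_percent(previous: float, current: float) -> float:
    """Year-over-year change in percent: (Y - X) / X * 100."""
    if previous == 0:
        raise ValueError("previous value is zero; YoY percentage is undefined")
    return (current - previous) / previous * 100


def claim_matches(previous: float, current: float, claimed_pct: float,
                  tolerance: float = 0.05) -> bool:
    """Return True if the claimed YoY figure is within `tolerance`
    percentage points of the value recomputed from the time series.

    `tolerance` is a hypothetical default; tighten it for very small changes.
    """
    return abs(yoy_percent(previous, current) - claimed_pct) <= tolerance


# Hypothetical values: X = 100 (Oct 2024), Y = 104 (Oct 2025).
print(claim_matches(100, 104, 4.0))  # claimed 4% agrees with the series
print(claim_matches(100, 104, 0.4))  # a misplaced decimal is flagged
```

A misplaced decimal (0.4% instead of 4%) differs from the recomputed value by far more than the tolerance, so the check fails.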

### Article Generated for Unreleased Period (Jan 2026)

**What happened**: International trade article claimed to cover October 2025, but JSON only contained September 2025 data. October data wasn't released until January 8, 2026.

**The evidence**:
- JSON `reference_period`: "2025-09"
- JSON `end_period`: "2025-09"
- JSON `fetched_at`: "2025-12-23"
- Article claimed: October 2025
- Official October release: 2026-01-08

**Result**: LLM fabricated internally consistent but completely wrong October figures:
- Actual (released Jan 8): exports +2.1%, imports +3.4%, deficit $583M

**Root cause**: Article was requested for a period beyond what existed in the JSON. Without real data, the LLM invented plausible-looking values that were self-consistent but externally wrong.

**Detection**:
- Internally consistent ≠ externally accurate
- Compare JSON `reference_period` against article's claimed reference period
- If article period > JSON period → data was fabricated

**Prevention**:
1. **NEVER generate articles for periods beyond `metadata.reference_period`**
2. Before generating, verify: `article_period <= JSON.metadata.reference_period`
3. If user requests future period, STOP and report: "Data not yet available"
4. Check StatCan release schedule before attempting to generate
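The period check in the prevention steps above could be enforced before generation with a small guard. This is a sketch under assumptions: it presumes the JSON is already loaded into a dict with a `metadata.reference_period` field in `YYYY-MM` form (as in the evidence above), and the helper names `parse_period` and `check_period` are hypothetical.

```python
from datetime import date


def parse_period(period: str) -> date:
    """Parse a 'YYYY-MM' period string into the first day of that month."""
    year, month = period.split("-")
    return date(int(year), int(month), 1)


def check_period(article_period: str, data: dict) -> None:
    """Raise if the requested article period is later than the data's period.

    Assumes `data["metadata"]["reference_period"]` holds a 'YYYY-MM' string.
    """
    reference = data["metadata"]["reference_period"]
    if parse_period(article_period) > parse_period(reference):
        raise ValueError(
            f"Data not yet available: requested {article_period}, "
            f"JSON covers up to {reference}"
        )


# Mirrors the incident: JSON held 2025-09, article was requested for 2025-10.
data = {"metadata": {"reference_period": "2025-09"}}
check_period("2025-09", data)  # passes silently
try:
    check_period("2025-10", data)
except ValueError as exc:
    print(exc)
```

Raising an exception rather than returning a flag makes it harder for a caller to ignore the failed check and proceed to generation.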

### Percentage Fabrication from Dollar Values (Jan 2026)

**What happened**: Building permits article showed industrial component +12.5%, but source only had dollar change ("edged down $3.9 million").

**The evidence**:
- Source text: "industrial component edged down $3.9 million"
- Article claimed: Industrial +12.5%
- No base value was available to calculate percentage

**Root cause**: LLM invented a percentage when only the absolute change was provided. Without the denominator, a percentage cannot be calculated.

**Detection**:
- If source shows "$X change" but article shows "Y% change" → verify base value exists
- Percentage requires: (new - old) / old × 100. If "old" is unknown, percentage is fabricated.

**Prevention**:
1. Only show percentages when BOTH values (before and after) are available
2. If source only provides dollar change, report dollar change (not invented %)
3. Ask: "Can I calculate this percentage from values I have?" If no → don't show %
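The rule above (show a percentage only when both values exist, otherwise fall back to the dollar change) can be expressed as a small formatting helper. This is an illustrative sketch: the function name `format_change`, its parameters, and the "million" unit are assumptions for the example, not fields from the actual JSON.

```python
from typing import Optional


def format_change(new: Optional[float], old: Optional[float],
                  dollar_change: Optional[float] = None) -> str:
    """Format a change without ever inventing a percentage.

    A percentage is produced only when both `new` and `old` are known
    (and `old` is non-zero). Otherwise the dollar change is reported,
    assumed here to be expressed in millions.
    """
    if new is not None and old is not None and old != 0:
        pct = (new - old) / old * 100
        return f"{pct:+.1f}%"
    if dollar_change is not None:
        sign = "-" if dollar_change < 0 else "+"
        return f"{sign}${abs(dollar_change):,.1f} million"
    return "change not available"


# Mirrors the incident: only "edged down $3.9 million" was in the source.
print(format_change(None, None, dollar_change=-3.9))
# Hypothetical case where both values are known.
print(format_change(110.0, 100.0))
```

With only the dollar change available, the helper reports the dollar figure and never reaches the percentage branch.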