feat(phase-2): narrative .docx memo + dual-output ycai report #15
…udit
What ships
- src/ycai/reports/docx.py: 9-section narrative memo per USER.md
document-format discipline. Title, headline, coverage methodology,
the agentic batch (capability heatmap), industry distribution, tech
stack + OSS posture, six company spotlights, unanswered questions,
reproducibility footer. Same analytics.py math as the deck. Same
Layer 2 audit pre-write.
- src/ycai/reports/anti_hallucination.py: date-pattern stripping
extended to YC-batch labels ('Winter 2026') and bare 4-digit years
flanked by non-digits. The drift checker no longer surfaces years
as numerical drift.
- src/ycai/cli.py: 'ycai report <run-dir>' now produces both deck +
memo by default. --deck-only / --memo-only constrain.
- pyproject.toml: python-docx>=1.1 dep, mypy override extended to
ycai.reports.docx (same untyped-import situation as ppt).
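The date-pattern stripping described in the anti_hallucination.py item above might look roughly like the sketch below. The pattern names, the helper names, and the restriction of "years" to 19xx/20xx are assumptions made for illustration; the real patterns in anti_hallucination.py may differ.

```python
import re

# Hypothetical patterns, not copied from anti_hallucination.py.
# YC-batch labels such as 'Winter 2026'.
BATCH_LABEL = re.compile(r"\b(?:Winter|Spring|Summer|Fall)\s+\d{4}\b")
# Bare 4-digit years flanked by non-digits (assumed here to mean 19xx or 20xx).
BARE_YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")


def strip_date_patterns(text: str) -> str:
    """Blank out batch labels and bare years so they are not read as data."""
    text = BATCH_LABEL.sub(" ", text)
    return BARE_YEAR.sub(" ", text)


def numbers_in(text: str) -> list[float]:
    """Extract the numbers a drift checker would still need to verify."""
    cleaned = strip_date_patterns(text)
    return [float(m) for m in re.findall(r"\d+(?:\.\d+)?", cleaned)]
```

With patterns like these, a sentence mentioning 'Winter 2026' and a count of 41 would leave only 41 for the drift checker, while a 5-digit figure such as 12026 is left untouched because its digits are not flanked by non-digits.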
Tests: 4 new docx tests (149 total). Validates the .docx is a valid
zip with word/document.xml, contains 'coverage' + 'agents' in the
body, embeds >=3 chart images, aborts on a forbidden phrase smuggled
into a company rationale, builds even with empty quote candidates.
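The structural checks named above (valid zip, word/document.xml present, at least 3 embedded chart images) can be approximated with the standard library alone. This is a sketch, not the project's test code: the helper names are hypothetical, and counting entries under word/media/ assumes the images are stored there, which is where python-docx normally places them.

```python
import io
import zipfile


def is_docx_like(data: bytes) -> bool:
    """True if data is a zip archive that contains word/document.xml."""
    if not zipfile.is_zipfile(io.BytesIO(data)):
        return False
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return "word/document.xml" in zf.namelist()


def count_media_images(data: bytes) -> int:
    """Count archive members under word/media/ (assumed image location)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return sum(1 for name in zf.namelist() if name.startswith("word/media/"))
```

A test could read the generated report's bytes, assert is_docx_like(data), and assert count_media_images(data) >= 3.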
Real W26 memo captured at examples/output/report-w26-pr15-2026-05-01.docx.
4 chart PNGs (capability heatmap, industry bar, OSS pie, tech stack
bar). ~47 paragraphs. Layer 2 audit clean on the real run.
Phase 2 of the project plan is now shipped: depth=1 crawler (PR #11),
ECharts dashboard (PR #12), .pptx deck (PR #14), .docx memo (PR #15).
Phase 3 (Chrome extension) lives at the v1.0 milestone.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Code Review
This pull request implements narrative .docx memo generation, adding a new report builder in src/ycai/reports/docx.py and updating the CLI to support the format. The implementation includes integration with the anti-hallucination audit system and expanded regex for date stripping. Feedback identifies a risk of hallucination from hardcoded fallbacks in the prose and a bug in the audit allowlist that would cause failures due to missing section markers.
    f"capabilities are {capability.most_common(2)[1][0] if len(capability) > 1 else 'rag'} and "
    f"{capability.most_common(3)[2][0] if len(capability) > 2 else 'data-pipeline'}. The heatmap above "
The fallbacks 'rag' and 'data-pipeline' are hardcoded and will be used if the analysis cohort has fewer than 3 distinct capabilities. This introduces a potential hallucination in the report itself, which contradicts the project's anti-hallucination goals. It is better to dynamically construct this sentence based on the available data.
Suggested change:
-    f"capabilities are {capability.most_common(2)[1][0] if len(capability) > 1 else 'rag'} and "
-    f"{capability.most_common(3)[2][0] if len(capability) > 2 else 'data-pipeline'}. The heatmap above "
+    f"capabilities are {', and '.join(name for name, _ in capability.most_common(3)[1:]) if len(capability) > 1 else 'N/A'}. The heatmap above "
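One way to build this sentence only from names that exist in the data, without any fallback label, is sketched below. The function name and the exact wording of the returned sentence are hypothetical; only the use of capability.most_common comes from the reviewed code.

```python
from __future__ import annotations

from collections import Counter


def runner_up_clause(capability: Counter[str]) -> str:
    """Describe the 2nd and 3rd most common capabilities, if they exist.

    Returns an empty string when there is no runner-up, so no placeholder
    name ever reaches the prose.
    """
    runners = [name for name, _ in capability.most_common(3)[1:]]
    if not runners:
        return ""
    if len(runners) == 1:
        return f"The next most common capability is {runners[0]}."
    return f"The next most common capabilities are {runners[0]} and {runners[1]}."
```

Branching on the number of runner-ups also avoids the ", and" join producing an awkward clause when only two names are present.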
        sum(c for _, c in capability.most_common(3)),
        sum(c for _, c in oss_posture.most_common(3)),
    )
    infra_facts: tuple[float, ...] = (4.6, 5, 30, 2, 1, *derived_sums)
The infra_facts allowlist contains magic numbers (4.6, 5, 30) that are not present in the .docx prose (unlike the .pptx methodology slide). Conversely, it is missing the number 3, which appears in the 'Unanswered questions' section as a list marker (3.). Since the drift auditor checks numbers >= 2, the absence of 3 in the allowlist will cause the Layer 2 audit to fail if no other data point happens to be exactly 3.
Suggested change:
-    infra_facts: tuple[float, ...] = (4.6, 5, 30, 2, 1, *derived_sums)
+    infra_facts: tuple[float, ...] = (1, 2, 3, *derived_sums)
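To make the failure mode concrete, here is a toy drift check, assuming the auditor extracts every number from the prose and rejects any value >= 2 that is in neither the source data nor the allowlist. find_drift and its signature are hypothetical; the real Layer 2 auditor may extract and compare numbers differently.

```python
import re


def find_drift(
    prose: str,
    source_numbers: set[float],
    extra_allowed: tuple[float, ...] = (),
) -> list[float]:
    """Return numbers >= 2 found in prose that are not otherwise accounted for."""
    allowed = set(source_numbers) | set(extra_allowed)
    found = [float(m) for m in re.findall(r"\d+(?:\.\d+)?", prose)]
    return [n for n in found if n >= 2 and n not in allowed]


# A list marker '3.' in the prose reads as the number 3.
prose = "3. Which companies will open-source? 41 mentions."
missing_three = find_drift(prose, {41.0}, extra_allowed=(1, 2))
with_three = find_drift(prose, {41.0}, extra_allowed=(1, 2, 3))
```

Under these assumptions, missing_three contains 3.0 because of the list marker, and with_three is empty once 3 is in the allowlist.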
Phase 2 release. Cumulative changes since 0.1.0:
- PR #11: depth=1 polite website crawl, OSS posture unknown rate 55% -> 21%, identified tech-stack mentions 14 -> 41
- PR #12: Apache ECharts replaces static CSS bars in dashboard
- PR #14: VC-style .pptx deck + anti-hallucination Layer 2
- PR #15: narrative .docx memo + dual-output ycai report
149 tests passing. Mypy strict clean. python -m build produces a clean 0.2.0 wheel + sdist. CHANGELOG and README updated to reflect shipped Phase 2 status.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What
Phase 2 part 2: 9-section narrative .docx memo via python-docx. Same analytics.py math as the deck, same Layer 2 audit. ycai report now produces both deck.pptx and report.docx by default. With this PR, all of Phase 2 is shipped: depth=1 crawler + ECharts dashboard + PPT + DOCX, all anchored to a single source of chart math.
Sample artifact: examples/output/report-w26-pr15-2026-05-01.docx.
Memo structure
Per USER.md document-format discipline (narrative memos = 2-5 pages with appendices):
… unknown caveat made explicit
Layer 2 audit refinements
The W26 memo's first build surfaced two real edge cases the auditor was right to flag:
… derived_sums (top-3, top-5) to extra_allowed so the auditor verifies them rather than rejects them as drift.
Both adjustments tighten correctness, not loosen it: the auditor still catches actually-invented numbers and flags forbidden phrases. The trap-resistance suite from PR #2 still passes unchanged.
Test plan
… --strict clean
make publish-check clean
Phase 2 status
.pptx deck + Layer 2 audit
.docx memo + dual-output ycai report
Phase 3 (Chrome extension surface) remains on the v1.0 milestone. The infrastructure for it, the ycai daemon command, is already in v0.1.0; the extension UI hasn't been built yet.
What's next
If you want a v0.2.0 release tag, this is the moment. The Phase 2 surface is feature-complete for both deck and memo, the W26 example artifacts are checked in, and the babysitter routine still pauses cleanly on v0.1.0.
🤖 Generated with Claude Code