
feat(phase-2): narrative .docx memo + dual-output ycai report #15

Merged: RyanAlberts merged 1 commit into main from phase-2-pr15-memo on May 2, 2026

Conversation

@RyanAlberts
Owner

What

Phase 2 part 2: 9-section narrative .docx memo via python-docx. Same analytics.py math as the deck, same Layer 2 audit. ycai report now produces both deck.pptx and report.docx by default. With this PR, all of Phase 2 is shipped: depth=1 crawler + ECharts dashboard + PPT + DOCX, all anchored to a single source of chart math.

Sample artifact: examples/output/report-w26-pr15-2026-05-01.docx.

Memo structure

Per USER.md document-format discipline (narrative memos = 2-5 pages with appendices):

  1. Title + dateline
  2. Headline finding — one-paragraph summary, drift-checked
  3. Coverage methodology — Tier A/B/C breakdown, Layer 1+2 disclosure
  4. The agentic batch — capability × industry heatmap + analysis paragraph
  5. Industry distribution
  6. Tech stack and OSS posture — with the unknown caveat made explicit
  7. Six company spotlights — verbatim taglines, classification facts, rationale
  8. What we still cannot answer — three open questions
  9. Reproduce this memo — install + run instructions

Layer 2 audit refinements

The W26 memo's first build surfaced two real edge cases the auditor was right to flag:

  • "Winter 2026": the year was being treated as numerical drift. Fixed by extending date-pattern stripping to YC-batch labels and bare 4-digit years flanked by non-digits.
  • "top three industries account for 53 of 113 companies": the sum of 53 isn't in any base counter, but it's a legitimate derived total. Fixed by adding derived_sums (top-3, top-5) to extra_allowed so the auditor verifies these totals rather than rejecting them as drift.

Both adjustments tighten correctness rather than loosen it: the auditor still catches genuinely invented numbers and flags forbidden phrases. The trap-resistance suite from PR #2 still passes unchanged.
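
A minimal sketch of the two refinements, assuming a regex-based auditor. The pattern names and the helpers `strip_dates` and `drifted_numbers` are illustrative, not taken from anti_hallucination.py, and `derived_sums` is written here as a function although the PR only names it as a value. The industry labels are placeholders chosen so the top-3 total is 53 of 113.

```python
import re
from collections import Counter

# Hypothetical patterns; the real ones in anti_hallucination.py may differ.
BATCH_LABEL = re.compile(r"\b(?:Winter|Spring|Summer|Fall)\s+(?:19|20)\d{2}\b")
BARE_YEAR = re.compile(r"(?<!\d)(?:19|20)\d{2}(?!\d)")  # 4 digits flanked by non-digits
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def strip_dates(text: str) -> str:
    """Blank out batch labels and bare years so they are not read as counts."""
    text = BATCH_LABEL.sub(" ", text)
    return BARE_YEAR.sub(" ", text)


def derived_sums(counter: Counter, sizes: tuple[int, ...] = (3, 5)) -> tuple[int, ...]:
    """Top-k totals that the prose may legitimately cite."""
    return tuple(sum(c for _, c in counter.most_common(k)) for k in sizes)


def drifted_numbers(text: str, allowed: set) -> list[float]:
    """Numbers >= 2 that appear in the text but not in the allowlist."""
    found = {float(tok) for tok in NUMBER.findall(strip_dates(text))}
    return sorted(n for n in found if n >= 2 and n not in allowed)


# Placeholder industry counts: 113 companies, top three sum to 53.
industries = Counter({"a": 20, "b": 18, "c": 15, "d": 14, "e": 13, "f": 12, "g": 11, "h": 10})
sentence = "Winter 2026: the top three industries account for 53 of 113 companies."
base = set(industries.values()) | {sum(industries.values())}

print(drifted_numbers(sentence, base))                                   # 53 is flagged
print(drifted_numbers(sentence, base | set(derived_sums(industries))))   # nothing flagged
```

In this sketch the derived totals are recomputed from the same counter the charts use, so a made-up sum would still fail the check.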

Test plan

  • 149 tests passing (4 new for docx)
  • mypy --strict clean
  • make publish-check clean
  • W26 memo captured: 4 chart PNGs embedded, ~47 paragraphs, Layer 2 audit clean

Phase 2 status

| PR  | What |
| --- | --- |
| #11 | Depth=1 website crawl (B007): OSS unknown 55% → 21% |
| #12 | ECharts replaces static CSS bars |
| #14 | VC-style .pptx deck + Layer 2 audit |
| #15 | Narrative .docx memo + dual-output ycai report |

Phase 3 (Chrome extension surface) remains on the v1.0 milestone. The infrastructure for it — the ycai daemon command — is already in v0.1.0; the extension UI hasn't been built yet.

What's next

If you want a v0.2.0 release tag, this is the moment. The Phase 2 surface is feature-complete for both deck and memo, the W26 example artifacts are checked in, and the babysitter routine still pauses cleanly on v0.1.0.

🤖 Generated with Claude Code

…udit

What ships
- src/ycai/reports/docx.py: 9-section narrative memo per USER.md
  document-format discipline. Title, headline, coverage methodology,
  the agentic batch (capability heatmap), industry distribution, tech
  stack + OSS posture, six company spotlights, unanswered questions,
  reproducibility footer. Same analytics.py math as the deck. Same
  Layer 2 audit pre-write.
- src/ycai/reports/anti_hallucination.py: date-pattern stripping
  extended to YC-batch labels ('Winter 2026') and bare 4-digit years
  flanked by non-digits. The drift checker no longer surfaces years
  as numerical drift.
- src/ycai/cli.py: 'ycai report <run-dir>' now produces both deck +
  memo by default. --deck-only / --memo-only constrain.
- pyproject.toml: python-docx>=1.1 dep, mypy override extended to
  ycai.reports.docx (same untyped-import situation as ppt).
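
A sketch of how the dual-output default and the two constraining flags could be wired, assuming argparse. The PR does not say which CLI library cli.py uses, and treating `--deck-only` and `--memo-only` as mutually exclusive is an assumption. `build_parser` and `outputs_for` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="ycai")
    sub = parser.add_subparsers(dest="command", required=True)

    report = sub.add_parser("report", help="build deck and memo from a run directory")
    report.add_argument("run_dir")

    # Assumption: asking for both "only" flags at once is an error.
    only = report.add_mutually_exclusive_group()
    only.add_argument("--deck-only", action="store_true")
    only.add_argument("--memo-only", action="store_true")
    return parser


def outputs_for(args: argparse.Namespace) -> list[str]:
    """Both artifacts by default; each flag suppresses the other artifact."""
    outputs = []
    if not args.memo_only:
        outputs.append("deck.pptx")
    if not args.deck_only:
        outputs.append("report.docx")
    return outputs


if __name__ == "__main__":
    print(outputs_for(build_parser().parse_args(["report", "runs/w26"])))
```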

Tests: 4 new docx tests (149 total). Validates the .docx is a valid
zip with word/document.xml, contains 'coverage' + 'agents' in the
body, embeds >=3 chart images, aborts on a forbidden phrase smuggled
into a company rationale, builds even with empty quote candidates.
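
The structural checks described above can be sketched with the standard library alone, since a .docx file is a zip archive whose main part is word/document.xml and whose embedded images sit under word/media/. `inspect_docx` is a hypothetical helper, not the project's test code, and the archive built here is a stand-in rather than a real memo.

```python
import io
import zipfile


def inspect_docx(data: bytes) -> dict:
    """Report whether the archive has a document part and how many images it embeds."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
        has_document = "word/document.xml" in names
        body = zf.read("word/document.xml").decode("utf-8") if has_document else ""
    images = [n for n in names if n.startswith("word/media/")]
    return {"has_document": has_document, "body": body, "image_count": len(images)}


# Stand-in archive with the same layout as a .docx (not a complete Word file).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/document.xml", "<w:document>coverage agents</w:document>")
    for i in range(3):
        zf.writestr(f"word/media/image{i + 1}.png", b"placeholder")

info = inspect_docx(buf.getvalue())
print(info["has_document"], info["image_count"])
```

Searching the raw XML for a word only works when Word has not split that word across runs, so a real test would more likely read paragraph text through python-docx.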

Real W26 memo captured at examples/output/report-w26-pr15-2026-05-01.docx.
4 chart PNGs (capability heatmap, industry bar, OSS pie, tech stack
bar). ~47 paragraphs. Layer 2 audit clean on the real run.

Phase 2 of the project plan is now shipped: depth=1 crawler (PR #11),
ECharts dashboard (PR #12), .pptx deck (PR #14), .docx memo (PR #15).
Phase 3 (Chrome extension) lives at the v1.0 milestone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@RyanAlberts merged commit 8df8c68 into main on May 2, 2026
3 checks passed
@RyanAlberts deleted the phase-2-pr15-memo branch on May 2, 2026 02:28

@gemini-code-assist (bot) left a comment

Code Review

This pull request implements narrative .docx memo generation, adding a new report builder in src/ycai/reports/docx.py and updating the CLI to support the format. The implementation includes integration with the anti-hallucination audit system and expanded regex for date stripping. Feedback identifies a risk of hallucination from hardcoded fallbacks in the prose and a bug in the audit allowlist that would cause failures due to missing section markers.

Comment thread src/ycai/reports/docx.py
Comment on lines +159 to +160
f"capabilities are {capability.most_common(2)[1][0] if len(capability) > 1 else 'rag'} and "
f"{capability.most_common(3)[2][0] if len(capability) > 2 else 'data-pipeline'}. The heatmap above "

Severity: high

The fallbacks 'rag' and 'data-pipeline' are hardcoded and will be used if the analysis cohort has fewer than 3 distinct capabilities. This introduces a potential hallucination in the report itself, which contradicts the project's anti-hallucination goals. It is better to dynamically construct this sentence based on the available data.

Suggested change
- f"capabilities are {capability.most_common(2)[1][0] if len(capability) > 1 else 'rag'} and "
- f"{capability.most_common(3)[2][0] if len(capability) > 2 else 'data-pipeline'}. The heatmap above "
+ f"capabilities are {', and '.join(name for name, _ in capability.most_common(3)[1:]) if len(capability) > 1 else 'N/A'}. The heatmap above "
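
One way to build the sentence from observed data only, sketched under the assumption that `capability` is a `collections.Counter` (the `most_common` calls suggest this). `runner_up_clause` is a hypothetical helper, and the wording of the clause is illustrative since the rest of the original sentence is not shown in the review.

```python
from collections import Counter


def runner_up_clause(capability: Counter) -> str:
    """Describe the 2nd and 3rd most common capabilities using only observed labels."""
    runners_up = [name for name, _ in capability.most_common(3)[1:]]
    if not runners_up:
        # Fewer than two capabilities: say nothing rather than invent a label.
        return ""
    if len(runners_up) == 1:
        return f"The next most common capability is {runners_up[0]}."
    return f"The next most common capabilities are {runners_up[0]} and {runners_up[1]}."


print(runner_up_clause(Counter({"agents": 5})))
print(runner_up_clause(Counter({"agents": 5, "rag": 3})))
print(runner_up_clause(Counter({"agents": 5, "rag": 3, "search": 2})))
```

Returning an empty string for a thin cohort keeps placeholder labels and "N/A" out of the prose; the caller can drop the clause entirely.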

Comment thread src/ycai/reports/docx.py
sum(c for _, c in capability.most_common(3)),
sum(c for _, c in oss_posture.most_common(3)),
)
infra_facts: tuple[float, ...] = (4.6, 5, 30, 2, 1, *derived_sums)

Severity: high

The infra_facts allowlist contains magic numbers (4.6, 5, 30) that are not present in the .docx prose (unlike the .pptx methodology slide). Conversely, it is missing the number 3, which appears in the 'Unanswered questions' section as a list marker (3.). Since the drift auditor checks numbers >= 2, the absence of 3 in the allowlist will cause the Layer 2 audit to fail if no other data point happens to be exactly 3.

Suggested change
- infra_facts: tuple[float, ...] = (4.6, 5, 30, 2, 1, *derived_sums)
+ infra_facts: tuple[float, ...] = (1, 2, 3, *derived_sums)
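
The reviewer's claim can be illustrated with a toy drift check, assuming the auditor extracts every numeric token and ignores values below 2 as the comment describes. `unlisted_numbers` and the three questions are placeholders, not the project's auditor or the memo's actual text.

```python
import re

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unlisted_numbers(prose: str, allowed: tuple[float, ...], floor: float = 2) -> list[float]:
    """Numeric tokens at or above `floor` that the allowlist does not cover."""
    seen = {float(tok) for tok in NUMBER.findall(prose)}
    return sorted(n for n in seen if n >= floor and n not in allowed)


# Placeholder section: ordered-list markers are plain numbers to a regex.
questions = "1. First open question?\n2. Second open question?\n3. Third open question?"

print(unlisted_numbers(questions, (4.6, 5, 30, 2, 1)))  # the marker 3 is flagged
print(unlisted_numbers(questions, (1, 2, 3)))           # nothing flagged
```

An alternative not proposed in the review would be to strip leading list markers before auditing, which would keep formatting numbers out of the allowlist altogether.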

@RyanAlberts mentioned this pull request on May 2, 2026
RyanAlberts added a commit that referenced this pull request on May 2, 2026
Phase 2 release. Cumulative changes since 0.1.0:
- PR #11: depth=1 polite website crawl, OSS posture unknown rate
  55% -> 21%, identified tech-stack mentions 14 -> 41
- PR #12: Apache ECharts replaces static CSS bars in dashboard
- PR #14: VC-style .pptx deck + anti-hallucination Layer 2
- PR #15: narrative .docx memo + dual-output ycai report

149 tests passing. Mypy strict clean. python -m build produces a
clean 0.2.0 wheel + sdist. CHANGELOG and README updated to reflect
shipped Phase 2 status.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>