Commit 21463eb

RyanAlberts and claude authored
feat(phase-2): VC-style .pptx deck with anti-hallucination Layer 2 (#14)
* feat(phase-2): VC-style PPT deck with anti-hallucination Layer 2

What ships
- src/ycai/analytics.py: pure-Python chart math, single source of truth shared by dashboard.py + reports/ppt.py. Counters, heatmap matrix, headline numbers, spotlight + quote selection. dashboard.py now imports from analytics; no number can drift between HTML and PPT.
- src/ycai/reports/anti_hallucination.py: Layer 2 audit. Forbidden-phrase scanner (studies show, experts say, ...) + numerical-drift check. Date patterns stripped before number extraction. extra_allowed hook for stable infrastructure facts (model version, crawler caps).
- src/ycai/reports/ppt.py: 16-slide a16z-feel deck. Original cream/orange palette, sans/serif typography. matplotlib chart PNGs anchored to the same Counters the dashboard uses. Two prose streams audited separately — aggregate commentary gets the full Layer 2 audit; per-company taglines/rationales get forbidden-phrase only (Layer 1 already gated their source URLs).
- src/ycai/cli.py: new 'ycai report <run-dir>' command produces deck.pptx from existing artifacts at zero LLM cost.

W26 deck (16 slides): captured at examples/output/deck-w26-pr14-2026-05-01.pptx. 4 chart PNGs (industry bar, capability heatmap, tech stack bar, OSS pie). Layer 2 ran clean on the real run.

Tests: 24 new (145 total). Analytics counters, heatmap shape, headline numbers including coverage, forbidden-phrase scanning, numerical drift with tolerance, date stripping, deck zip validity, slide count >= 12, layer-2 audit aborts on forbidden phrase, deck builds even with empty quote list.

Phase 2 PR #15 (memo) is split out and lands next.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(phase-2): satisfy newer ruff RUF046+RUF059 in CI

* fix(phase-2): mypy overrides for python-pptx + matplotlib (no published stubs)

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
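
The commit message describes the Layer 2 audit only in outline, and `src/ycai/reports/anti_hallucination.py` is not shown on this page. The following is a minimal sketch of how such an audit could work, not the module's actual code. The phrase list holds only the two phrases the message names, and the date patterns, the `Finding` record and the `audit` signature are all assumptions made for illustration.

```python
import re
from collections.abc import Iterable, Mapping
from dataclasses import dataclass

# Only these two phrases are named in the commit message; the real list is longer ("...").
FORBIDDEN_PHRASES = ("studies show", "experts say")

# Assumed date shapes: ISO dates and bare four-digit years. The real patterns are not shown.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b|\b(?:19|20)\d{2}\b")
# Standalone numbers only, so the "26" inside a token such as "W26" is ignored.
NUMBER_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\b")


@dataclass(frozen=True)
class Finding:  # hypothetical record type
    kind: str  # "forbidden" or "drift"
    value: str
    excerpt: str


def _excerpt(text: str, start: int, end: int, pad: int = 20) -> str:
    return text[max(0, start - pad) : end + pad]


def audit(
    prose: str,
    allowed: Mapping[str, float],
    extra_allowed: Iterable[float] = (),
    tolerance: float = 0.0,
) -> list[Finding]:
    """Return every forbidden phrase and every number not backed by ``allowed``."""
    findings: list[Finding] = []

    # Pass 1: case-insensitive forbidden-phrase scan.
    lowered = prose.lower()
    for phrase in FORBIDDEN_PHRASES:
        idx = lowered.find(phrase)
        if idx != -1:
            findings.append(Finding("forbidden", phrase, _excerpt(prose, idx, idx + len(phrase))))

    # Pass 2: strip dates first, then compare each remaining number to the allowed set.
    stripped = DATE_PATTERN.sub(" ", prose)
    permitted = [float(v) for v in allowed.values()] + [float(v) for v in extra_allowed]
    for match in NUMBER_PATTERN.finditer(stripped):
        number = float(match.group())
        if not any(abs(number - p) <= tolerance for p in permitted):
            findings.append(Finding("drift", match.group(), _excerpt(stripped, match.start(), match.end())))
    return findings
```

In this sketch `allowed` plays the role of the dict returned by `headline_numbers`, and `extra_allowed` stands in for the hook the message mentions for stable infrastructure facts. A caller would abort the build whenever the returned list is non-empty.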
1 parent 6514f0c commit 21463eb

13 files changed

Lines changed: 1578 additions & 35 deletions

.secrets.baseline

Lines changed: 2 additions & 2 deletions
@@ -142,9 +142,9 @@
         "filename": "src/ycai/dashboard.py",
         "hashed_secret": "3c09e03744a49c6020501c9b7ef6218ad440976e",
         "is_verified": false,
-        "line_number": 46
+        "line_number": 47
       }
     ]
   },
-  "generated_at": "2026-05-01T21:56:30Z"
+  "generated_at": "2026-05-02T02:15:53Z"
 }

CHANGELOG.md

Lines changed: 3 additions & 0 deletions
@@ -15,6 +15,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Changed
 - **PR #12 — Apache ECharts replaces static CSS bars in the dashboard.** Heatmap is now a real 2D heatmap with hover tooltips. Pie charts (confidence, OSS posture) render with proper labeling and click-to-isolate. Bar charts (industry, tech stack, YC tags, regions) get axis pointers and value tooltips. Loaded from CDN with SRI-pinned integrity hash; falls back to a `<noscript>` table if JS is disabled or the CDN is blocked. Each canvas carries `role="img"` + descriptive `aria-label`. All chart options ship as pure JSON in a `<script type="application/json">` block — no JS function strings, no client-side `eval`. 7 chart canvases, 121 tests passing.
 
+### Phase 2 — reports
+- **PR #14 — VC-style `.pptx` deck with anti-hallucination Layer 2.** New `src/ycai/analytics.py` is the single source of chart math, consumed by both the dashboard (ECharts JSON) and the deck (matplotlib PNG). New `src/ycai/reports/ppt.py` builds a 16-slide deck (cream/orange palette, sans/serif typography). Each chart is a matplotlib PNG anchored to the same Counter the dashboard used. `ycai report <run-dir>` produces `deck.pptx` from existing artifacts at zero LLM cost. New `src/ycai/reports/anti_hallucination.py`: forbidden-phrase scan + numerical-drift check + date-pattern stripping. Two prose streams audited separately — aggregate commentary gets full drift check, per-company taglines/rationales get forbidden-phrase only (Layer 1 already gated their source URLs). 24 new tests (145 total).
+
 ## [0.1.0] — 2026-05-01
 
 First publishable release. End-to-end pipeline that pulls the latest YC batch, classifies it with a Sonnet-class model under strict anti-hallucination guards, and renders a single-file HTML dashboard with row-level drill-downs.

examples/README.md

Lines changed: 2 additions & 1 deletion
@@ -4,7 +4,8 @@ Sanitized sample artifacts. Every commit goes through `make publish-check` so PI
 
 | File | What |
 |---|---|
-| [`output/dashboard-w26-pr12-2026-05-01.html`](output/dashboard-w26-pr12-2026-05-01.html) | **PR #12 dashboard — current best.** Same W26 data as PR #11 (depth=1 crawl, 113/124 high-confidence) but charts upgraded to Apache ECharts: real heatmap, pies with click-to-isolate, bars with axis tooltips. Loads via CDN with SRI; `<noscript>` fallback if JS disabled. |
+| [`output/deck-w26-pr14-2026-05-01.pptx`](output/deck-w26-pr14-2026-05-01.pptx) | **PR #14 VC-style deck — current best.** 16 slides, a16z-feel palette, matplotlib chart PNGs anchored to the same data the dashboard uses. Anti-hallucination Layer 2 ran before write (no forbidden phrases, no numerical drift). |
+| [`output/dashboard-w26-pr12-2026-05-01.html`](output/dashboard-w26-pr12-2026-05-01.html) | **PR #12 dashboard — current best HTML.** Same W26 data, ECharts canvases (real heatmap, pies, bars). |
 | [`output/dashboard-w26-pr11-2026-05-01.html`](output/dashboard-w26-pr11-2026-05-01.html) | PR #11 dashboard with the depth=1 crawl but static CSS bars. Useful for comparing visual fidelity vs. PR #12. |
 | [`output/analyses-w26-pr11-2026-05-01.json`](output/analyses-w26-pr11-2026-05-01.json) | Source data for both PR #11 and PR #12 dashboards. 113/124 high-confidence. |
 | [`output/dashboard-w26-pr4-2026-05-01.html`](output/dashboard-w26-pr4-2026-05-01.html) | PR #4 / v0.1.0 dashboard. Useful baseline (no crawl, 65 OSS-unknown rows; static CSS bars). |
271 KB
Binary file not shown.
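
A binary deck cannot be reviewed in a diff, but the commit message lists "deck zip validity" and "slide count >= 12" among the new tests. Those tests are not shown here, so the following is only a sketch of how such a check could be written with the standard library. The helper name `count_slides` is hypothetical. It relies on two facts about the format: a `.pptx` file is a ZIP package containing `[Content_Types].xml`, and slide parts are conventionally named `ppt/slides/slideN.xml`.

```python
import io
import re
import zipfile

# Conventional slide part names; relationship files under _rels/ do not match.
SLIDE_PART = re.compile(r"^ppt/slides/slide\d+\.xml$")


def count_slides(pptx_bytes: bytes) -> int:
    """Validate the ZIP container and count slide parts. Raises on a broken package."""
    # zipfile.BadZipFile propagates if the bytes are not a ZIP archive at all.
    with zipfile.ZipFile(io.BytesIO(pptx_bytes)) as package:
        corrupt = package.testzip()  # name of the first member with a bad CRC, else None
        if corrupt is not None:
            raise ValueError(f"corrupt member in package: {corrupt}")
        names = package.namelist()
    if "[Content_Types].xml" not in names:
        raise ValueError("missing [Content_Types].xml; not an Office package")
    return sum(1 for name in names if SLIDE_PART.match(name))
```

A test could then assert `count_slides(path.read_bytes()) >= 12` for a freshly built deck, without opening PowerPoint or importing python-pptx.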

pyproject.toml

Lines changed: 18 additions & 0 deletions
@@ -23,6 +23,8 @@ dependencies = [
     "rich>=13",
     "anthropic>=0.40",
     "claude-agent-sdk>=0.1",
+    "python-pptx>=1.0",
+    "matplotlib>=3.7",
 ]
 
 [project.optional-dependencies]
@@ -81,6 +83,22 @@ warn_return_any = true
 warn_unreachable = true
 files = ["src/ycai"]
 
+# Third-party libraries without published type stubs. We pin to the public
+# constructor surface and otherwise treat their return types as Any.
+[[tool.mypy.overrides]]
+module = ["pptx.*", "matplotlib.*"]
+ignore_missing_imports = true
+
+# The deck builder is a thin wrapper over python-pptx whose API surface is
+# untyped (Presentation is a factory function, not a class). Keep strict
+# mode for everything else; relax just this module.
+[[tool.mypy.overrides]]
+module = "ycai.reports.ppt"
+disallow_untyped_calls = false
+disallow_untyped_defs = false
+warn_return_any = false
+disable_error_code = ["valid-type", "attr-defined", "no-any-return", "type-arg", "misc", "no-untyped-def"]
+
 [tool.pytest.ini_options]
 testpaths = ["tests"]
 addopts = "-ra --strict-markers --strict-config"

src/ycai/analytics.py

Lines changed: 214 additions & 0 deletions
@@ -0,0 +1,214 @@
+"""Pure-Python chart math, shared by every renderer.
+
+Each function takes the validated CompanyAnalysis cohort plus optional context
+and returns plain Python data structures (Counters, dicts, lists). No HTML, no
+ECharts JSON, no matplotlib. The rendering layers (``dashboard.py`` for HTML,
+``reports/ppt.py`` for the deck, ``reports/docx.py`` for the memo) consume the
+output of this module.
+
+This separation is what lets the deck and memo cite the *same* numbers as the
+dashboard. The numerical-drift check in ``reports/anti_hallucination.py`` walks
+the output of these functions and compares it to the prose in the deck/memo.
+"""
+
+from __future__ import annotations
+
+from collections import Counter, defaultdict
+from collections.abc import Iterable
+from dataclasses import dataclass
+
+from ycai.schemas import (
+    CompanyAnalysis,
+    Industry,
+    RawCompany,
+)
+
+
+@dataclass(frozen=True)
+class CapabilityHeatmap:
+    """Capability x industry matrix for the headline chart."""
+
+    capabilities: list[str]  # row labels (Y axis)
+    industries: list[str]  # col labels (X axis)
+    matrix: dict[tuple[str, str], int]
+    total_keep: int  # rows fed into the heatmap (high+medium confidence)
+
+
+def keep_for_charts(analyses: Iterable[CompanyAnalysis]) -> list[CompanyAnalysis]:
+    """Standard cohort: high + medium confidence only. Low excluded by design.
+
+    Every renderer should pass through this gate. If a chart is computed from
+    something else (e.g. the YC tag distribution doesn't go through enrichment),
+    it lives outside the LLM-derived analytics and uses different math.
+    """
+    return [a for a in analyses if a.confidence in ("high", "medium")]
+
+
+def confidence_breakdown(analyses: Iterable[CompanyAnalysis]) -> Counter[str]:
+    return Counter(a.confidence for a in analyses)
+
+
+def industry_distribution(analyses: Iterable[CompanyAnalysis]) -> Counter[str]:
+    """LLM-classified primary industry counts (high+medium only)."""
+    keep = keep_for_charts(analyses)
+    return Counter(a.industry_primary.value for a in keep)
+
+
+def capability_heatmap(analyses: Iterable[CompanyAnalysis]) -> CapabilityHeatmap:
+    """Capability x industry matrix. Rows are top capabilities by total count;
+    columns are top industries by total count."""
+    keep = keep_for_charts(analyses)
+    matrix: dict[tuple[str, str], int] = defaultdict(int)
+    for a in keep:
+        for cap in a.ai_capability:
+            matrix[(cap.value, a.industry_primary.value)] += 1
+    cap_totals: Counter[str] = Counter()
+    ind_totals: Counter[str] = Counter()
+    for (cap_label, ind_label), v in matrix.items():
+        cap_totals[cap_label] += v
+        ind_totals[ind_label] += v
+    top_caps = [name for name, _ in cap_totals.most_common(8)]
+    top_inds = [name for name, _ in ind_totals.most_common(6)]
+    cap_set = set(top_caps)
+    ind_set = set(top_inds)
+    restricted = {(c, i): v for (c, i), v in matrix.items() if c in cap_set and i in ind_set}
+    return CapabilityHeatmap(
+        capabilities=top_caps,
+        industries=top_inds,
+        matrix=restricted,
+        total_keep=len(keep),
+    )
+
+
+def capability_totals(analyses: Iterable[CompanyAnalysis]) -> Counter[str]:
+    """Flat capability counter (any company contributes to every capability it lists)."""
+    keep = keep_for_charts(analyses)
+    counter: Counter[str] = Counter()
+    for a in keep:
+        for cap in a.ai_capability:
+            counter[cap.value] += 1
+    return counter
+
+
+def tech_stack_distribution(analyses: Iterable[CompanyAnalysis]) -> Counter[str]:
+    """Tech stack counter. Empty stack list → counted under 'unknown' for clarity."""
+    keep = keep_for_charts(analyses)
+    counter: Counter[str] = Counter()
+    for a in keep:
+        if not a.tech_stack:
+            counter["unknown"] += 1
+        else:
+            for stack in a.tech_stack:
+                counter[stack.value] += 1
+    return counter
+
+
+def oss_posture_distribution(analyses: Iterable[CompanyAnalysis]) -> Counter[str]:
+    keep = keep_for_charts(analyses)
+    return Counter(a.oss_posture.value for a in keep)
+
+
+def yc_tag_distribution(companies: Iterable[RawCompany]) -> Counter[str]:
+    counter: Counter[str] = Counter()
+    for c in companies:
+        for tag in c.tags:
+            counter[tag] += 1
+    return counter
+
+
+def region_distribution(companies: Iterable[RawCompany]) -> Counter[str]:
+    counter: Counter[str] = Counter()
+    for c in companies:
+        for region in c.regions:
+            counter[region] += 1
+    return counter
+
+
+def headline_numbers(
+    analyses: Iterable[CompanyAnalysis],
+    coverage: object | None = None,
+) -> dict[str, int]:
+    """Numbers any prose layer is allowed to cite. The drift checker compares
+    every number in deck/memo prose against this dict.
+
+    When ``coverage`` is provided, also includes the coverage stats (tier
+    counts, upstream/official counts) so deck/memo methodology slides can
+    cite them.
+    """
+    keep = keep_for_charts(analyses)
+    by_conf = confidence_breakdown(analyses)
+    cap_counts = capability_totals(analyses)
+    out: dict[str, int] = {
+        "total_analyses": sum(by_conf.values()),
+        "high_confidence": by_conf["high"],
+        "medium_confidence": by_conf["medium"],
+        "low_confidence": by_conf["low"],
+        "cohort_size": len(keep),
+        "agents_count": cap_counts.get("agents", 0),
+        "no_ai_count": cap_counts.get("no-ai", 0),
+        "rag_count": cap_counts.get("rag", 0),
+    }
+    if coverage is not None:
+        out["upstream_count"] = getattr(coverage, "upstream_company_count", 0)
+        official = getattr(coverage, "yc_official_count", None)
+        if official:
+            out["yc_official_count"] = official
+        out["tier_a_count"] = getattr(coverage, "tier_a_count", 0)
+        out["tier_b_count"] = getattr(coverage, "tier_b_count", 0)
+        out["tier_c_count"] = getattr(coverage, "tier_c_count", 0)
+        out["analyzable_count"] = getattr(coverage, "analyzable_count", 0)
+        # Percentages the deck/memo headlines reference.
+        for pct_attr in ("coverage_pct_of_official", "coverage_pct_of_upstream"):
+            value = getattr(coverage, pct_attr, None)
+            if value is not None:
+                out[pct_attr] = round(value)
+    return out
+
+
+def spotlight_companies(analyses: Iterable[CompanyAnalysis], top_n: int = 6) -> list[CompanyAnalysis]:
+    """Pick the ``top_n`` most differentiated high-confidence companies.
+
+    Heuristic:
+    - high confidence required
+    - score = number of distinct AI capabilities + 2 if industry not 'B2B SaaS'
+      (off-the-beaten-path industries are inherently more interesting)
+    - tie-break by tagline length (longer = more substantive)
+    """
+    candidates = [a for a in analyses if a.confidence == "high"]
+
+    def score(a: CompanyAnalysis) -> tuple[int, int]:
+        diversity = len(set(c.value for c in a.ai_capability))
+        off_beat = 0 if a.industry_primary == Industry.B2B_SAAS else 2
+        return (diversity + off_beat, len(a.tagline_rewrite))
+
+    return sorted(candidates, key=score, reverse=True)[:top_n]
+
+
+def quote_candidates(analyses: Iterable[CompanyAnalysis]) -> list[CompanyAnalysis]:
+    """Companies whose tagline_rewrite is suitable for a pull quote.
+
+    The rewrite is short by design (<=140 chars), so we just pick the most
+    striking high-confidence ones. The deck consumes this list and renders the
+    pull-quote slide using ``tagline_rewrite`` verbatim — no second pass through
+    the LLM.
+    """
+    keep = [a for a in analyses if a.confidence == "high" and len(a.tagline_rewrite) >= 30]
+    # Prefer taglines that actually *say something*: avoid the boring ones.
+    return sorted(keep, key=lambda a: (-len(a.tagline_rewrite), a.slug))
+
+
+__all__ = [
+    "CapabilityHeatmap",
+    "capability_heatmap",
+    "capability_totals",
+    "confidence_breakdown",
+    "headline_numbers",
+    "industry_distribution",
+    "keep_for_charts",
+    "oss_posture_distribution",
+    "quote_candidates",
+    "region_distribution",
+    "spotlight_companies",
+    "tech_stack_distribution",
+    "yc_tag_distribution",
+]
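
To make the heatmap math in `capability_heatmap` easier to follow, here is a self-contained sketch of the same counting and top-N restriction. It does not import `ycai.schemas`, which is not shown on this page. The `Row` dataclass and its plain-string fields are hypothetical stand-ins for `CompanyAnalysis` and its enums, and the `heatmap` function name is likewise an assumption.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:  # hypothetical stand-in for CompanyAnalysis
    confidence: str
    industry: str
    capabilities: tuple[str, ...]


def heatmap(rows, top_caps=8, top_inds=6):
    """Count (capability, industry) pairs, then keep only the top rows and columns."""
    # Same cohort gate as keep_for_charts: low-confidence rows never reach a chart.
    keep = [r for r in rows if r.confidence in ("high", "medium")]
    matrix = Counter()
    for r in keep:
        for cap in r.capabilities:
            matrix[(cap, r.industry)] += 1
    # Row and column totals are sums of cell counts, so a company that lists
    # two capabilities adds 2 to its industry total.
    cap_totals, ind_totals = Counter(), Counter()
    for (cap, ind), n in matrix.items():
        cap_totals[cap] += n
        ind_totals[ind] += n
    caps = [c for c, _ in cap_totals.most_common(top_caps)]
    inds = [i for i, _ in ind_totals.most_common(top_inds)]
    cells = {(c, i): n for (c, i), n in matrix.items() if c in caps and i in inds}
    return caps, inds, cells, len(keep)


rows = [
    Row("high", "fintech", ("agents", "rag")),
    Row("medium", "fintech", ("agents",)),
    Row("high", "healthcare", ("agents",)),
    Row("low", "healthcare", ("vision",)),  # excluded by the cohort gate
]
caps, inds, cells, total = heatmap(rows, top_caps=1, top_inds=1)
print(caps, inds, cells, total)
```

Two properties carry over to the function in the diff. The industry ranking is weighted by capability mentions rather than by company count, and a kept company with an empty capability list still counts toward `total_keep` while contributing no cell.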

src/ycai/cli.py

Lines changed: 53 additions & 0 deletions
@@ -115,6 +115,59 @@ def dashboard_cmd(
     console.print(f"[green]✓[/green] wrote dashboard.html → {out}")
 
 
+@app.command("report")
+def report_cmd(
+    run_dir: Path = typer.Argument(..., help="Run directory with coverage.json + analyses.json(l)."),
+    deck_only: bool = typer.Option(False, "--deck-only", help="Skip the .docx memo (Phase 2 PR #15)."),
+) -> None:
+    """Generate the .pptx deck (and .docx memo when shipped) from existing artifacts.
+
+    Anti-hallucination Layer 2 runs before any file is written:
+    forbidden-phrase scan + numerical-drift check. Any violation aborts the
+    build with the offending span so you can fix the prose.
+    """
+    coverage_path = run_dir / "coverage.json"
+    raw_path = run_dir / "raw" / "yc_companies.json"
+    if not coverage_path.exists() or not raw_path.exists():
+        console.print(f"[red]✗ {run_dir} doesn't look like a valid run directory[/red]")
+        raise typer.Exit(2)
+
+    coverage = BatchCoverage.model_validate_json(coverage_path.read_text())
+    companies = [RawCompany.model_validate(c) for c in json.loads(raw_path.read_text())]
+
+    analyses_jsonl = run_dir / "analyses.jsonl"
+    analyses_json = run_dir / "analyses.json"
+    if analyses_jsonl.exists():
+        analyses = [
+            CompanyAnalysis.model_validate_json(line)
+            for line in analyses_jsonl.read_text().splitlines()
+            if line.strip()
+        ]
+    elif analyses_json.exists():
+        analyses = [CompanyAnalysis.model_validate(a) for a in json.loads(analyses_json.read_text())]
+    else:
+        console.print(f"[red]✗ no analyses found in {run_dir}. Run with --enrich first.[/red]")
+        raise typer.Exit(2)
+
+    from ycai.reports.ppt import Layer2Failure, build_deck
+
+    deck_path = run_dir / "deck.pptx"
+    console.print("[cyan]→[/cyan] building deck.pptx (Layer 2 audit before write)…")
+    try:
+        build_deck(coverage, companies, analyses, output_path=deck_path)
+    except Layer2Failure as exc:
+        console.print(f"[red]✗ Layer 2 audit failed:[/red] {exc}")
+        for hit in exc.forbidden[:5]:
+            console.print(f" [red]forbidden phrase '{hit.phrase}':[/red] {hit.excerpt}")
+        for drift in exc.drifts[:5]:
+            console.print(f" [red]numerical drift '{drift.number}':[/red] {drift.excerpt}")
+        raise typer.Exit(5) from exc
+    console.print(f"[green]✓[/green] wrote {deck_path}")
+
+    if not deck_only:
+        console.print("[yellow]⚠ .docx memo lands in PR #15.[/yellow]")
+
+
 @app.command("resume")
 def resume_cmd(
     run_dir: Path = typer.Argument(..., help="Run directory from a previous (partial) enrichment."),
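
The `report` command added in this hunk prefers `analyses.jsonl` and falls back to `analyses.json` when loading the cohort. The sketch below isolates that fallback with the standard library only. It returns plain dicts instead of validated `CompanyAnalysis` models, and the function name `load_analysis_records` is hypothetical, so treat it as an illustration of the control flow rather than the project's code.

```python
import json
from pathlib import Path


def load_analysis_records(run_dir: Path) -> list[dict]:
    """Load analyses from a run directory, preferring JSONL over a single JSON array."""
    jsonl_path = run_dir / "analyses.jsonl"
    json_path = run_dir / "analyses.json"
    if jsonl_path.exists():
        # One JSON object per line; blank lines are skipped, as in report_cmd.
        lines = jsonl_path.read_text(encoding="utf-8").splitlines()
        return [json.loads(line) for line in lines if line.strip()]
    if json_path.exists():
        return json.loads(json_path.read_text(encoding="utf-8"))
    # report_cmd prints an error and exits with code 2 here; the sketch raises instead.
    raise FileNotFoundError(f"no analyses found in {run_dir}")
```

Checking the JSONL file first means that when both files are present, the line-delimited one takes precedence and the JSON array is ignored.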
