RyanAlberts
diff --git a/.secrets.baseline b/.secrets.baseline
Lines changed: 1 addition & 1 deletion
diff --git a/BACKLOG.md b/BACKLOG.md
Lines changed: 2 additions & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 2 additions & 0 deletions
diff --git a/examples/README.md b/examples/README.md
Lines changed: 1 addition & 0 deletions
diff --git a/examples/output/analyses-w26-smoke-2026-05-01.json b/examples/output/analyses-w26-smoke-2026-05-01.json
Lines changed: 115 additions & 0 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 2 additions & 0 deletions
diff --git a/scripts/publish_check.sh b/scripts/publish_check.sh
Lines changed: 1 addition & 1 deletion
diff --git a/scripts/secret_scan.sh b/scripts/secret_scan.sh
Lines changed: 1 addition & 1 deletion
diff --git a/src/ycai/classifier.py b/src/ycai/classifier.py
Lines changed: 78 additions & 0 deletions
@@ -127,5 +127,5 @@
     }
   ],
   "results": {},
-  "generated_at": "2026-05-01T19:00:38Z"
+  "generated_at": "2026-05-01T19:21:27Z"
 }
@@ -19,6 +19,8 @@ Promoted to GitHub issues when an item survives more than one PR. ADRs for non-t
 - [B003] CI annotations report Node 20 actions deprecated (forced to Node 24 from 2026-06-02). Refresh `actions/checkout`, `actions/setup-python`, `gitleaks/gitleaks-action` to Node-24-compatible majors before that date. — surfaced in: phase 0 CI run — proposed: ad-hoc PR before 2026-06-02
 - [B004] Tune `MIN_DESCRIPTION_CHARS` (currently 80). The W26 probe surfaced one borderline drop (`moda`, 57 chars). A small calibration study against borderline rows would let us pick a defensible threshold. — surfaced in: W26 quality probe — proposed: PR #2
 - [B005] Name the missing-from-upstream companies, not just count them. Compare yc-oss slugs to a slug list discovered from `/companies/<slug>` profile pages so the dropped register includes "Acme (in YC W26 but not in yc-oss/api)". — surfaced in: W26 quality probe — proposed: PR #2 or #3
+- [B006] Track the schema-validation failure rate during enrichment as a metric. The W26 smoke run had 1 parse failure in 5 companies (20%): `velum-labs`, likely because its rationale exceeded the 400-char limit. Measure this across the full batch and tune the prompt or schema if the rate exceeds ~5%. — surfaced in: PR #2 smoke — proposed: PR #3
+- [B007] Tech-stack and OSS-posture nearly always come back as `unknown` because the model only sees the YC `long_description`, not the company website. Adding a depth=1 website crawl before the LLM call would let the model identify e.g. "this product is closed-source SaaS" or "uses OpenAI" — significantly improving Tier A signal density. Cost: ~5-10 KB extra context per company. — surfaced in: PR #2 smoke — proposed: PR #3
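The metric proposed in B006 could be computed directly from the enrichment output. The following is a minimal sketch, not the project's implementation: it assumes sentinel rows can be recognised by the `schema-validation-failure` reason that appears in the smoke-run sample, and the names `schema_failure_rate` and `needs_tuning` are hypothetical. Rationales of the non-sentinel rows are elided for brevity.

```python
from __future__ import annotations

# Reason string copied from the sentinel row in the smoke-run sample; using it
# as the failure signal is an assumption of this sketch.
SENTINEL_REASON = "schema-validation-failure"
MAX_FAILURE_RATE = 0.05  # the ~5% tuning threshold proposed in B006


def schema_failure_rate(rows: list[dict]) -> float:
    """Fraction of rows that fell through to the schema-validation sentinel."""
    if not rows:
        return 0.0
    failures = sum(1 for row in rows if SENTINEL_REASON in row.get("rationale", ""))
    return failures / len(rows)


def needs_tuning(rows: list[dict], threshold: float = MAX_FAILURE_RATE) -> bool:
    """True when the observed failure rate exceeds the threshold."""
    return schema_failure_rate(rows) > threshold


# The five slugs from the W26 smoke run; only velum-labs is a sentinel row.
smoke_rows = [
    {"slug": "bidflow", "rationale": "(elided)"},
    {"slug": "travo", "rationale": "(elided)"},
    {"slug": "galactic-resource-utilization-space-inc-gru-space", "rationale": "(elided)"},
    {"slug": "autumn-ai", "rationale": "(elided)"},
    {
        "slug": "velum-labs",
        "rationale": "Auto-generated low-confidence sentinel because: schema-validation-failure",
    },
]

print(f"failure rate: {schema_failure_rate(smoke_rows):.0%}")
print(f"needs tuning: {needs_tuning(smoke_rows)}")
```

With only five rows the rate is very coarse, which is presumably why the item asks for a measurement across the full batch before tuning anything.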
 
 ## Done
 
 
@@ -12,5 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Phase 1 PR #1: yc-oss/api scraper, PII sanitizer, link verifier, coverage probe, single-file dashboard, Typer CLI (`ycai run-coverage`).
 - Coverage metric is the dashboard headline. The dropped register acknowledges every excluded company and the specific reason — no quiet drops.
 - First end-to-end probe on YC W26: 63.3% coverage of the official 196-company batch. Findings in `docs/QUALITY_REPORT_W26.md`.
+- Phase 1 PR #2: LLM-based enrichment with anti-hallucination Layer 1 — pydantic-enforced output schema, source-URL guard against fabricated citations, two-pass cross-check on uncertain rows, sentinel low-confidence row on any failure. Three backends: `AgentSDKBackend` (subscription-default), `AnthropicAPIBackend` (`--api-key`), `MockBackend` (tests). 10 hallucination-trap fixtures locked in as regression tests.
+- W26 enrichment smoke run (5 companies via subscription, 39s, ~free): 4 high / 1 low confidence. Identified `gru.space` as `no-ai` correctly. Schema-validation failure on `velum-labs` correctly fell through to the sentinel — no fabricated analysis served.
 
 [Unreleased]: https://github.com/RyanAlberts/yc-ai-pulse/compare/main...HEAD
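The "sentinel low-confidence row on any failure" behaviour described in the PR #2 entry can be sketched as follows. This is an illustration under stated assumptions rather than the code in this PR: `validate` is a hypothetical plain-Python stand-in for the pydantic schema, the 400-character rationale limit is taken from backlog item B006, and the sentinel's field values mirror the `velum-labs` row in the smoke-run sample.

```python
from __future__ import annotations

MAX_RATIONALE_CHARS = 400  # limit mentioned in B006; assumed to apply to `rationale`
UNVERIFIABLE_SOURCE = "https://github.com/RyanAlberts/yc-ai-pulse#unverifiable"
REQUIRED_KEYS = {"slug", "confidence", "sources", "rationale"}  # illustrative subset


def sentinel_row(slug: str, reason: str) -> dict:
    """Low-confidence placeholder shaped like the velum-labs row in the sample."""
    return {
        "slug": slug,
        "industry_primary": "Unknown",
        "industry_secondary": [],
        "ai_capability": ["unclear"],
        "tech_stack": [],
        "oss_posture": "unknown",
        "oss_evidence_url": None,
        "tagline_rewrite": f"(no analysis: {reason})",
        "confidence": "low",
        "sources": [UNVERIFIABLE_SOURCE],
        "rationale": f"Auto-generated low-confidence sentinel because: {reason}",
    }


def validate(row: dict) -> dict:
    """Hypothetical stand-in for the schema check; raises ValueError on failure."""
    missing = REQUIRED_KEYS - row.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if len(row["rationale"]) > MAX_RATIONALE_CHARS:
        raise ValueError("rationale exceeds the character limit")
    return row


def analyse_or_sentinel(slug: str, raw: dict) -> dict:
    """Return the validated row, or the sentinel if validation fails."""
    try:
        return validate(raw)
    except ValueError:
        return sentinel_row(slug, "schema-validation-failure")


too_long = {
    "slug": "velum-labs",
    "confidence": "high",
    "sources": [],
    "rationale": "x" * (MAX_RATIONALE_CHARS + 1),
}
print(analyse_or_sentinel("velum-labs", too_long)["confidence"])
```

The point of the pattern is that a malformed model response never reaches the dashboard as if it were an analysis; it is replaced wholesale by a row that announces its own unreliability.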
@@ -6,6 +6,7 @@ Sanitized sample artifacts. Every commit goes through `make publish-check` so PI
 |---|---|
 | [`output/dashboard-w26-2026-05-01.html`](output/dashboard-w26-2026-05-01.html) | Phase 1 dashboard for YC W26. Headline: 63.3% coverage of the 196-company batch, with the dropped register naming every excluded company. |
 | [`output/coverage-w26-2026-05-01.json`](output/coverage-w26-2026-05-01.json) | Machine-readable coverage report — what feeds the dashboard. |
+| [`output/analyses-w26-smoke-2026-05-01.json`](output/analyses-w26-smoke-2026-05-01.json) | PR #2 smoke run: 5-company LLM enrichment via Sonnet 4.6 on subscription. Captures the schema-enforced output and demonstrates source-URL grounding (every cited URL is from `website` or YC profile). |
 
 The full quality writeup for W26 is in [`docs/QUALITY_REPORT_W26.md`](../docs/QUALITY_REPORT_W26.md).
 
 
@@ -0,0 +1,115 @@
+[
+  {
+    "slug": "bidflow",
+    "industry_primary": "Real Estate / Construction",
+    "industry_secondary": [
+      "B2B SaaS"
+    ],
+    "ai_capability": [
+      "rag",
+      "nlp-classic",
+      "agents"
+    ],
+    "tech_stack": [
+      "unknown"
+    ],
+    "oss_posture": "unknown",
+    "oss_evidence_url": null,
+    "tagline_rewrite": "AI copilot that automates electrical contractor RFP estimation, cutting bid prep time dramatically.",
+    "confidence": "high",
+    "sources": [
+      "https://www.ycombinator.com/companies/bidflow",
+      "https://usebidflow.com/"
+    ],
+    "rationale": "The description explicitly states the product helps 'electrical contractors do electrical estimates way faster using AI' and targets 'redundant paperwork' in RFP submission, confirming Real Estate/Construction primary with B2B SaaS delivery; YC tags 'AI Assistant' and 'SaaS' corroborate the classification."
+  },
+  {
+    "slug": "travo",
+    "industry_primary": "Real Estate / Construction",
+    "industry_secondary": [
+      "B2B SaaS",
+      "AI Infrastructure"
+    ],
+    "ai_capability": [
+      "rag",
+      "data-pipeline",
+      "agents"
+    ],
+    "tech_stack": [
+      "unknown"
+    ],
+    "oss_posture": "unknown",
+    "oss_evidence_url": null,
+    "tagline_rewrite": "AI-powered real estate data platform for RV parks and niche asset classes \u2014 comps, ownership, zoning, and financials in one place.",
+    "confidence": "high",
+    "sources": [
+      "https://www.ycombinator.com/companies/travo",
+      "https://www.travoai.com/"
+    ],
+    "rationale": "The description explicitly states they 'use AI to collect and analyze real estate data' and 'build the best informed AI applications for real estate private equity firms, developers, and brokers,' confirming a Real Estate / Construction primary with AI-driven data pipeline and RAG-style retrieval capabilities. No specific model or framework is mentioned, so tech_stack is unknown."
+  },
+  {
+    "slug": "galactic-resource-utilization-space-inc-gru-space",
+    "industry_primary": "Industrials",
+    "industry_secondary": [
+      "Real Estate / Construction",
+      "Consumer",
+      "Government / Defense"
+    ],
+    "ai_capability": [
+      "no-ai"
+    ],
+    "tech_stack": [],
+    "oss_posture": "unknown",
+    "oss_evidence_url": null,
+    "tagline_rewrite": "In-situ resource utilization to build pressurized lunar habitats, starting with a Moon hotel opening 2032.",
+    "confidence": "high",
+    "sources": [
+      "https://www.ycombinator.com/companies/galactic-resource-utilization-space-inc-gru-space",
+      "https://gru.space/"
+    ],
+    "rationale": "The description explicitly focuses on off-planet habitat construction using 'in-situ resource utilization technology, turning local material into building material,' with a roadmap through lunar and Martian infrastructure \u2014 squarely Industrials/Aviation and Space. No AI capabilities or tech stack are mentioned anywhere in the provided text."
+  },
+  {
+    "slug": "autumn-ai",
+    "industry_primary": "B2B SaaS",
+    "industry_secondary": [
+      "AI Infrastructure",
+      "Developer Tools"
+    ],
+    "ai_capability": [
+      "rag",
+      "data-pipeline",
+      "nlp-classic",
+      "agents"
+    ],
+    "tech_stack": [
+      "unknown"
+    ],
+    "oss_posture": "unknown",
+    "oss_evidence_url": null,
+    "tagline_rewrite": "Real-time buying signal intelligence platform that monitors web activity to surface high-intent prospects for GTM teams.",
+    "confidence": "high",
+    "sources": [
+      "https://www.ycombinator.com/companies/autumn-ai"
+    ],
+    "rationale": "The description explicitly states Autumn is 'building the first real-time signal intelligence platform for GTM teams' that 'monitors posts, commits, blogs, and announcements, surfacing buying signals,' indicating a B2B SaaS product with AI-driven data pipeline and NLP capabilities; no specific model providers or OSS artifacts are mentioned."
+  },
+  {
+    "slug": "velum-labs",
+    "industry_primary": "Unknown",
+    "industry_secondary": [],
+    "ai_capability": [
+      "unclear"
+    ],
+    "tech_stack": [],
+    "oss_posture": "unknown",
+    "oss_evidence_url": null,
+    "tagline_rewrite": "(no analysis: schema-validation-failure)",
+    "confidence": "low",
+    "sources": [
+      "https://github.com/RyanAlberts/yc-ai-pulse#unverifiable"
+    ],
+    "rationale": "Auto-generated low-confidence sentinel because: schema-validation-failure"
+  }
+]
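The source-URL grounding that this sample is meant to demonstrate (every cited URL is the company `website` or its YC profile) can be checked with a few lines of Python. The sketch below is an assumption about how such a guard might look, not the guard shipped in PR #2: `ungrounded_sources` and `_normalise` are hypothetical names, and the website is passed in separately because the analysis rows do not carry it.

```python
from __future__ import annotations

from urllib.parse import urlsplit

YC_PROFILE_PREFIX = "https://www.ycombinator.com/companies/"


def _normalise(url: str) -> str:
    """Lower-case scheme and host, drop trailing slash, query, and fragment."""
    parts = urlsplit(url.strip())
    return f"{parts.scheme.lower()}://{parts.netloc.lower()}{parts.path.rstrip('/')}"


def ungrounded_sources(row: dict, website: str | None) -> list[str]:
    """Cited URLs that are neither the YC profile nor the company website."""
    allowed = {_normalise(YC_PROFILE_PREFIX + row["slug"])}
    if website:
        allowed.add(_normalise(website))
    return [url for url in row["sources"] if _normalise(url) not in allowed]


# Sources copied from the bidflow row above.
bidflow = {
    "slug": "bidflow",
    "sources": [
        "https://www.ycombinator.com/companies/bidflow",
        "https://usebidflow.com/",
    ],
}
print(ungrounded_sources(bidflow, "https://usebidflow.com"))
```

Note that the sentinel row's placeholder source would be reported by this check, so a real guard would presumably treat sentinel rows as a special case.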
@@ -21,6 +21,8 @@ dependencies = [
   "pydantic>=2.5",
   "typer>=0.12",
   "rich>=13",
+  "anthropic>=0.40",
+  "claude-agent-sdk>=0.1",
 ]
 
 [project.optional-dependencies]
 
@@ -26,7 +26,7 @@ fi
 #   - scripts/{secret_scan,publish_check}.sh (these files name the patterns)
 #   - tests/test_sanitizer.py (test fixtures must contain the patterns)
 SUSPICIOUS=$(git ls-files \
-  | grep -v -E '^(\.secrets\.baseline|scripts/secret_scan\.sh|scripts/publish_check\.sh|tests/test_sanitizer\.py)$' \
+  | grep -v -E '^(\.secrets\.baseline|scripts/secret_scan\.sh|scripts/publish_check\.sh|tests/test_sanitizer\.py|tests/test_researcher\.py)$' \
   | xargs grep -l -E -i 'sk-ant-[A-Za-z0-9_\-]{20,}|ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16}' 2>/dev/null || true)
 if [[ -n "$SUSPICIOUS" ]]; then
   echo "❌ suspicious credential strings found in:"
 
@@ -36,7 +36,7 @@ else
     # - tests/test_sanitizer.py (test fixtures must contain the patterns
     #   they're testing redaction of). Reviewed manually — these are fake values.
     HITS=$(echo "$FILES" \
-      | grep -v -E '^(\.secrets\.baseline|scripts/secret_scan\.sh|tests/test_sanitizer\.py)$' \
+      | grep -v -E '^(\.secrets\.baseline|scripts/secret_scan\.sh|tests/test_sanitizer\.py|tests/test_researcher\.py)$' \
       | xargs grep -E -l "$pattern" 2>/dev/null || true)
     if [[ -n "$HITS" ]]; then
       echo "❌ pattern matched: $pattern"
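For readers less familiar with the `grep -v` / `xargs grep` pipeline, the same allowlist-then-scan logic can be expressed in Python. This is a rough sketch for illustration only: it mirrors the exclusion list and credential pattern from the `publish_check.sh` hunk, takes file contents as an in-memory mapping instead of calling `git ls-files`, and `suspicious_files` is a hypothetical name.

```python
from __future__ import annotations

import re

# Files allowed to contain credential-shaped strings (pattern definitions and
# test fixtures), copied from the updated publish_check.sh exclusion list.
EXCLUDED = re.compile(
    r"^(\.secrets\.baseline|scripts/secret_scan\.sh|scripts/publish_check\.sh"
    r"|tests/test_sanitizer\.py|tests/test_researcher\.py)$"
)

# Same alternation as the shell script; IGNORECASE corresponds to grep's -i.
CREDENTIAL = re.compile(
    r"sk-ant-[A-Za-z0-9_\-]{20,}|ghp_[A-Za-z0-9]{36}|AKIA[0-9A-Z]{16}",
    re.IGNORECASE,
)


def suspicious_files(files: dict[str, str]) -> list[str]:
    """Paths (sorted) that are not excluded and contain a credential-shaped string."""
    return sorted(
        path
        for path, text in files.items()
        if not EXCLUDED.match(path) and CREDENTIAL.search(text)
    )


# A fake, obviously non-functional key assembled at runtime for the demo.
fake_key = "AKIA" + "A" * 16
tracked = {
    "tests/test_researcher.py": f"FIXTURE = '{fake_key}'",  # excluded fixture file
    "src/ycai/leak.py": f"KEY = '{fake_key}'",               # should be flagged
    "README.md": "no secrets here",
}
print(suspicious_files(tracked))
```

Each new fixture file that deliberately contains fake credentials has to be added to the exclusion list in both scripts, which is what these two hunks do for `tests/test_researcher.py`.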
 
@@ -0,0 +1,78 @@
+"""Taxonomies + deterministic prefilling.
+
+Where we can answer a classification question from yc-oss/api fields alone
+(without an LLM), we do. This:
+  1. saves Sonnet calls,
+  2. produces a deterministic answer auditors can re-derive,
+  3. reduces the surface area where the model can hallucinate.
+
+The LLM still classifies AI capability, tech stack, OSS posture, and the
+tagline — fields that can't be derived from YC's tag list.
+"""
+
+from __future__ import annotations
+
+from ycai.schemas import Industry
+
+# yc-oss industry / subindustry / tag substrings -> our enum.
+# Ordered most-specific first; first match wins.
+_INDUSTRY_RULES: tuple[tuple[str, Industry], ...] = (
+    ("ai infrastructure", Industry.AI_INFRASTRUCTURE),
+    ("developer tools", Industry.DEVELOPER_TOOLS),
+    ("dev tools", Industry.DEVELOPER_TOOLS),
+    ("security", Industry.SECURITY),
+    ("biotech", Industry.BIOTECH),
+    ("healthcare", Industry.HEALTHCARE),
+    ("medical", Industry.HEALTHCARE),
+    ("fintech", Industry.FINTECH),
+    ("financial", Industry.FINTECH),
+    ("legal", Industry.LEGAL),
+    ("education", Industry.EDUCATION),
+    ("real estate", Industry.REAL_ESTATE_CONSTRUCTION),
+    ("construction", Industry.REAL_ESTATE_CONSTRUCTION),
+    ("logistics", Industry.SUPPLY_CHAIN_LOGISTICS),
+    ("supply chain", Industry.SUPPLY_CHAIN_LOGISTICS),
+    ("climate", Industry.CLIMATE_ENERGY),
+    ("energy", Industry.CLIMATE_ENERGY),
+    ("robotics", Industry.ROBOTICS),
+    ("hardware", Industry.HARDWARE),
+    ("industrials", Industry.INDUSTRIALS),
+    ("government", Industry.GOVERNMENT_DEFENSE),
+    ("defense", Industry.GOVERNMENT_DEFENSE),
+    ("media", Industry.MEDIA_CONTENT),
+    ("content", Industry.MEDIA_CONTENT),
+    ("consumer", Industry.CONSUMER),
+    ("b2b", Industry.B2B_SAAS),
+    ("saas", Industry.B2B_SAAS),
+)
+
+
+def map_industry(yc_industry: str, yc_subindustry: str = "", yc_tags: list[str] | None = None) -> Industry:
+    """Map a yc-oss industry/subindustry/tags hint into our enum.
+
+    Returns ``Industry.UNKNOWN`` only if absolutely nothing matches — the LLM
+    can override our guess if it has a stronger signal from the website.
+    """
+    haystack = " ".join(
+        [yc_industry or "", yc_subindustry or "", " ".join(yc_tags or [])],
+    ).lower()
+    for needle, industry in _INDUSTRY_RULES:
+        if needle in haystack:
+            return industry
+    return Industry.UNKNOWN
+
+
+def industry_secondaries(yc_industry: str, yc_subindustry: str = "", yc_tags: list[str] | None = None) -> list[Industry]:
+    """Extra industry hits beyond the primary, from the same haystack.
+
+    Caps at 3 to keep the chart legible.
+    """
+    haystack = " ".join([yc_industry or "", yc_subindustry or "", " ".join(yc_tags or [])]).lower()
+    seen: list[Industry] = []
+    for needle, industry in _INDUSTRY_RULES:
+        if needle in haystack and industry not in seen:
+            seen.append(industry)
+    return seen[1:4]  # skip the primary (index 0), take next 3
+
+
+__all__ = ["industry_secondaries", "map_industry"]
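The "first match wins" ordering is easiest to see on a concrete input. The sketch below is self-contained and therefore uses stand-ins: the `Industry` enum here is a reduced, hypothetical version of `ycai.schemas.Industry` (which this diff does not show), `RULES` is a subset of `_INDUSTRY_RULES`, and the input values are made up for illustration.

```python
from __future__ import annotations

from enum import Enum


class Industry(Enum):
    # Stand-in for ycai.schemas.Industry; values follow the sample JSON output.
    AI_INFRASTRUCTURE = "AI Infrastructure"
    REAL_ESTATE_CONSTRUCTION = "Real Estate / Construction"
    B2B_SAAS = "B2B SaaS"
    UNKNOWN = "Unknown"


# Subset of the rule table, in the same most-specific-first order.
RULES: tuple[tuple[str, Industry], ...] = (
    ("ai infrastructure", Industry.AI_INFRASTRUCTURE),
    ("real estate", Industry.REAL_ESTATE_CONSTRUCTION),
    ("construction", Industry.REAL_ESTATE_CONSTRUCTION),
    ("b2b", Industry.B2B_SAAS),
    ("saas", Industry.B2B_SAAS),
)


def ordered_hits(industry: str, subindustry: str = "", tags: list[str] | None = None) -> list[Industry]:
    """All distinct industries whose needle occurs in the haystack, in rule order."""
    haystack = " ".join([industry or "", subindustry or "", " ".join(tags or [])]).lower()
    hits: list[Industry] = []
    for needle, candidate in RULES:
        if needle in haystack and candidate not in hits:
            hits.append(candidate)
    return hits


# Hypothetical yc-oss fields: "construction" outranks "b2b" because its rule
# appears earlier in the table, not because it appears earlier in the input.
hits = ordered_hits("B2B", "Construction", ["SaaS", "AI Assistant"])
primary = hits[0] if hits else Industry.UNKNOWN
secondaries = hits[1:4]
print(primary, secondaries)
```

Because matching is by substring, a needle also fires inside longer words (for example, `"legal" in "illegal"` is true in Python), so rule order and needle choice both affect the result.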