feat(phase-1): LLM enrichment with anti-hallucination Layer 1 #7
Closes #2.

What ships
- src/ycai/schemas.py: CompanyAnalysis pydantic model. Industry, AICapability, TechStack, OSSPosture closed-set enums. CrossCheckResult for the two-pass logic.
- src/ycai/classifier.py: deterministic prefilling. Maps yc-oss tag soup to the Industry enum without an LLM, reducing the surface area where the model can hallucinate.
- src/ycai/researcher.py: the analyze() pipeline plus three Backend implementations:
  * AgentSDKBackend (default): claude-agent-sdk against Claude Max subscription, no API cost.
  * AnthropicAPIBackend: pay-per-token via --api-key or ANTHROPIC_API_KEY. Key never logged or written to disk.
  * MockBackend: deterministic test backend.
- src/ycai/cli.py: --enrich flag opts companies through the enrichment pipeline. --enrich-limit caps for smoke runs. Rich progress UI.
- tests/fixtures/hallucination_traps.json: 10 synthetic companies designed to bait the classifier (misleading names, suggestive but vague descriptions, source-URL bait, acronym confusion, etc.).
- tests/test_classifier.py + tests/test_researcher.py: 78 tests total (40 new). Schema enforcement, source-URL guard, two-pass cross-check, trap-resistance, PII redaction-in-prompt verification, API-key no-leakage verification.

Anti-hallucination Layer 1 invariants
1. Pydantic schema rejects empty sources (min_length=1 on the field).
2. Source URLs must originate from the company website or YC profile URL; fabricated citations downgrade the row to low confidence.
3. confidence=medium triggers a second independent pass; disagreement on industry_primary or oss_posture downgrades to low.
4. Any failure returns a sentinel low-confidence row that survives in the CSV but is excluded from charts. No silent drops.
5. PII is stripped from the prompt before the backend sees it (defense-in-depth even though yc-oss/api fields are public).
Live smoke run on 5 W26 companies via subscription
- 4 high / 1 low confidence in 39 seconds, ~free on subscription
- gru.space correctly identified as no-ai (the model is willing to say a YC company is not an AI company)
- velum-labs correctly fell through to low-confidence sentinel when schema validation failed (likely rationale exceeded 400 char limit - tracked as B006)
- All cited sources came from inputs (website + YC profile only)
- Captured at examples/output/analyses-w26-smoke-2026-05-01.json

New backlog
- B006: track schema-validation failure rate, tune prompt if >5%
- B007: add depth=1 website crawl before LLM call to recover tech stack and OSS posture (currently come back as 'unknown' for most companies because the YC long_description doesn't mention them)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
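Invariant 3 in the commit message (the two-pass cross-check) could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in researcher.py: the helper name cross_check, the plain-dict rows, and the sample field values are hypothetical. Only the field names industry_primary, oss_posture, and confidence, plus the downgrade-on-disagreement rule, come from the commit message; the upgrade-on-agreement rule is taken from the PR description.

```python
from typing import Callable

# Fields on which the two passes must agree (named in the commit message).
CROSS_CHECK_FIELDS = ("industry_primary", "oss_posture")


def cross_check(first: dict, second_pass: Callable[[], dict]) -> dict:
    """Return a copy of `first` with confidence adjusted by a second pass.

    Hypothetical helper: rows are plain dicts here, not CompanyAnalysis objects.
    """
    if first["confidence"] != "medium":
        return dict(first)  # only medium-confidence rows get a second pass
    second = second_pass()
    agree = all(first[f] == second[f] for f in CROSS_CHECK_FIELDS)
    return {**first, "confidence": "high" if agree else "low"}


# Sample values below are made up for the example.
row = {"industry_primary": "fintech", "oss_posture": "closed", "confidence": "medium"}
agreed = cross_check(row, lambda: dict(row))
disagreed = cross_check(row, lambda: {**row, "oss_posture": "unknown"})
print(agreed["confidence"], disagreed["confidence"])  # high low
```

Passing the second pass as a callable keeps the extra model call lazy, so rows that are already high or low never trigger it.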
Code Review
This pull request implements Phase 1 PR #2, introducing LLM-based enrichment for company data with a focus on anti-hallucination. It adds a researcher module supporting multiple backends (Claude Agent SDK, Anthropic API, and Mock), a deterministic industry classifier, and updated CLI functionality with an --enrich flag. Feedback focuses on hardening the anti-hallucination guards, specifically addressing a prefix-based URL validation vulnerability and potential crashes from malformed LLM responses. Other suggestions include improving logging for backend failures, replacing magic numbers with constants, and aligning Pydantic schema constraints with the LLM prompt instructions.
    def _looks_like_input_url(url: str, company: RawCompany) -> bool:
        """Defense against hallucinated source URLs: a source must come from the inputs."""
        allowed = [company.website, company.url]
        return any(url.startswith(allowed_url.rstrip("/")) for allowed_url in allowed if allowed_url)
The current implementation of _looks_like_input_url is vulnerable to prefix-based hallucinations. For example, if the allowed website is https://company.com, a source URL like https://company.com.attacker.com would be accepted because it starts with the allowed prefix. This bypasses the anti-hallucination grounding layer.
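The bypass is easy to reproduce. The sketch below contrasts a bare startswith check with a hostname comparison built on urllib.parse. The helper names and example URLs are illustrative assumptions, and the host-based check is only one possible alternative, not the fix proposed in this review.

```python
from urllib.parse import urlsplit


def prefix_check(url: str, base: str) -> bool:
    """The check as currently written: plain string prefix."""
    return url.startswith(base.rstrip("/"))


def same_host_check(url: str, base: str) -> bool:
    """Hypothetical alternative: compare scheme and hostname after parsing."""
    u, b = urlsplit(url), urlsplit(base)
    return u.scheme == b.scheme and u.hostname is not None and u.hostname == b.hostname


base = "https://company.com"
spoofed = "https://company.com.attacker.com/about"
genuine = "https://company.com/about"

print(prefix_check(spoofed, base))     # True: the prefix check is fooled
print(same_host_check(spoofed, base))  # False: hostnames differ
print(same_host_check(genuine, base))  # True
```

A host-only comparison ignores the path, so for a YC profile URL it would accept any page on the same host. A real guard would likely need to combine the host match with a path-prefix rule.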
    def _looks_like_input_url(url: str, company: RawCompany) -> bool:
        """Defense against hallucinated source URLs: a source must come from the inputs."""
        allowed = [company.website, company.url]
        for base in allowed:
            if not base:
                continue
            base = base.rstrip("/")
            # Ensure the URL is exactly the base or a sub-path/query/fragment
            if url == base or any(url.startswith(base + c) for c in ("/", "?", "#")):
                return True
        return False

        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    payload.setdefault("slug", slug)  # the model sometimes drops the slug
The json.loads call can return types other than a dictionary (e.g., a list or a string if the model outputs valid JSON of those types). Calling .setdefault() on a non-dict object will raise an AttributeError, causing the pipeline to crash instead of falling back to the sentinel row.
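The failure mode can be seen in isolation. The snippet below first triggers the AttributeError on a JSON array, then shows a guarded parse. The function parse_payload and the slug value are hypothetical stand-ins for the parsing step in researcher.py.

```python
import json
from typing import Optional

# Valid JSON that is not an object: json.loads returns a list.
payload = json.loads('["not", "an", "object"]')
try:
    payload.setdefault("slug", "acme")
except AttributeError as exc:
    crash = type(exc).__name__  # lists have no setdefault


def parse_payload(text: str, slug: str) -> Optional[dict]:
    """Hypothetical stand-in: return a dict payload, or None for anything unusable."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(parsed, dict):
        return None  # valid JSON, but a list, string, number, or null
    parsed.setdefault("slug", slug)
    return parsed


print(crash)                                              # AttributeError
print(parse_payload('["not", "an", "object"]', "acme"))   # None
print(parse_payload('{"confidence": "low"}', "acme"))
```

Returning None in both branches lets the caller treat malformed and wrongly typed responses the same way, which is what the sentinel-row fallback expects.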
    try:
        payload = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    payload.setdefault("slug", slug)  # the model sometimes drops the slug

            messages=[{"role": "user", "content": prompt}],
        )
    except Exception as exc:
        log.warning("AnthropicAPIBackend failure: %s", exc.__class__.__name__)
Logging only the exception class name (exc.__class__.__name__) makes debugging difficult when a backend fails. It is better to include the full exception message to understand why the request failed (e.g., authentication error vs. rate limit).
    - log.warning("AnthropicAPIBackend failure: %s", exc.__class__.__name__)
    + log.warning("AnthropicAPIBackend failure: %s", exc)
        model: str,
    ) -> list[CompanyAnalysis]:
        """Drive the enrichment pipeline with a Rich progress bar."""
        semaphore = asyncio.Semaphore(8)  # respect subscription rate limits
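The review summary mentions replacing magic numbers with constants, and the semaphore limit in this hunk is one candidate. A minimal sketch follows, assuming a module-level constant; the names MAX_CONCURRENT_REQUESTS and run_bounded are hypothetical, and the sleep stands in for a backend call.

```python
import asyncio

# Hypothetical constant name; 8 is the value used in the hunk.
MAX_CONCURRENT_REQUESTS = 8


async def run_bounded(n_tasks: int, limit: int = MAX_CONCURRENT_REQUESTS) -> int:
    """Run n_tasks dummy jobs and return the peak number in flight at once."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def job() -> None:
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for a backend call
            in_flight -= 1

    await asyncio.gather(*(job() for _ in range(n_tasks)))
    return peak


peak = asyncio.run(run_bounded(20))
print(peak <= MAX_CONCURRENT_REQUESTS)  # True
```

Naming the limit also makes it simple to expose later as a CLI option or environment variable if the subscription rate limits change.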
    oss_posture: OSSPosture
    oss_evidence_url: HttpUrl | None = None
The prompt in researcher.py specifies that oss_evidence_url is mandatory if the posture is not 'closed', 'api-only', or 'unknown'. However, this constraint is not enforced in the Pydantic model. Adding a model validator would ensure that the Layer 1 anti-hallucination layer correctly rejects responses that violate this rule.
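The rule itself is small. Here it is as a plain function, kept free of pydantic so it runs on its own; the function name, the use of plain strings instead of the OSSPosture enum, and the "open-core" value are assumptions for illustration. Only the three exempt postures come from the review comment.

```python
from typing import Optional

# Postures that the prompt exempts from needing evidence (from the review comment).
EVIDENCE_EXEMPT = {"closed", "api-only", "unknown"}


def check_oss_evidence(oss_posture: str, oss_evidence_url: Optional[str]) -> None:
    """Raise ValueError when a posture that needs evidence has no URL.

    Hypothetical helper; postures are plain strings here, not the OSSPosture enum.
    """
    if oss_posture not in EVIDENCE_EXEMPT and not oss_evidence_url:
        raise ValueError(f"oss_evidence_url is required when oss_posture={oss_posture!r}")


check_oss_evidence("unknown", None)                          # passes: exempt posture
check_oss_evidence("open-core", "https://company.com/oss")   # passes: evidence given
try:
    check_oss_evidence("open-core", None)  # "open-core" is an assumed example value
except ValueError as exc:
    print(exc)
```

In pydantic v2 a check like this would typically live in a method decorated with @model_validator(mode="after") on the model, so that a violation surfaces as a ValidationError and the row falls back to the sentinel path.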
CI mypy job didn't have either SDK installed because they were never listed in pyproject.toml deps; they were just available locally. Adding them as runtime deps so CI installs them via pip install -e .[dev]. Both are small wheels and the package needs at least one of them at runtime (anthropic for the API path, claude-agent-sdk for the subscription path), so requiring them is correct, not bloat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
What
Phase 1 PR #2: LLM-driven classification of YC companies on top of PR #1's coverage probe. Three backends (Agent SDK default for subscription, Anthropic API for pay-per-token, Mock for tests) all return validated CompanyAnalysis objects with full Layer 1 anti-hallucination enforcement.

Closes #2.
Why
PR #1 surfaces which companies we can analyze. PR #2 actually classifies them — industry, AI capability, tech stack, OSS posture. The classification has to be cheap, schema-strict, and resistant to model fabrication; otherwise the eventual VC report is no better than vibes.
How
- schemas.CompanyAnalysis: sources field has min_length=1; pydantic rejects empty.
- classifier.map_industry: Industry enum, no LLM.
- researcher.Backend
- researcher.analyze
- cli.run_coverage --enrich: --enrich-limit for smoke runs. --api-key overrides subscription mode. Rich progress UI.

Anti-hallucination Layer 1: what's guaranteed
- CompanyAnalysis rejects any model response missing required fields, with empty sources, or with invalid enum values.
- _validate_sources rejects citations that don't start with company.website or company.url. Fabricated URLs downgrade the row to confidence=low.
- confidence=medium rows trigger a second independent classification pass. Disagreement on industry_primary or oss_posture downgrades to low. Agreement upgrades to high.
- strip_pii runs over name / one_liner / long_description before they reach the model.

Live smoke run on real W26 data
Ran ycai run-coverage --enrich --enrich-limit 5 via subscription. 5 companies enriched in 39 seconds, ~free.

- bidflow
- travo
- gru.space
- autumn-ai
- velum-labs

Every cited source URL came from the inputs (website or YC profile). Zero fabricated citations.
Sample output checked in: examples/output/analyses-w26-smoke-2026-05-01.json.

Test plan
- hallucination_traps.json lands at confidence=low when the backend returns nothing OR when it returns a fabricated source URL.
- --api-key value never appears in repr, public attrs, or any string representation of the backend.
- make publish-check green.
- --strict green.

Backlog spawned
unknown for most companies because the model only sees the YC long_description. A depth=1 website crawl before classification would dramatically improve signal density.

Acceptance
- make validate-p0 green
- make publish-check green

🤖 Generated with Claude Code