Skip to content

feat(phase-1): enriched dashboard + cited-URL publish gate #8

Merged
RyanAlberts merged 1 commit into main from phase-1-pr3-enriched-dashboard on May 1, 2026

Conversation

@RyanAlberts
Owner

What

Phase 1 PR #3: dashboard now consumes the analyses.json produced by PR #2 and renders four LLM-derived charts on top of the coverage baseline. Cited-URL link verification becomes a hard publish gate.

Closes #3.

Why

PR #1 surfaced which companies we can analyze; PR #2 classifies them; PR #3 makes that classification visible without losing the credibility lever. Every chart still drills down to source rows. Every cited URL is re-verified before the dashboard ships.

How

| Module | Change |
|---|---|
| dashboard.py | New analyses= arg. When provided, renders 4 LLM-derived charts. Heatmap is a CSS-only intensity table (capability × industry). OSS posture uses a stacked bar with green-to-red mapping. Coverage-only mode unchanged. |
| dashboard.collect_cited_urls / write_broken_links_report | Publish-gate plumbing. Dedupes URLs across analyses; attributes each broken URL back to the slugs that cited it. |
| cli.py | After enrichment, every cited URL goes through verifier.check_urls. Dead URLs → exit code 4 unless --allow-dead-links, which writes BROKEN_LINKS.md and surfaces a banner. |
| researcher.py | _drop_unknown_industries: filter industry_secondary to enum members rather than fail the whole row when the model emits a reasonable-but-out-of-set category. Primary industry stays strict. |
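
A minimal sketch of the dedupe-and-attribute step described for the publish-gate plumbing, assuming each analysis exposes a slug and a list of source URLs. The `Analysis` dataclass, its field names, and both function names are illustrative stand-ins, not the actual signatures in dashboard.py.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Analysis:
    # Hypothetical stand-in for the real analysis row.
    slug: str
    source_urls: list[str] = field(default_factory=list)


def collect_urls_by_slug(analyses: list[Analysis]) -> dict[str, list[str]]:
    """Map each distinct cited URL to the slugs that cited it."""
    cited: dict[str, list[str]] = defaultdict(list)
    for analysis in analyses:
        for url in analysis.source_urls:
            # Each URL is stored once; each citing slug is listed once.
            if analysis.slug not in cited[url]:
                cited[url].append(analysis.slug)
    return dict(cited)


def broken_links_lines(cited: dict[str, list[str]], dead: set[str]) -> list[str]:
    """Render one report line per dead URL, naming the citing slugs."""
    return [
        f"- {url} (cited by: {', '.join(cited[url])})"
        for url in sorted(dead)
        if url in cited
    ]
```

Keying the map by URL means each URL only needs to be verified once, while the slug list keeps enough context to say which analyses a dead link affects.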

W26 full-batch run (the real test)

Ran ycai run-coverage --batch winter-2026 --yc-official-count 196 --enrich --allow-dead-links on subscription. 6 minutes, ~free.

Confidence

  • 83 high (67%) + 41 low (33%) out of 124 Tier A+B companies.
  • Of the 41 low: 29 schema-validation failures, 12 honest model lows, 0 hallucinated source URLs (the guard exists; nothing tripped it on this run).
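
The percentages follow directly from the reported counts. A quick consistency check, using only numbers stated in this PR:

```python
tier_ab = 124
high, low = 83, 41
schema_failures, model_lows = 29, 12

# The buckets should partition the batch and the low-confidence group.
assert high + low == tier_ab
assert schema_failures + model_lows == low


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print(pct(high, tier_ab), pct(low, tier_ab))  # 67 33
print(pct(schema_failures, tier_ab))          # 23
```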

Top finding

65% of high-confidence W26 companies build agents (54 of 83). Followed by nlp-classic (30), rag (26), data-pipeline (19). 8 companies correctly tagged no-ai — the trust signal.

OSS posture

Mostly unknown (45 of 83). Model is honest about not being able to determine OSS posture from a YC long_description. B007 (depth=1 website crawl) would shift these to real values; tracked.

Publish gate

3 cited URLs failed verification at publish time:

  • arzule.com — 429
  • maywoodai.com — 404
  • caretta.so — SSL handshake failure

Each named in BROKEN_LINKS-w26-2026-05-01.md. Without --allow-dead-links, the pipeline would have aborted with exit 4 — that's the gate.
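
The gate decision can be sketched roughly as follows. Exit code 4 and the sidecar report come from this PR; the `publish_gate` helper, its parameters, and the assumption that the override path exits 0 are hypothetical and may not match cli.py.

```python
from pathlib import Path

EXIT_OK = 0
EXIT_DEAD_LINKS = 4  # exit code named in this PR


def publish_gate(dead_urls: list[str], allow_dead_links: bool, report_path: Path) -> int:
    """Return the exit code for the publish step (hypothetical helper)."""
    if not dead_urls:
        return EXIT_OK
    if not allow_dead_links:
        # Hard gate: nothing is written and the caller aborts with exit 4.
        return EXIT_DEAD_LINKS
    # Override: record every dead URL in a sidecar report and continue.
    lines = ["# Broken links", ""] + [f"- {url}" for url in sorted(dead_urls)]
    report_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return EXIT_OK
```

Returning the code instead of calling `sys.exit` inside the helper keeps the decision testable without spawning a subprocess.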

Test plan

  • 92 tests passing (14 new). Coverage-only and enriched dashboard rendering, all OSS posture values, publish-gate banner behavior, drill-down presence, dropped register completeness.
  • Mypy --strict clean.
  • make publish-check clean.
  • Real-data smoke captured under examples/output/.

Anti-hallucination invariants preserved

  1. Numbers come from pandas, not the LLM. Every chart is computed from the validated DataFrame.
  2. Drill-down on every chart traces back to the rows that produced it.
  3. Cited URLs are verified before publish.
  4. Low-confidence rows excluded from charts; surfaced in methodology footer.
  5. industry_primary, ai_capability, tech_stack, oss_posture enums stay strict. Only industry_secondary got lenient parsing.
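
Invariants 2 and 4 can be illustrated with a small sketch. Plain Python dictionaries stand in for the validated pandas DataFrame here, and the field names and the `drill_down_html` helper are assumptions for illustration, not the code in dashboard.py.

```python
import html
from collections import defaultdict

# Hypothetical validated rows; the field names are assumptions.
rows = [
    {"slug": "acme", "confidence": "high", "ai_capability": "agents"},
    {"slug": "bolt", "confidence": "high", "ai_capability": "rag"},
    {"slug": "cora", "confidence": "low", "ai_capability": "agents"},
    {"slug": "dune", "confidence": "high", "ai_capability": "agents"},
]


def drill_down_html(rows: list[dict[str, str]], column: str) -> str:
    """Render one <details> block per category, listing the rows behind it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for row in rows:
        if row["confidence"] == "low":
            continue  # invariant 4: low-confidence rows never reach a chart
        groups[row[column]].append(row["slug"])
    blocks = []
    # Largest category first, ties broken alphabetically.
    for category in sorted(groups, key=lambda c: (-len(groups[c]), c)):
        slugs = groups[category]
        items = "".join(f"<li>{html.escape(s)}</li>" for s in slugs)
        blocks.append(
            f"<details><summary>{html.escape(category)} ({len(slugs)})</summary>"
            f"<ul>{items}</ul></details>"
        )
    return "\n".join(blocks)


print(drill_down_html(rows, "ai_capability"))
```

Because the count in each summary is the length of the same list that is rendered beneath it, the displayed number and its drill-down cannot disagree.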

Backlog spawned

Acceptance

  • make validate-p0 green
  • make publish-check green
  • LLM-path invariants preserved
  • Live smoke run captured

🤖 Generated with Claude Code

Closes #3.

What ships
- src/ycai/dashboard.py: rewritten to take optional analyses=. With
  analyses, renders 4 LLM-derived charts in addition to the YC
  baseline:
    - confidence breakdown (high/medium/low stacked bar)
    - LLM industry distribution (excludes low-confidence rows)
    - AI capability x industry heatmap
    - tech stack signals
    - OSS posture stacked bar with green-to-red color mapping
  Each chart has a row-level drill-down via <details>.
- src/ycai/dashboard.py:collect_cited_urls + write_broken_links_report:
  the publish-gate plumbing.
- src/ycai/cli.py: after enrichment, every URL cited in any analysis
  is HEAD/GET-verified. If any return 4xx/5xx the dashboard is not
  written and exit code 4 is returned. --allow-dead-links overrides
  to write the dashboard with a loud banner plus a sidecar
  BROKEN_LINKS.md naming each dead URL and the slugs that cited it.
- src/ycai/researcher.py:_drop_unknown_industries: lenient parsing
  for industry_secondary only — primary stays strict. Models
  occasionally emit reasonable-but-out-of-set categories like
  'Productivity'; we drop those rather than failing the whole row.
- tests/test_dashboard.py: 14 new tests covering both modes,
  coverage banner, dropped register, drill-downs, every OSS posture
  value, and the publish-gate flow.
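
A minimal sketch of the lenient secondary-industry filter, assuming it runs on the raw model output before strict schema validation. The `Industry` members and the function signature are hypothetical; only the behaviour (drop unknown secondary values, leave primary alone) comes from this commit.

```python
from enum import Enum


class Industry(str, Enum):
    # Hypothetical subset; the real enum lives in the project schema.
    FINTECH = "Fintech"
    HEALTHCARE = "Healthcare"
    DEVTOOLS = "Developer Tools"


_KNOWN = {member.value for member in Industry}


def drop_unknown_industries(raw: dict) -> dict:
    """Return a copy of raw with out-of-set secondary industries removed.

    industry_primary is left untouched, so strict validation can still
    reject a row whose primary industry is not an enum member.
    """
    cleaned = dict(raw)
    secondary = raw.get("industry_secondary") or []
    cleaned["industry_secondary"] = [v for v in secondary if v in _KNOWN]
    return cleaned
```

For example, a row with `industry_secondary` of `["Productivity", "Healthcare"]` would keep only `["Healthcare"]` under this sketch's enum and still go on to validation.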

Live full-batch run on W26 (124 companies, ~6 min on subscription)
- 83 high / 41 low confidence (67%/33%)
- of low: 29 schema-validation failures, 12 honest model lows, 0
  hallucinated source URLs (the source-URL guard tripped zero times on
  this run, so the correctness invariant holds)
- top capability is agents at 54 (65% of high-confidence rows)
- 8 companies correctly classified no-ai despite being in YC
- OSS posture mostly 'unknown' (45) — model honestly admits gap
  rather than guessing. Predicted by B007 in BACKLOG.
- 3 cited URLs were dead at publish time; surfaced in
  BROKEN_LINKS-w26-2026-05-01.md, dashboard rendered with banner
  via --allow-dead-links

Sample artifacts checked in:
- examples/output/dashboard-w26-enriched-2026-05-01.html
- examples/output/analyses-w26-full-2026-05-01.json
- examples/output/BROKEN_LINKS-w26-2026-05-01.md

Quality findings written up in docs/QUALITY_REPORT_W26.md.
B008 added to BACKLOG: cut the schema-validation failure rate from 23% via
either lenient ai_capability/tech_stack parsing or tool_use schema
enforcement on the API backend.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@RyanAlberts RyanAlberts merged commit e78ab9f into main May 1, 2026
3 checks passed
@RyanAlberts RyanAlberts deleted the phase-1-pr3-enriched-dashboard branch May 1, 2026 20:53

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an enriched dashboard for the YC AI Pulse tool, adding AI-capability heatmaps, tech-stack distributions, and OSS-posture charts. It implements a "publish gate" that verifies cited URLs, preventing the generation of dashboards with dead links unless overridden by a new --allow-dead-links flag. Additionally, the PR includes detailed quality reports for the Winter 2026 batch, a lenient parser for secondary industries to reduce validation failures, and comprehensive tests for the new dashboard features. Feedback focuses on consolidating imports, utilizing the standard library's HTML escaping, and improving the naming of internal helper functions.

Comment thread src/ycai/cli.py
Comment on lines +20 to +26
from ycai.dashboard import (
    collect_cited_urls,
    write_broken_links_report,
)
from ycai.dashboard import (
    render as render_dashboard,
)

medium

The imports from ycai.dashboard are redundant and can be consolidated into a single block for better readability and maintainability.

from ycai.dashboard import (
    collect_cited_urls,
    render as render_dashboard,
    write_broken_links_report,
)

Comment thread src/ycai/dashboard.py

from __future__ import annotations

import json

medium

Add import html to support a more robust escaping mechanism.

Suggested change
- import json
+ import html
+ import json

Comment thread src/ycai/dashboard.py
Comment on lines +38 to +39
def _escape(text: str) -> str:
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;").replace('"', "&quot;")

medium

The manual implementation of _escape is incomplete (e.g., it misses single quotes) and redundant. Using html.escape from the standard library is safer and more idiomatic.

Suggested change
- def _escape(text: str) -> str:
-     return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;").replace('"', "&quot;")
+ def _escape(text: str) -> str:
+     return html.escape(text, quote=True)
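
To make the difference concrete, here is a small comparison of the hand-rolled escape against the standard library on a string containing single quotes. `manual_escape` reproduces the logic under review; the sample string is made up for illustration.

```python
import html


def manual_escape(text: str) -> str:
    # The hand-rolled version under review: single quotes pass through.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
    )


sample = "<a title='x' href=\"y\">R&D</a>"
print(manual_escape(sample))
# &lt;a title='x' href=&quot;y&quot;&gt;R&amp;D&lt;/a&gt;
print(html.escape(sample, quote=True))
# &lt;a title=&#x27;x&#x27; href=&quot;y&quot;&gt;R&amp;D&lt;/a&gt;
```

The unescaped single quote only matters when the value lands inside a single-quoted HTML attribute, but `html.escape` covers that case at no extra cost.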

Comment thread src/ycai/dashboard.py
f"LLM-derived charts use {n_high} high + {n_med} medium-confidence companies "
f"({n_low} low-confidence rows excluded). Each company sent to "
f"{_first_url(analyses)} via the configured backend with a strict pydantic "
"schema; sources must be from the company's website or YC profile."

medium

The function _first_url is poorly named and misleading. It should be renamed to reflect its purpose of providing a model description.

Suggested change
- f"{_first_url(analyses)} via the configured backend with a strict pydantic "
+ f"{_get_model_description()} via the configured backend with a strict pydantic "

Comment thread src/ycai/dashboard.py
Comment on lines +558 to +564
def _first_url(analyses: list[CompanyAnalysis]) -> str:
    """Best-effort: return the model identifier from the first analysis if available.

    Used in methodology text. Currently the schema doesn't carry the model
    name on each row (it's a per-run constant), so we just say 'a Sonnet model'.
    """
    return "a Sonnet model"

medium

The function name _first_url is misleading as it returns a model description, not a URL. Additionally, it accepts an analyses parameter that is completely ignored. Renaming it and removing the unused parameter improves clarity.

Suggested change
- def _first_url(analyses: list[CompanyAnalysis]) -> str:
-     """Best-effort: return the model identifier from the first analysis if available.
-
-     Used in methodology text. Currently the schema doesn't carry the model
-     name on each row (it's a per-run constant), so we just say 'a Sonnet model'.
-     """
-     return "a Sonnet model"
+ def _get_model_description() -> str:
+     """Return the model identifier used for enrichment.
+
+     Currently hardcoded as the schema doesn't carry the model name per row.
+     """
+     return "a Sonnet model"

@RyanAlberts RyanAlberts mentioned this pull request May 1, 2026
8 tasks

Development

Successfully merging this pull request may close these issues.

PR #3 — verifier + dashboard with link-verify hard gate
