Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 66 additions & 0 deletions .github/workflows/release.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
name: Release

on:
push:
tags:
- "v*"

permissions:
contents: read

jobs:
build:
name: Build wheel + sdist
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: "3.11"
- name: Install build
run: python -m pip install --upgrade pip build
- name: Build distribution
run: python -m build
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: dist
path: dist/

publish:
name: Publish to PyPI (Trusted Publishing)
needs: build
runs-on: ubuntu-latest
# Trusted Publishing requires id-token: write. Configured on PyPI at
# https://pypi.org/manage/project/yc-ai-pulse/settings/publishing/
# against this repo + this workflow filename.
permissions:
id-token: write
environment:
name: pypi
url: https://pypi.org/p/yc-ai-pulse
steps:
- uses: actions/download-artifact@v4
with:
name: dist
path: dist/
- uses: pypa/gh-action-pypi-publish@release/v1

github-release:
name: Attach artifacts to GitHub release
needs: build
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- uses: actions/download-artifact@v4
with:
name: dist
path: dist/
- name: Upload to GitHub release
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
TAG_NAME: ${{ github.ref_name }}
REPO: ${{ github.repository }}
run: |
gh release upload "$TAG_NAME" dist/* --repo "$REPO" --clobber
52 changes: 40 additions & 12 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,16 +7,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

_(no changes since 0.1.0)_

## [0.1.0] — 2026-05-01

First publishable release. End-to-end pipeline that pulls the latest YC batch, classifies it with a Sonnet-class model under strict anti-hallucination guards, and renders a single-file HTML dashboard with row-level drill-downs.

### Added
- Phase 0 bootstrap: MIT license, repo scaffolding, pre-commit + secret-scan, CI workflow, BACKLOG discipline, first ADR.
- Phase 1 PR #1: yc-oss/api scraper, PII sanitizer, link verifier, coverage probe, single-file dashboard, Typer CLI (`ycai run-coverage`).
- Coverage metric is the dashboard headline. The dropped register acknowledges every excluded company and the specific reason — no quiet drops.
- First end-to-end probe on YC W26: 63.3% coverage of the official 196-company batch. Findings in `docs/QUALITY_REPORT_W26.md`.
- Phase 1 PR #2: LLM-based enrichment with anti-hallucination Layer 1 — pydantic-enforced output schema, source-URL guard against fabricated citations, two-pass cross-check on uncertain rows, sentinel low-confidence row on any failure. Three backends: `AgentSDKBackend` (subscription-default), `AnthropicAPIBackend` (`--api-key`), `MockBackend` (tests). 10 hallucination-trap fixtures locked in as regression tests.
- W26 enrichment smoke run (5 companies via subscription, 39s, ~free): 4 high / 1 low confidence. Identified `gru.space` as `no-ai` correctly. Schema-validation failure on `velum-labs` correctly fell through to the sentinel — no fabricated analysis served.
- Phase 1 PR #3: enriched dashboard. AI capability x industry heatmap, tech-stack distribution, OSS-posture breakdown, and confidence breakdown — all with row-level drill-downs. Cited-URL link-verify hard gate before any artifact ships (override via `--allow-dead-links` writes a `BROKEN_LINKS.md` sidecar and shows a warning banner). Lenient parsing for `industry_secondary` so the model can emit reasonable categories without tanking the row.
- W26 full-batch enrichment via subscription (124 companies, ~6 min, ~free): 83 high / 41 low confidence. Top finding: **65% of high-confidence W26 companies (54 of 83) build agents**. 8 companies correctly classified as `no-ai` (the trust signal). 3 cited URLs caught dead at publish time and surfaced via the publish gate.
- Phase 1 PR #4: resilience + parser tightening. Lenient parsing extended to `ai_capability` (drop unknowns, fall back to `unclear`) and `tech_stack` (drop unknowns). `rationale` and `tagline_rewrite` truncate at the schema cap rather than fail the whole row. Raw failure capture (`raw_failures.jsonl`) for audit. Incremental writes to `analyses.jsonl` so partial state survives a crash. New `ycai resume <run-dir>` command resumes interrupted enrichment. New `ycai dashboard <run-dir>` re-renders the dashboard from existing artifacts at zero LLM cost. Live progress shows high/medium/low counts during enrichment.
- W26 full-batch re-run after PR #4: schema-validation failure rate dropped from 23% to **0%**. High-confidence rate went from 67% to **95%** (118 of 124). The "W26 is the agentic batch" finding strengthened to 58% of n=118. Quality writeup updated at `docs/QUALITY_REPORT_W26.md`.

[Unreleased]: https://github.com/RyanAlberts/yc-ai-pulse/compare/main...HEAD

**Phase 0 — bootstrap (PR #6 lineage starts here)**
- MIT license, repo scaffolding, pre-commit + secret-scan + gitleaks + custom Anthropic-key regex, CI workflow, BACKLOG discipline, first two ADRs (yc-oss/api as the only sanctioned source; localhost FastAPI deferred to Phase 3).

**Phase 1 — analysis pipeline**
- **PR #6 — coverage probe**: yc-oss/api scraper with hard-fail when upstream is unreachable (no scraping `ycombinator.com/companies?...` per [robots.txt](docs/decisions/0001-yc-data-source.md)). PII sanitizer (idempotent strip before disk and before any LLM call). Async link verifier. Coverage probe with three tiers (A: full / B: website unreachable / C: missing required field) and a dropped register that names every excluded company. Coverage % is the dashboard headline.
- **PR #7 — LLM enrichment with anti-hallucination Layer 1**: pydantic-enforced classification schema, three backends (AgentSDK / Anthropic API / Mock), source-URL grounding (the cited URL must come from the company's website or YC profile), two-pass cross-check on medium-confidence rows, sentinel low-confidence row on any failure. 10 hallucination-trap fixtures as regression tests.
- **PR #8 — enriched dashboard + cited-URL publish gate**: capability×industry heatmap, tech-stack distribution, OSS-posture breakdown, confidence breakdown. Each chart drills down to source rows. Cited URLs are HEAD/GET-verified before publish; `--allow-dead-links` writes a sidecar `BROKEN_LINKS.md` and surfaces a banner.
- **PR #9 — resilience + parser tightening**: schema-failure rate dropped 23% → 0%. Truncate-not-reject for verbose free-text fields (`rationale`, `tagline_rewrite`). Lenient parsing for `ai_capability` and `tech_stack`. Raw failure capture (`raw_failures.jsonl`). Incremental writes to `analyses.jsonl`. `ycai resume` recovers from interrupted runs. `ycai dashboard` re-renders from existing artifacts at zero LLM cost.

**Real W26 results captured under `examples/output/`:**
- 63.3% coverage of the 196-company batch (132 in upstream, 124 Tier A+B, 8 named drops, 4 Tier B with dead websites)
- 118 of 124 high-confidence (95%) on the LLM enrichment, 0 schema failures, 0 hallucinated source URLs
- **Top finding: 58% of high-confidence W26 companies build agents.** "W26 is the agentic batch" is now defensible with row-level evidence.

### Backlog status at release

| ID | Status | Note |
|---|---|---|
| B001 | resolved | yc-oss/api is sole source; ADR 0001 amended in PR #6 |
| B002 | open | Cloudflare cache-headroom check on `yc-oss.github.io/api/*` |
| B003 | open | Node 20 actions deprecated by 2026-06-02 — bump CI before then |
| B004 | open | Calibrate `MIN_DESCRIPTION_CHARS` against borderline rows |
| B005 | open | Name the missing-from-upstream W26 companies, not just count |
| B006 | resolved | Schema-validation rate measured + tuned in PR #9 |
| B007 | open | Depth=1 website crawl to recover `tech_stack` and `oss_posture` from `unknown` — biggest signal lever for v0.2 |
| B008 | resolved | (rationale-cap root cause shipped in PR #9) |

### Tests

103 tests passing. Mypy `--strict` clean. CI runs ruff, mypy, pytest, detect-secrets, gitleaks, and a custom credential-pattern sweep on every PR.

[Unreleased]: https://github.com/RyanAlberts/yc-ai-pulse/compare/v0.1.0...HEAD
[0.1.0]: https://github.com/RyanAlberts/yc-ai-pulse/releases/tag/v0.1.0
37 changes: 25 additions & 12 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,29 +29,42 @@ A Chrome extension wraps two flows: *(a)* analyze the whole batch, *(b)* deep-di

| Phase | Surface | Status |
|---|---|---|
| 0 | Repo bootstrap, secrets hygiene, CI | 🟡 in progress |
| 1 | CLI + dashboard | ⬜ planned |
| 0 | Repo bootstrap, secrets hygiene, CI | ✅ shipped |
| 1 | CLI + dashboard with anti-hallucination Layer 1 | ✅ **v0.1.0** |
| 2 | `.pptx` + `.docx` reports | ⬜ planned |
| 3 | Chrome extension | ⬜ planned |

See [BACKLOG.md](BACKLOG.md) for the working backlog and `docs/decisions/` for architecture decisions.
See [CHANGELOG.md](CHANGELOG.md) for what 0.1.0 includes, [BACKLOG.md](BACKLOG.md) for the working backlog, and `docs/decisions/` for architecture decisions.

## Quickstart (Phase 1, when shipped)
## Quickstart (v0.1.0)

```bash
pipx install yc-ai-pulse # or: uv tool install yc-ai-pulse
ycai run --depth quick # ~5 min on Claude Max subscription
ycai dashboard runs/2026-*-*/ # opens dashboard.html in your browser
```
pipx install yc-ai-pulse # or: uv tool install yc-ai-pulse

By default `yc-ai-pulse` uses the [Claude Agent SDK](https://github.com/anthropics/anthropic-sdk-python) against your Claude Max subscription. To pay-per-run instead:
# Coverage probe only (no LLM cost) — fetches the latest batch and shows
# what's analyzable. Headline: % of YC batch covered, with the dropped
# register naming every excluded company.
ycai run-coverage --batch winter-2026 --yc-official-count 196

```bash
# Full enrichment via your Claude Max subscription (~6 min on W26, ~free).
# 95% high-confidence rate. Renders the dashboard with capability heatmap,
# tech-stack and OSS-posture breakdowns. Refuses to write the dashboard
# if any cited URL is dead.
ycai run-coverage --batch winter-2026 --yc-official-count 196 --enrich

# Pay-per-token instead of subscription:
export ANTHROPIC_API_KEY=sk-ant-...
ycai run --depth full
ycai run-coverage --batch winter-2026 --yc-official-count 196 --enrich

# Resume an interrupted run (quota wall, crash, network blip):
ycai resume runs/2026-05-01-XXXXXX
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The placeholder date in the example command '2026-05-01' should be replaced with a generic placeholder or variable to avoid confusion for users running this command on different dates.

Suggested change
ycai resume runs/2026-05-01-XXXXXX
ycai resume runs/YYYY-MM-DD-XXXXXX


# Re-render the dashboard from existing artifacts at zero LLM cost
# (useful when the dashboard layout changes):
ycai dashboard runs/2026-05-01-XXXXXX
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The placeholder date in the example command '2026-05-01' should be replaced with a generic placeholder or variable to avoid confusion for users running this command on different dates.

Suggested change
ycai dashboard runs/2026-05-01-XXXXXX
ycai dashboard runs/YYYY-MM-DD-XXXXXX

```

See [`docs/`](docs/) for the full guide.
A real run on YC W26 is checked in as a working example: see [`examples/output/dashboard-w26-pr4-2026-05-01.html`](examples/output/dashboard-w26-pr4-2026-05-01.html). The full quality writeup is in [`docs/QUALITY_REPORT_W26.md`](docs/QUALITY_REPORT_W26.md).

## Anti-hallucination guarantees

Expand Down
4 changes: 2 additions & 2 deletions pyproject.toml
Original file line number Diff line number Diff line change
@@ -1,14 +1,14 @@
[project]
name = "yc-ai-pulse"
version = "0.0.1"
version = "0.1.0"
description = "Open-source YC batch analyzer — VC-style report and dashboard for the most recent Y Combinator batch."
readme = "README.md"
requires-python = ">=3.11"
license = { file = "LICENSE" }
authors = [{ name = "Ryan Alberts", email = "ryan.a.alberts@gmail.com" }]
keywords = ["ycombinator", "yc", "ai", "analysis", "vc", "research", "dashboard"]
classifiers = [
"Development Status :: 2 - Pre-Alpha",
"Development Status :: 3 - Alpha",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3.11",
Expand Down
2 changes: 1 addition & 1 deletion src/ycai/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,4 +3,4 @@
Phase 0 stub. Feature modules land in subsequent PRs.
"""

__version__ = "0.0.1"
__version__ = "0.1.0"
2 changes: 1 addition & 1 deletion tests/test_smoke.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@


def test_package_imports() -> None:
assert ycai.__version__ == "0.0.1"
assert ycai.__version__ == "0.1.0"


def test_cli_version_subcommand() -> None:
Expand Down
Loading