diff --git a/AGENTS.md b/AGENTS.md
deleted file mode 100644
index 1c68b62..0000000
--- a/AGENTS.md
+++ /dev/null
@@ -1,105 +0,0 @@
-# AGENTS.md
-
-## Project Snapshot
-
-Nuggetizer is a Python package for **information nugget creation, scoring, and assignment** for RAG evaluation. Three-stage pipeline:
-
-1. **Creation** — Extract atomic nuggets (1–12 word facts) from candidate documents.
-2. **Scoring** — Label each nugget `vital` or `okay`.
-3. **Assignment** — Determine whether answer text `support`s, `partial_support`s, or `not_support`s each nugget.
-
-Source: `src/nuggetizer/` with subpackages `core`, `models`, `prompts`, `utils`.
-
-## Architecture
-
-```
-Request (Query + Documents)
-  → Nuggetizer.create()
-    → [windowed] creator prompt → LLM → ast.literal_eval
-    → [windowed] scorer prompt  → LLM → ast.literal_eval
-  → List[ScoredNugget]
-  → Nuggetizer.assign(query, context, nuggets)
-    → [windowed] assigner prompt → LLM → ast.literal_eval
-  → List[AssignedScoredNugget]
-  → calculate_nugget_scores() → NuggetMetrics
-```
-
-`Nuggetizer` in `models/nuggetizer.py` is the central orchestrator — owns three LLM handlers (creator, scorer, assigner) and delegates prompt construction to `prompts/`.
-
-## Module Layout
-
-- `core/types.py` — All dataclasses and enums
-- `core/base.py` — ABC + `@runtime_checkable` Protocol contracts
-- `core/llm.py` / `core/async_llm.py` — Sync and async LLM handlers (OpenAI SDK wrappers)
-- `core/metrics.py` — Nugget scoring math
-- `models/nuggetizer.py` — Main `Nuggetizer` class (public API)
-- `prompts/template_loader.py` — YAML template loading + caching
-- `prompts/*_prompts.py` — Prompt builders for each stage
-- `prompts/prompt_templates/*.yaml` — The actual prompt text
-- `utils/api.py` — Env-var loaders for API keys
-- `utils/display.py` — Pretty-printing utilities
-- `scripts/` — CLI pipeline (create, assign, metrics)
-- `examples/` — End-to-end sync and async demos
-
-## Key Patterns
-
-1. **Dataclass hierarchy** — `BaseNugget → Nugget | ScoredNugget → AssignedScoredNugget`. Use dataclasses for domain objects, not plain dicts.
-2. **`ast.literal_eval` for LLM parsing** — LLMs return Python list literals. Always parse with `ast.literal_eval`, never `eval()`. Response cleaning strips markdown code fences before parsing.
-3. **Windowed processing** — Documents/nuggets are chunked into configurable windows (default size 10) for LLM calls.
-4. **Temperature escalation** — Starts at `temperature=0.0`; bumps to `0.2` on parse failure.
-5. **Graceful degradation** — Scoring failures default to `importance="okay"`, assignment failures to `assignment="failed"`.
-6. **Lazy async init** — Async LLM clients created on first async call via `_ensure_async_llm()`.
-7. **Round-robin key rotation** — API keys stored as a list, rotated on failure.
-8. **Resume support in scripts** — Scripts read existing output to skip already-processed entries.
-
-## LLM Handlers
-
-Four providers supported — Azure OpenAI (`"azure"`), OpenAI (`"openai"`), OpenRouter (`"openrouter"`), vLLM (`"vllm"`) — all via the OpenAI SDK with different configs.
-
-**Sync/async asymmetries to watch:** The sync handler (`llm.py`) has 5 retries, 4096 max tokens, 60s timeout, vLLM param branching, and content filter abort. The async handler (`async_llm.py`) retries infinitely, uses 2048 max tokens, 30s timeout, and lacks vLLM branching and content filter handling. If you fix a bug in one, check the other.
-
-**Special model handling:** Models starting with `o1`, `o3`, `o4`, or `gpt-5` collapse system messages into the user message and force `temperature=1.0`.
-
-## Prompt System
-
-- Prompts are YAML templates in `prompts/prompt_templates/`. Each has `system_message` and `prefix_user` with `str.format()` placeholders.
-- Always edit YAML templates, never hard-code prompt text in Python.
-- Templates are cached at module level. New templates must match the `*.yaml` glob in `pyproject.toml` package-data.
-- Keep template format variables in sync with the Python prompt builder that passes them.
-
-## Environment & Secrets
-
-API keys loaded from `.env` via `python-dotenv` (legacy keys from `.env.local`). Never hard-code keys.
-
-- Azure: `AZURE_OPENAI_API_BASE`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_API_KEY`
-- OpenAI: `OPEN_AI_API_KEY` (or `OPENAI_API_KEY`)
-- OpenRouter: `OPENROUTER_API_KEY`
-- vLLM: No auth needed
-
-## Tooling
-
-- Python `>=3.10`. All new code must be fully typed.
-- Pre-commit enforces `ruff check --fix`, `ruff format`, and `mypy` (strict settings in `pyproject.toml`).
-- Run `pre-commit run --all-files` before committing.
-- Version bumps via `bumpver` — updates `pyproject.toml` and `README.md`.
-
-## Scripts
-
-Sequential pipeline for TREC RAG Track evaluation:
-1. `create_nuggets.py` — Extract + score nuggets from query/document JSONL
-2. `assign_nuggets.py` — Assign nuggets to RAG answer JSONL
-3. `assign_nuggets_retrieve_results.py` — Variant for individual retrieved segments
-4. `calculate_metrics.py` — Compute per-query and global metrics
-
-All scripts use `argparse`, support resume via output file scanning, handle errors per-record with `continue`, and flush output immediately.
-
-## Changes Checklist
-
-- [ ] New modules under `src/nuggetizer/` and registered in `pyproject.toml` packages if needed
-- [ ] `__init__.py` and `__all__` updated for new public API
-- [ ] Prompt changes in YAML templates, not Python; format variables match prompt builders
-- [ ] All functions fully typed
-- [ ] `pre-commit run --all-files` passes
-- [ ] Changes to `core/llm.py` checked against `core/async_llm.py` (and vice versa)
-- [ ] Scripts maintain resume support
-- [ ] No hard-coded API keys
diff --git a/AGENTS.md b/AGENTS.md
new file mode 120000
index 0000000..681311e
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1 @@
+CLAUDE.md
\ No newline at end of file
diff --git a/CLAUDE.md b/CLAUDE.md
deleted file mode 120000
index 47dc3e3..0000000
--- a/CLAUDE.md
+++ /dev/null
@@ -1 +0,0 @@
-AGENTS.md
\ No newline at end of file
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 100644
index 0000000..4781ac9
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1,70 @@
+# Repo Project Instructions
+
+## Scope
+- Repository: `castorini/nuggetizer`
+- Primary language: Python 3.10+
+- Purpose: create/score/assign factual nuggets for RAG evaluation using LLM backends (OpenAI, Azure OpenAI, OpenRouter, vLLM).
+
+## Project Layout
+- `src/nuggetizer/models/nuggetizer.py`: main orchestration (`Nuggetizer`) for create/score/assign.
+- `src/nuggetizer/core/`: core types, metrics, sync/async LLM handlers, base protocols.
+- `src/nuggetizer/prompts/`: prompt builders and YAML prompt templates.
+- `scripts/`: CLI pipelines for dataset-scale JSONL processing.
+- `examples/`: end-to-end usage examples (sync and async).
+- `docs/`: assets only (logo currently).
+
+## Packaging And Environment
+- Build backend: `setuptools.build_meta` via `pyproject.toml`.
+- Dependencies are dynamic and sourced from `requirements.txt`.
+- Install for development with `pip install -e .`.
+- Recommended local environment from README: conda env with Python 3.10.
+
+## LLM Provider Conventions
+- API keys are loaded from `.env` by `src/nuggetizer/utils/api.py`.
+- Supported env vars:
+  - OpenAI: `OPEN_AI_API_KEY` or `OPENAI_API_KEY`
+  - OpenRouter: `OPENROUTER_API_KEY`
+  - Azure OpenAI: `AZURE_OPENAI_API_BASE`, `AZURE_OPENAI_API_VERSION`, `AZURE_OPENAI_API_KEY`
+- Keep provider fallback behavior intact in `LLMHandler`/`AsyncLLMHandler`:
+  - OpenAI first when available, OpenRouter fallback when enabled/available.
+  - vLLM uses local base URL (`http://localhost:<port>/v1`) with placeholder key.
+
+## Coding Standards
+- Formatting/linting/type checks are enforced by pre-commit:
+  - Ruff (`ruff-check --fix`, `ruff-format`)
+  - MyPy (strict-ish config in `pyproject.toml`)
+- Run before committing:
+  - `pre-commit run --all-files`
+- Type hints are expected for new/changed code (`disallow_untyped_defs = true`).
+- Preserve dataclass and Enum-based type contracts in `core/types.py`.
+
+## CI And Contribution Workflow
+- PR CI (`.github/workflows/pr-format.yml`) runs on PRs to `main`.
+- CI currently validates style/type only via pre-commit (ruff + mypy).
+- No dedicated automated test suite is present; validate behavior using examples/scripts locally.
+
+## Validation Commands
+- Lint/type:
+  - `pre-commit run --all-files`
+- Quick smoke checks:
+  - `python3 examples/e2e.py --help`
+  - `python3 examples/async_e2e.py --help`
+  - `python3 scripts/create_nuggets.py --help`
+  - `python3 scripts/assign_nuggets.py --help`
+  - `python3 scripts/calculate_metrics.py --help`
+
+## Data And Pipeline Expectations
+- `scripts/create_nuggets.py` expects JSONL records with `query` and `candidates`.
+- `scripts/assign_nuggets.py` joins nugget JSONL with answer JSONL (`topic_id` mapping).
+- `scripts/calculate_metrics.py` computes per-record and global metrics from assignments.
+- Scripts append to output JSONL in some paths; avoid accidental duplicate processing.
+
+## Change Guidelines
+- Keep public constructor behavior stable in `Nuggetizer` (model args, provider flags, window/max controls).
+- Avoid breaking JSONL schemas produced by `scripts/` unless all downstream consumers are updated.
+- When editing prompt templates, verify prompt loader paths and assignment/score label compatibility.
+- Preserve retry and key-rotation logic in LLM handlers unless intentionally redesigning error handling.
+
+## Versioning
+- Version is defined in `pyproject.toml` (`project.version`) and managed with `bumpver` config.
+- If doing a release bump, update versioned references consistently per bumpver patterns.