A production-tested research configuration for Claude Code that turns it into a multi-source research agent capable of producing 100+ source reports with citation tracking, source verification, and confidence scoring.
This repo contains the complete configuration I use daily with Claude Code (Opus 4.6) for deep research. It orchestrates 6 search engines in parallel through MCP servers, with each engine assigned a specific role to avoid query overlap:
| Engine | Role | Cost |
|---|---|---|
| Brave | Broad discovery, news, trend scanning | Free (2K queries/mo) |
| Exa | Semantic deep-dives after initial discovery | Paid |
| Tavily | Factual verification with structured citations | Free tier available |
| Perplexity | AI synthesis and contradiction resolution | Paid (most expensive) |
| Firecrawl | Full-page content extraction from top URLs | Free tier available |
| Context7 | Library/API documentation lookup | Free |
Plus supplementary servers for YouTube transcripts, academic papers (arXiv, PubMed), browser automation (Playwright), and more.
```
User triggers "/deep-research" or says "deep research"
        │
        ▼
SKILL.md activates 6-phase protocol (100+ source minimum)
        │
        ▼
CLAUDE.md routes queries: Brave → Exa → Tavily → Perplexity → Firecrawl
        │
        ▼
web-researcher.md subagent handles parallel search (tool-whitelisted)
        │
        ▼
MCP servers provide 14 search/extraction connections
        │
        ▼
When context gets high → /handover skill preserves session state
        │
        ▼
Hooks auto-save context before compaction and re-inject it after
```
```
├── CLAUDE.md                  # System prompt: research methodology + tool roles
├── skills/
│   ├── deep-research/SKILL.md # 6-phase research protocol (100+ sources)
│   └── handover/SKILL.md      # Session context preservation skill
├── agents/
│   └── web-researcher.md      # Tool-whitelisted research subagent
├── hooks/
│   ├── pre-compact.js         # Saves transcript + context before compaction
│   ├── post-compact.js        # Re-injects context after compaction
│   └── notify.ps1             # Windows notification (beep)
├── statusline-command.js      # Status bar: context %, 5h/7d usage limits
├── mcp-servers.json           # MCP server config template (add your API keys)
├── settings.json              # Permissions, hooks, and status line config
└── install.sh                 # Copies everything to ~/.claude/
```
```bash
git clone https://github.com/arm3n/claude-deep-research.git
cd claude-deep-research
bash install.sh
```

You need keys for the paid search engines. The free ones work without keys.
| Service | Get Key | Free Tier |
|---|---|---|
| Brave Search | Dashboard | 2,000 queries/mo |
| Exa | Dashboard | 1,000 searches/mo |
| Tavily | Dashboard | 1,000 searches/mo |
| Perplexity | API Settings | No free tier |
| Firecrawl | Dashboard | 500 pages/mo |
Open `~/.claude.json` and merge the `mcpServers` block from `mcp-servers.json`, replacing the `YOUR_*_KEY` placeholders with your actual API keys.
Windows users: replace `npx` with the full path `C:\\Program Files\\nodejs\\npx.cmd` and add `"PATH": "C:\\Program Files\\nodejs;%PATH%"` to the `env` of each server. This is required because Git Bash drops Windows paths during MSYS2 conversion.
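As an illustrative sketch (the server name and package name here are placeholders, not necessarily the exact contents of `mcp-servers.json`), a Windows-adjusted entry might look like:

```json
{
  "mcpServers": {
    "brave-search": {
      "command": "C:\\Program Files\\nodejs\\npx.cmd",
      "args": ["-y", "@modelcontextprotocol/server-brave-search"],
      "env": {
        "BRAVE_API_KEY": "YOUR_BRAVE_KEY",
        "PATH": "C:\\Program Files\\nodejs;%PATH%"
      }
    }
  }
}
```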
Copy `settings.json` to `~/.claude/settings.json` (or merge it with your existing settings). Adjust hook paths if needed — the defaults use `~/.claude/hooks/`.
```bash
claude
```

Say any of these to trigger the research skill:
```
deep research on [topic]
investigate [topic]
comprehensive analysis of [topic]
saturated search on [topic]
```
The agent will:
- Clarify the scope and identify 5-8 subtopics
- Launch 5+ parallel search agents across all engines
- Deduplicate and synthesize findings
- Cross-verify claims across 2+ independent sources
- Generate a structured report with confidence scores and 100+ citations
- Save to `reports/{topic}-{date}.md`
When your context gets high (watch `ctx:%` in the status bar):

```
/handover
```
This saves all session state — decisions, failed approaches, file references — to a structured document. Start the next session by reading it back in.
The status bar shows real-time metrics:
```
armen@DESKTOP ~/project (main) ctx:42% 5h:23% 7d:0%
```
- `ctx:XX%` - context window usage (green <50%, yellow 50-69%, red 70%+)
- `5h:XX%` - 5-hour session rate limit (green <50%, yellow 50-79%, red 80%+)
- `7d:XX%` - 7-day weekly rate limit
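The color thresholds above can be sketched as a small helper. This is hypothetical — the actual logic lives in `statusline-command.js` and may be structured differently:

```javascript
// Hypothetical sketch of the status-line coloring rules; the real
// implementation in statusline-command.js may differ.
function colorFor(metric, pct) {
  // ctx and 5h use different yellow/red cutoffs; 7d is shown uncolored here.
  const cutoffs = { ctx: [50, 70], '5h': [50, 80] };
  if (!(metric in cutoffs)) return 'none';
  const [yellow, red] = cutoffs[metric];
  if (pct >= red) return 'red';
  if (pct >= yellow) return 'yellow';
  return 'green';
}
```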
1. Identify subtopics and plan query distribution (15-20 sources per subtopic).
2. Fan out across Brave (keywords), Exa (semantic), and Tavily (structured) simultaneously.
3. Perplexity analyzes collected findings and identifies contradictions and gaps.
4. Firecrawl scrapes top URLs; paper-search handles academic sources; YouTube transcripts cover video content.
5. Every major claim is cross-referenced against 2+ independent sources; unverifiable claims are flagged.
6. A structured markdown report is produced with executive summary, confidence scores, inline citations, and complete source list.
Long research sessions can exceed Claude's context window. This setup includes a save-and-restore system:
- **PreCompact hook** (`pre-compact.js`): automatically triggers before compaction; saves the last 10 user messages, the last 5 assistant outputs, and all referenced file paths to a snapshot file. Also backs up the raw transcript.
- **PostCompact hook** (`post-compact.js`): after compaction, re-injects the saved snapshot so Claude remembers recent context.
- **Handover skill** (`/handover`): manual trigger that creates a comprehensive session document with decisions, failed approaches, next steps, and a resume prompt.
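As a rough sketch of the snapshot step described above (a hypothetical shape — the real `pre-compact.js` receives a hook payload from Claude Code and also backs up the raw transcript and referenced file paths):

```javascript
// Hypothetical sketch of the PreCompact snapshot logic: keep only the most
// recent user and assistant turns from a parsed transcript. The real hook
// would read the transcript path from the hook payload on stdin and write
// the result to a snapshot file under ~/.claude/.
function snapshot(messages, nUser = 10, nAssistant = 5) {
  const users = messages.filter(m => m.role === 'user').slice(-nUser);
  const assistants = messages.filter(m => m.role === 'assistant').slice(-nAssistant);
  return { users, assistants, savedAt: new Date().toISOString() };
}
```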
| Failure Mode | What Happens | Prevention |
|---|---|---|
| Number rounding | "about 100" instead of "exactly 97" | Handover preserves exact numbers |
| Conditional collapse | "do X" instead of "do X unless Y" | Full IF/BUT/EXCEPT logic preserved |
| Rationale loss | "we chose A" without saying why | Decision table includes WHY + rejected alternatives |
| Relationship flattening | Loses "file A depends on B" | Cross-file relationships explicitly documented |
| Silent resolution | Open questions treated as settled | Open questions marked as OPEN |
**Brave**
- Use advanced operators: `site:`, `filetype:`, `intitle:`
- Always call sequentially (free tier: 1 req/sec rate limit)
- Paginate with `offset` (0-9) for up to 200 results per query
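If you were calling Brave's web search API directly rather than through the MCP server, the pagination and pacing limits might look like this sketch (endpoint and parameter names are taken from Brave's public API docs as I understand them — verify against your plan before relying on them):

```javascript
// Hypothetical sketch: paginate Brave web search one request per second,
// offset 0-9 with count=20 per page (up to 200 results total). The MCP
// server normally handles this; shown only to illustrate the limits above.
const BRAVE_ENDPOINT = 'https://api.search.brave.com/res/v1/web/search';

function braveUrl(query, offset) {
  const u = new URL(BRAVE_ENDPOINT);
  u.searchParams.set('q', query);
  u.searchParams.set('count', '20');
  u.searchParams.set('offset', String(offset));
  return u.toString();
}

async function braveAll(query, apiKey, pages = 10) {
  const results = [];
  for (let offset = 0; offset < pages; offset++) {
    const res = await fetch(braveUrl(query, offset), {
      headers: { 'X-Subscription-Token': apiKey },
    });
    const data = await res.json();
    results.push(...(data.web?.results ?? []));
    await new Promise(r => setTimeout(r, 1000)); // free tier: 1 req/sec
  }
  return results;
}
```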
**Exa**
- Use natural-language queries, not keywords: "articles explaining how X works"
- Call after Brave discovery for deeper exploration of promising threads
- Can run in parallel with Tavily
**Tavily**
- Use `tavily_search` for quick facts, `tavily_research` for comprehensive investigation
- Supports domain filtering for targeted verification
- Can run in parallel with Exa
**Perplexity**
- Most expensive per query - reserve for complex tasks
- `perplexity_search` for quick lookups, `perplexity_research` for deep-dives
- `perplexity_reason` for contradiction resolution between conflicting sources
**Firecrawl**
- Use after identifying high-value URLs from other engines
- `firecrawl_scrape` for single pages, `firecrawl_crawl` for entire sites
- Batch up to 10 URLs concurrently
- Claude Code CLI (v2.0+)
- Node.js 18+ (for MCP servers via `npx`)
- Python 3.10+ with `uv` (only for paper-search)
- API keys for search engines (see Quick Start)
Edit `skills/deep-research/SKILL.md` and change "100 unique sources" to your preferred target. For quick research, 30-50 sources is reasonable.
Edit `mcp-servers.json` and your `~/.claude.json`. The system prompt in `CLAUDE.md` assigns roles to each engine - update it if you swap engines.
Edit the "Search Tool Roles" section in `CLAUDE.md`. The key principle: each engine has a distinct role to avoid duplicate queries.
The system prompt references Opus 4.6 but works with any Claude model. Sonnet is faster and cheaper but produces less thorough research (fewer sources, less synthesis depth).
A typical 100-source deep research session costs roughly:
- Brave: Free (within 2K/mo limit)
- Exa: ~$0.50-1.00 (20-40 semantic searches)
- Tavily: ~$0.30-0.60 (15-30 verification queries)
- Perplexity: ~$1.00-3.00 (5-10 synthesis/reasoning queries)
- Firecrawl: Free (within 500 pages/mo) or ~$0.50 (15-20 page extractions)
- Claude tokens: Varies by plan (subscription or API)
Total per deep research session: ~$2-5 in search API costs (plus Claude usage).
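Summing the low and high ends of the per-engine estimates above gives a quick back-of-the-envelope check, assuming Brave and Firecrawl stay within their free tiers:

```javascript
// Back-of-the-envelope sum of the per-session search API cost range above.
const estimates = {
  brave: [0, 0],          // free within 2K queries/mo
  exa: [0.5, 1.0],
  tavily: [0.3, 0.6],
  perplexity: [1.0, 3.0],
  firecrawl: [0, 0],      // free within 500 pages/mo
};
const low = Object.values(estimates).reduce((s, [lo]) => s + lo, 0);
const high = Object.values(estimates).reduce((s, [, hi]) => s + hi, 0);
console.log(`~$${low.toFixed(2)}-${high.toFixed(2)} per session`);
```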
MIT