Persistent project-scoped store for deep research findings, with progressive disclosure and contrarian-pass investigation.
A Claude Code skill that gives you a persistent, project-scoped store for deep research findings.
Stop re-researching the same topics across sessions. Stop polluting conversation context with raw web search dumps. The skill maintains a structured local knowledge base under <project>/.research/, looks it up before fetching the web, and uses progressive disclosure to load only what's actually needed.
Long Claude Code sessions run out of context. /compact summarizes older turns and drops the rest, so findings from a deep research thread evaporate and the next question re-triggers the same web searches.
This skill makes the data layer outlive the chat. Research written today survives /compact, /clear, IDE restarts, and machine moves. The next session reads INDEX.md first (a tiny dispatcher), matches the topic, and pulls only the matched entry's ## Summary section into context. The full body stays on disk until you actually need it.
Loading tiers, cheapest first:
| Tier | Loads | Approx. tokens | When |
|---|---|---|---|
| 1 | INDEX.md | 100 to 500 | Every retrieval |
| 2 | Entry's `## Summary` only | 50 to 200 | When INDEX shows a match |
| 3 | Full FINDINGS.md body | 500 to 3,000 | When the summary doesn't cover it |
Heavy research artifacts become cheap to recall: you only pay for the tier you need.
- Project-scoped, not global. Each repo has its own research store, kept private (gitignored by default).
- Progressive disclosure. Index, then summary, then full body, in that order. Most lookups never load the full entry.
- Conflict-handling history. When findings change, old claims move to a `## Discarded approaches` table with reasons; nothing is silently overwritten. This prevents re-trying refuted approaches.
- Subagent-isolated investigation. Heavy WebSearch / WebFetch traffic runs in a separate `general-purpose` subagent (Opus 4.7 by default). Your main context stays clean.
- Async, non-blocking. The investigation subagent runs in background mode (`run_in_background: true`); your conversation with Claude Code stays interactive while research happens. Findings save and announce themselves on the completion notification. No frozen CLI.
- Cognitive phases. Decompose, Gather, Validate, Contrarian pass, Synthesize. The contrarian pass actively searches for "why this is wrong" rather than confirming. It earns its keep.
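The background spawn described above can be sketched as a single tool call. The shape below is illustrative only: `model` and `run_in_background` come from the text, while the `subagent_type` and `prompt` values are assumptions, not a verified harness schema.

```json
{
  "tool": "Agent",
  "input": {
    "subagent_type": "general-purpose",
    "model": "opus",
    "run_in_background": true,
    "prompt": "Investigate <topic>. Walk Decompose, Gather, Validate, Contrarian pass, Synthesize. Return Summary, Findings, the strongest contrarian objection, and sources in the required output format."
  }
}
```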
When the Investigation subagent finishes, its full structured return (Summary, Findings, contrarian objection, sources) is injected into the main agent's context as a task-notification message. No file round-trip, no tail-the-log polling. The main agent parses the return directly and writes the data layer.
Why this matters:
- No raw web-search dump pollution. The main agent only sees the agent's clean synthesized output, never the raw web search results or fetch responses. Those live in a separate transcript file the main agent is forbidden to read.
- Storage is deterministic. The required output format maps 1:1 to the FINDINGS.md schema. Parsing is mechanical, not interpretive.
- Conversation stays interactive. The subagent runs in background mode (`run_in_background: true`), so you keep working while it researches; the structured output arrives as a notification when the agent completes.
Three install routes. No registration, approval, or login required.
```
npx skills add hec-ovi/research-skill
```

```
/plugin marketplace add hec-ovi/research-skill
/plugin install research@research-skill
/reload-plugins
```
This uses Claude Code's built-in marketplace mechanism to install the plugin from the maintainer's GitHub repo. It is not Anthropic's first-party catalog.
```bash
# Personal (across all your projects)
git clone https://github.com/hec-ovi/research-skill ~/.claude/skills/research

# Or project-only
git clone https://github.com/hec-ovi/research-skill <your-project>/.claude/skills/research
```

Claude Code picks up new skills live, no restart needed.
The skill auto-activates when you ask a research-style question. You can also invoke it explicitly:
/research <topic>
Examples:
- "What's the latest TypeScript ORM for edge runtime in 2026?"
- "Compare Bun vs Node cold-starts for serverless"
- "/research drizzle-type-generation"
The skill writes to your project, not your home dir:
```
<project>/.research/
├── INDEX.md              # dispatcher: topic table, scanned first
└── <topic-slug>/
    ├── FINDINGS.md       # entry: frontmatter + summary + findings + history
    └── raw/              # optional: pasted PDFs, whitepapers, etc.
```
INDEX.md is the dispatcher, equivalent to RESOLVER.md in the GBrain pattern. The agent reads it first, then loads only the matched entry's ## Summary section. Full entries and raw documents only load on demand.
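For illustration, a minimal INDEX.md might look like this; the topics, slugs, and columns here are hypothetical, and the skill's own schema governs the real file.

```markdown
# Research index

| Topic | Slug | Status | Last verified |
|---|---|---|---|
| Edge-runtime TypeScript ORMs | edge-ts-orms | active | 2026-04-20 |
| Bun vs Node cold starts | bun-node-coldstart | active | 2026-04-18 |
```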
.research/ is a top-level project directory (sibling of .claude/, not nested inside it). It's colocated with the project, gitignored by default, and easy to find by name. Reads and writes go through the host's normal permission system.
- Plan-stage notes
- Small facts or one-line preferences
- Code-level decisions tied to one file
- Casual lookups answerable from a single source
- A substitute for a single WebSearch / WebFetch
If the question fits in one search plus 1 to 2 sentences, you don't need this skill.
Built explicitly on three open patterns; credit where due.
Frontmatter and folder layout follow the open Agent Skills specification (Apache 2.0 / CC-BY-4.0). Portable across Claude Code and any other code CLI that implements the SKILL.md format.
The Investigation phase walks a 5-step cognitive workflow (Decompose, Gather, Validate, Contrarian pass, Synthesize) adapted from xAI's published Multi-Agent architecture and the DeepSearch announcement. xAI ships 4 specialized agents (Captain, Harper, Benjamin, Lucas) on a shared backbone; this skill condenses those into cognitive phases a single subagent walks, since the Claude Code harness does not currently expose subagent continuation (SendMessage unavailable as of April 2026).
The Contrarian pass (phase 4) is the standout borrowed element: actively searching for "why this is wrong" rather than confirming. In an A/B test on a celebrity-fronted AI tool legitimacy question, the contrarian pass surfaced significant controversy that a minimal-brief baseline missed.
INDEX.md acts as a dispatcher in the same role as RESOLVER.md in GBrain. The INDEX is scanned first; full entries load only on match. Progressive disclosure tiers borrow GBrain's "thin harness, fat skills" philosophy (THIN_HARNESS_FAT_SKILLS.md).
Every entry's FINDINGS.md has structured frontmatter (topic, created, last_verified, status, related, sources, raw) and a body with ## Summary, ## Findings, ## Discarded approaches, ## Open questions, ## Timeline. See SKILL.md for the full schema and rules.
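As a sketch only, an entry could look like the skeleton below; all values are hypothetical and SKILL.md remains the authoritative schema.

```markdown
---
topic: bun-node-coldstart
created: 2026-04-18
last_verified: 2026-04-18
status: active
related: []
sources:
  - https://example.com/coldstart-benchmark
raw: []
---

## Summary
One-paragraph answer; this is all Tier 2 loads.

## Findings
Full detail with inline citations; loaded only at Tier 3.

## Discarded approaches
| Claim | Why discarded | Date |
|---|---|---|

## Open questions
- None yet.

## Timeline
- 2026-04-18: entry created
```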
- A code CLI that implements the SKILL.md format (Claude Code, or any other compatible client)
- For the Investigation phase: an Opus-class model accessible to the spawning agent (the skill defaults to spawning subagents at `model: "opus"`)
The Investigation phase needs reasoning depth. The skill spawns subagents with `model: "opus"`, but the calling agent has to actually pass that parameter on every spawn. To make it systematic across all your Claude Code sessions, configure your environment to default subagents to Opus.
Two practical approaches:
- Hook (strongest): add a `PreToolUse` hook on the `Agent` tool in `~/.claude/settings.json` that blocks any spawn whose `model` field is not `opus`. The hook runs before the tool dispatches, so a non-opus spawn never reaches the API.
- Convention (lightest): add a one-line note to your `~/.claude/CLAUDE.md`: "Every Agent tool call MUST pass `model: \"opus\"`." Claude reads CLAUDE.md every session.
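A minimal sketch of the hook route, assuming the hook command receives the pending tool call as JSON on stdin (spawn parameters under a `tool_input` key) and that a nonzero exit blocks the call; the `--hook` flag and exact field names are illustrative, so check your client's hook documentation for the real contract.

```python
import json
import sys

def blocks_spawn(payload: dict) -> bool:
    """Return True when a subagent spawn should be blocked because
    its model field is missing or is not an opus model."""
    model = payload.get("tool_input", {}).get("model", "")
    return not model.startswith("opus")

# Wired up as the PreToolUse command, e.g.: python check_opus.py --hook
if __name__ == "__main__" and "--hook" in sys.argv:
    payload = json.load(sys.stdin)  # pending tool call, serialized by the harness
    if blocks_spawn(payload):
        print('Blocked: Agent spawns must pass model: "opus"', file=sys.stderr)
        sys.exit(2)  # nonzero exit tells the harness to reject the spawn
```

The convention route needs no code at all; it is lighter, but it relies on the model re-reading CLAUDE.md each session rather than on enforcement.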
Smaller models work fine for the main conversation. The contrarian pass and synthesis steps in Investigation specifically depend on Opus-class reasoning depth; smaller models tend to skip the contrarian phase or produce shallow syntheses.
When the skill activates, the full SKILL.md body loads into the main agent's context. As of v0.2.7 the activation cost is approximately 4,500 to 5,500 tokens (depending on tokenizer). The skill registration metadata (frontmatter only, always loaded) is a separate ~130 tokens.
Comparison points:
- Most Vercel skills.sh reference skills sit at 500 to 2,000 tokens
- Anthropic reference skills typically run 1,500 to 3,000 tokens
- This skill is roughly double that
The reason is that the Investigation phase is a substantive procedure (5 cognitive phases verbatim, brief checklist, citation rules, required output format, gap handling) and the Storage phase has its own validation rules. The didactic content is real; it earns its keep when Investigation actually runs. But on Retrieval-only calls (the most common path), the agent loads all of it just to get to the loading-hierarchy and lookup-procedure sections.
The clean architectural answer is to apply progressive disclosure recursively, the same pattern the skill already applies to research data. Specifically: split the single SKILL.md into a thin dispatcher plus phase-specific procedure files that load only when their phase is active.
Target structure:
```
research/
├── SKILL.md              # thin dispatcher: when, where, which procedure to load
└── procedures/
    ├── retrieval.md      # loading hierarchy, lookup procedure, INDEX patterns
    ├── investigation.md  # cognitive phases, brief checklist, citation rules, output format
    └── storage.md        # FINDINGS schema, Review-before-storing, conflict handling
```
How it would work mechanically:
- The agent activates the skill and reads `SKILL.md` (cheap, ~1,500 to 2,000 tokens).
- `SKILL.md` names which procedure file to load for each phase: "For Retrieval, read `procedures/retrieval.md`. For Investigation, read `procedures/investigation.md` AFTER deciding mode in Retrieval phase 5. For Storage, read `procedures/storage.md` after Investigation returns."
- The agent uses the `Read` tool to pull only the procedure file relevant to the current phase. A pure Retrieval call (read INDEX, sed Summary, answer) never touches `investigation.md` and never pays for it.
This is the same "thin harness, fat skills" pattern GBrain uses (RESOLVER.md as the dispatcher, individual skill files loaded on demand). Applied here, it's a natural fit because the three phases (Retrieval, Investigation, Storage) are already cleanly separated in the workflow, and the heaviest procedure (Investigation) is also the least-frequent path. Most calls are Retrieval-only.
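Sketched concretely, the post-split dispatcher could be as thin as the following; the content is hypothetical, with frontmatter fields following the Agent Skills spec the skill already uses.

```markdown
---
name: research
description: Persistent project-scoped store for deep research findings.
---

# Research (dispatcher)

- Retrieval: read procedures/retrieval.md, then follow its loading hierarchy.
- Investigation: read procedures/investigation.md AFTER Retrieval phase 5 picks the mode.
- Storage: read procedures/storage.md once Investigation returns its structured output.
```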
Expected post-refactor footprint:
- `SKILL.md` (always loaded on activation): ~1,500 to 2,000 tokens
- `procedures/retrieval.md` (loaded on every research-style question): ~800 to 1,200 tokens
- `procedures/investigation.md` (loaded only when fresh research is needed): ~1,500 to 2,000 tokens
- `procedures/storage.md` (loaded only when actually writing): ~800 to 1,200 tokens
A typical Retrieval-only call would pay ~2,500 to 3,200 tokens (SKILL.md + retrieval.md), down from the current ~5,000. An Investigation call would pay roughly the same as today (all phases involved), but at least the cost would be honest: you pay for what you use.
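A quick component-wise sum of the ranges above; the figures are the document's estimates, not measurements.

```python
# Token-range estimates (low, high) from the post-refactor footprint above.
SKILL = (1_500, 2_000)
RETRIEVAL = (800, 1_200)
INVESTIGATION = (1_500, 2_000)
STORAGE = (800, 1_200)

def total(*parts):
    """Sum (low, high) token ranges component-wise."""
    return (sum(lo for lo, _ in parts), sum(hi for _, hi in parts))

print("Retrieval-only:", total(SKILL, RETRIEVAL))
print("Full pipeline:", total(SKILL, RETRIEVAL, INVESTIGATION, STORAGE))
```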
The skill is working, the size is heavy but not blocking, and there has been no demand from users yet. The repo just launched on 2026-04-25; the only confirmed user is the maintainer, and the maintainer has not hit context-budget pressure on this skill in real workflows. The refactor is also a real restructure:
- probably half a day of careful editing
- end-to-end testing on every phase (Retrieval new entry, Retrieval merge, Investigation new entry mode, Investigation merge mode, Storage paste path)
- updating the install routes (the symlink at skills/research/SKILL.md would need to extend to the procedures folder; the marketplace.json plugin descriptor would need to confirm subdirectory loading is honored)
- rewriting the CHANGELOG and bumping to a minor version (likely v0.3.0)
The refactor will be triggered when any of these signals lands:
- A user reports the activation cost as a real friction in their context budget
- A new feature pushes `SKILL.md` past 6,000 tokens
- The procedure content grows organically to the point where the dispatcher spine already feels redundant
- Someone files an issue or PR proposing the split
Until one of those triggers, the single-file structure is the right call: everything is in one place, the file is readable end-to-end, and the load-bearing optimization (progressive disclosure of the actual research data via INDEX, then Summary, then full body) already works as designed. Optimizing the SKILL.md itself before there is felt friction is premature engineering.
If you want to discuss the refactor or volunteer feedback on activation cost in your own usage, open an issue at github.com/hec-ovi/research-skill/issues.
MIT. Free to use, modify, fork, distribute. Attribution appreciated, not required.