Distill unstructured sources into qualified goals via AI.
Point Paperize at a folder of notes, ideas, research, markdown files — and it generates actionable goals, ready for any project management system.
Works with Obsidian vaults, Zettelkasten collections, research dumps, brainstorm folders, or any pile of text files. Handles hundreds of files through an intelligent map-reduce pipeline that never truncates your content.
One command:
npx paperize --source ~/notes
- Table of Contents
- Quick Start
- Install
- Usage
- How It Works
- Goal Format
- Options
- Supported File Types
- Development
- License
npx paperize --source ~/notes
That's it. Paperize scans your folder, calls Claude, and outputs goals as JSON.
Requires an ANTHROPIC_API_KEY — pass it inline or export it:
# Inline
ANTHROPIC_API_KEY=sk-ant-... npx paperize --source ~/notes
# Or export once
export ANTHROPIC_API_KEY=sk-ant-...

npx paperize # run directly (no install)
npm i -g paperize # or install globally -> paperize
Requires Node.js 20+.
Pass --source to skip the wizard. No TTY required — fully scriptable.
# Scan and generate goals
paperize --source ~/notes
# Steer the AI with guiding context
paperize --source ~/research --context "Focus on SaaS product ideas"
# Read context from a file
paperize --source ~/research --context-file brief.md
# Save output in different formats
paperize --source ./ideas --output goals.json
paperize --source ./ideas --output goals.md --format markdown
paperize --source ./ideas --output goals.yaml --format yaml
# Control creativity level
paperize --source ~/notes --vibe wild # more ideas, speculative goals
paperize --source ~/notes --vibe focused # strict, high-confidence only
# Dry run — scan only, no AI
paperize --source ~/notes --dry-run
# Use a different model
paperize --source ~/notes --model claude-opus-4-6

# Interactive mode: run with no flags
paperize
The wizard walks you through six steps:
$ paperize
Paperize — Goal distillation from unstructured sources
Step 1 of 6 — Source
Enter path to source folder: ~/notes
Step 2 of 6 — Scan
Found 247 files (1.8 MB, 892K chars)
.md: 201 .txt: 38 .yaml: 8
Step 3 of 6 — Context
Add guiding context (optional): Focus on product roadmap items
Step 4 of 6 — Analyze
Strategy: map-reduce (9 batches)
✓ Batch 1/9 — extracted 12 ideas
✓ Batch 2/9 — extracted 8 ideas
...
✓ Synthesized 73 ideas into 7 goals
Step 5 of 6 — Goals
❯ ✓ Build a real-time collaboration engine
✓ Implement usage-based billing system
✓ Design onboarding flow for enterprise users
...
Step 6 of 6 — Done
✓ Wrote 7 goals to goals.json
Paperize automatically chooses the right strategy based on source size:
Single-shot — All files are combined into one document and sent to Claude in a single API call. Fast and cost-effective.
Map-reduce pipeline — A two-phase approach that handles arbitrarily large sources without truncation:
┌──────────────────────────────────────────────────────┐
│                     Source files                     │
│            (hundreds/thousands of files)             │
└──────────┬───────────┬───────────┬───────────────────┘
           │           │           │
     ┌─────▼─────┐ ┌───▼───┐  ┌────▼────┐
     │  Batch 1  │ │ Bat 2 │  │ Batch N │   Phase 1: Extract
     │ ~100K ch  │ │       │  │         │   (parallel, up to 3)
     └─────┬─────┘ └───┬───┘  └────┬────┘
           │           │           │
           │    ideas + weights    │
           │     + attribution     │
           └───────────┼───────────┘
                       │
                ┌──────▼──────┐
                │  Synthesize │   Phase 2: Synthesize
                │  cluster &  │   (single call)
                │  prioritize │
                └──────┬──────┘
                       │
                ┌──────▼──────┐
                │    Goals    │
                │  title +    │
                │  description│
                └─────────────┘
- Extract: Files are split into ~100K-char batches. Batches are processed in parallel (up to 3 at a time) to extract atomic ideas, each with source attribution and a weight (strong or weak).
- Synthesize: All extracted ideas are merged and clustered into coherent goals in a single call. Strong ideas are prioritized, and related ideas from different batches are combined.
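The Extract phase above can be sketched in a few lines of JavaScript. This is a rough illustration, not Paperize's actual code: `makeBatches`, `mapWithLimit`, and `extractIdeas` are hypothetical names, the greedy packing rule is an assumption, and the extraction step is a stub standing in for the Claude call.

```javascript
// Hypothetical sketch of the Extract phase; names and packing rule are assumptions.
const BATCH_CHARS = 100_000; // mirrors the "~100K-char batches" figure above
const MAX_PARALLEL = 3;      // mirrors "parallel, up to 3" in the diagram

/**
 * Greedily pack files into batches whose combined text length stays within `limit`.
 * A file longer than `limit` gets a batch of its own instead of being cut.
 */
function makeBatches(files, limit = BATCH_CHARS) {
  const batches = [];
  let current = [];
  let size = 0;
  for (const file of files) {
    const len = file.text.length;
    if (current.length > 0 && size + len > limit) {
      batches.push(current);
      current = [];
      size = 0;
    }
    current.push(file);
    size += len;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

/** Run `worker` over `items` with at most `limit` calls in flight; results keep input order. */
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function run() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}

// Stub for the extraction call: a real run would send the batch to the model.
async function extractIdeas(batch) {
  return batch.map((file) => ({ idea: `idea from ${file.path}`, source: file.path, weight: "weak" }));
}

async function main() {
  const files = [
    { path: "a.md", text: "x".repeat(60_000) },
    { path: "b.md", text: "x".repeat(60_000) },
    { path: "c.md", text: "x".repeat(30_000) },
  ];
  const batches = makeBatches(files);
  const ideas = (await mapWithLimit(batches, MAX_PARALLEL, extractIdeas)).flat();
  console.log(`${batches.length} batches, ${ideas.length} ideas`);
}

main();
```

Packing whole files keeps each idea's source attribution simple, at the cost of uneven batch sizes. The flattened `ideas` array is what a Synthesize step would receive in one call.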
Each goal is self-contained and independently actionable:
{
"title": "Build a real-time collaboration engine",
"description": "Context: Several notes mention the need for...\n\nScope: ...\n\nSuccess criteria: ..."
}

| Field | Description |
|---|---|
| title | Concise, imperative voice, max ~80 chars |
| description | Context (why), scope (what), success criteria (how to measure) — max 2000 words |
JSON (default) — Array of goal objects.
Markdown — Each goal as an ## H2 with description body.
YAML — Structured YAML document with properly escaped strings.
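As a small illustration of the goal shape and the Markdown format described above, the sketch below checks a goal against the documented limits and renders a list of goals with one H2 per goal. `checkGoal` and `goalsToMarkdown` are hypothetical helpers, not part of Paperize, and the exact spacing Paperize emits between goals is an assumption.

```javascript
// Illustrative only: these helpers are not part of Paperize's API.
const MAX_TITLE_CHARS = 80;         // "max ~80 chars" from the field table
const MAX_DESCRIPTION_WORDS = 2000; // "max 2000 words" from the field table

/** Return a list of problems with one goal object (an empty list means it looks fine). */
function checkGoal(goal) {
  const problems = [];
  if (typeof goal.title !== "string" || goal.title.trim() === "") {
    problems.push("title is missing");
  } else if (goal.title.length > MAX_TITLE_CHARS) {
    problems.push(`title exceeds ${MAX_TITLE_CHARS} characters`);
  }
  if (typeof goal.description !== "string" || goal.description.trim() === "") {
    problems.push("description is missing");
  } else if (goal.description.trim().split(/\s+/).length > MAX_DESCRIPTION_WORDS) {
    problems.push(`description exceeds ${MAX_DESCRIPTION_WORDS} words`);
  }
  return problems;
}

/** Render goals as Markdown: an H2 title followed by the description body. */
function goalsToMarkdown(goals) {
  return goals
    .map((goal) => `## ${goal.title.trim()}\n\n${goal.description.trim()}\n`)
    .join("\n");
}

const goals = [
  {
    title: "Build a real-time collaboration engine",
    description: "Context: ...\n\nScope: ...\n\nSuccess criteria: ...",
  },
];

for (const goal of goals) {
  const problems = checkGoal(goal);
  if (problems.length > 0) console.warn(goal.title, problems);
}
console.log(goalsToMarkdown(goals));
```

A check like this can be useful before pushing goals into a project management system, since the title limit in particular is only approximate.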
| Flag | Description | Default |
|---|---|---|
| `--source <path>` | Path to folder with source material | (wizard prompt) |
| `--context <text>` | Guiding context/prompt for goal generation | (none) |
| `--context-file <path>` | Read guiding context from a file | (none) |

| Flag | Description | Default |
|---|---|---|
| `--model <model>` | Claude model for analysis | `claude-sonnet-4-6` |
| `--max-goals <n>` | Maximum goals to generate | 10 |
| `--vibe <level>` | Creativity level: focused, balanced, wild | balanced |
Vibe levels:
| Vibe | Extraction | Synthesis | Typical goals |
|---|---|---|---|
| focused | Only clear, actionable ideas | Merge aggressively, strong evidence only | 1–5 |
| balanced | Standard extraction | Balanced merging | 5–15 |
| wild | Everything: speculative, half-baked, creative leaps | Preserve breadth, let unusual ideas stand alone | 10–20 |

| Flag | Description | Default |
|---|---|---|
| `--output <path>` | Write goals to file | stdout |
| `--format <fmt>` | Output format: json, markdown, yaml | json |
| `--dry-run` | Scan files only, skip AI analysis | off |

| Variable | Description |
|---|---|
| `ANTHROPIC_API_KEY` | Required for AI analysis |
.md .txt .text .markdown .org .rst .adoc .csv .json .yaml .yml .xml .html .htm
Skips: hidden directories, node_modules, .obsidian, .trash, __pycache__. Max 512 KB per file.
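The scan rules above amount to a simple filter, sketched below. `shouldScan` is a hypothetical helper, not Paperize's scanner. It assumes forward-slash relative paths, case-insensitive extension matching, and that files over 512 KB are skipped rather than handled some other way; the text above states only the limit.

```javascript
// Hypothetical sketch of the scan filter; the matching details are assumptions.
const EXTENSIONS = new Set([
  ".md", ".txt", ".text", ".markdown", ".org", ".rst", ".adoc",
  ".csv", ".json", ".yaml", ".yml", ".xml", ".html", ".htm",
]);
const SKIPPED_DIRS = new Set(["node_modules", ".obsidian", ".trash", "__pycache__"]);
const MAX_BYTES = 512 * 1024; // "Max 512 KB per file"

/**
 * Decide whether a file should be scanned.
 * `relativePath` is relative to the source folder and uses "/" separators.
 */
function shouldScan(relativePath, sizeBytes) {
  const parts = relativePath.split("/");
  const name = parts.pop();

  // Skip anything under a hidden directory or one of the named directories.
  const inSkippedDir = parts.some(
    (dir) => SKIPPED_DIRS.has(dir) || (dir.startsWith(".") && dir !== "." && dir !== "..")
  );
  if (inSkippedDir) return false;

  // A leading dot alone (".gitignore") does not count as an extension.
  const dot = name.lastIndexOf(".");
  if (dot <= 0) return false;

  const ext = name.slice(dot).toLowerCase();
  return EXTENSIONS.has(ext) && sizeBytes <= MAX_BYTES;
}

const samples = [
  ["notes/idea.md", 1_200],
  ["node_modules/pkg/README.md", 800],
  [".obsidian/workspace.json", 300],
  ["archive/export.csv", 600 * 1024],
];
for (const [path, size] of samples) {
  console.log(path, shouldScan(path, size) ? "scan" : "skip");
}
```

Running with `--dry-run` remains the reliable way to see which files Paperize itself picks up.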
git clone https://github.com/Yesterday-AI/paperize.git
cd paperize
npm install
npm run build # esbuild: src/cli.jsx -> dist/cli.mjs
npm test # node --test src/logic/*.test.js
npm run lint # eslint src/
npm run format # prettier --write src/