Releases: egorfedorov/claude-context-optimizer
v4.3.0 — cache economics, /cco-overhead & a coach that knows when to shut up
Six upgrades
💰 Cache economics — price what actually bills
Session cost is now computed at real prompt-cache rates (cache reads 10%, writes 125% of input) from exact transcript usage — full-rate pricing overstated cost up to ~10×. The /cco board gains a Cache line with hit rate, $ saved vs uncached, and cache-break detection: every time a >5-min pause / mid-session CLAUDE.md edit / model switch re-writes your warm cache at 1.25×, you see it and its extra cost.
Cache ▓▓▓▓▓▓▓▓▓▓▓░ 93% hit · saved $90.36 vs uncached · 9 breaks (−$5.31)
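The pricing rule above can be sketched as follows. This is a minimal illustration, not the plugin's code: the usage field names follow the Anthropic API usage object and are assumed to be what the transcript carries, the $5/$25 prices are the Opus figures quoted later in these notes, and the hit-rate definition is an assumption.

```javascript
// USD per million tokens; the Opus 4.7/4.8 figures quoted in the v4.0.0 notes.
const PRICE = { input: 5, output: 25 };

// Multipliers from the release note: cache reads bill at 10% of the input
// rate, cache writes at 125%.
const CACHE_READ = 0.1;
const CACHE_WRITE = 1.25;

// Cost of a session at real prompt-cache rates.
function sessionCost(usage, price = PRICE) {
  const perInputToken = price.input / 1e6;
  const perOutputToken = price.output / 1e6;
  return (
    usage.input_tokens * perInputToken +
    usage.cache_read_input_tokens * perInputToken * CACHE_READ +
    usage.cache_creation_input_tokens * perInputToken * CACHE_WRITE +
    usage.output_tokens * perOutputToken
  );
}

// The same traffic priced as if nothing had been cached.
function uncachedCost(usage, price = PRICE) {
  const allInput =
    usage.input_tokens +
    usage.cache_read_input_tokens +
    usage.cache_creation_input_tokens;
  return (allInput * price.input + usage.output_tokens * price.output) / 1e6;
}

// Assumed definition: share of all input tokens served from cache.
function hitRate(usage) {
  const allInput =
    usage.input_tokens +
    usage.cache_read_input_tokens +
    usage.cache_creation_input_tokens;
  return allInput === 0 ? 0 : usage.cache_read_input_tokens / allInput;
}
```

The "saved vs uncached" figure would then be `uncachedCost(usage) - sessionCost(usage)`.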
📉 NEW /cco-overhead — the invisible per-session tax
Every session starts with a fixed payload (system prompt, tool schemas, MCP, agents, CLAUDE.md, memory) paid before you type a word. This audit measures it from ground truth, itemizes what's measurable locally, and tells you what to trim. Baseline cuts repay in every session.
🤫 Prompt Coach stops grading conversation
"спасибо, всё ок" ("thanks, all good") is no longer an F-prompt with "name the file you want changed" injected into context. The coach classifies chat / question / task, coaches only tasks, and goes easier on short follow-ups. It now also understands Russian, which fixes a latent bug: JS \b never matched Cyrillic, so RU verb detection silently didn't work.
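The \b problem is easy to reproduce. In JavaScript, \b is defined in terms of \w, which covers only ASCII letters, digits, and underscore, so no boundary ever exists next to a Cyrillic letter. The sketch below shows one common workaround using Unicode property escapes; the helper names and the sample verb are illustrative, not taken from the plugin.

```javascript
// \b needs a \w character on one side. Cyrillic letters are not \w,
// so this pattern can never match a Russian word.
const asciiBoundary = /\bисправь\b/i;

// Unicode-aware alternative: require that no letter, digit, or underscore
// sits directly before or after the word. \p{...} needs the `u` flag.
function wordRegex(word) {
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(
    `(?<![\\p{L}\\p{N}_])${escaped}(?![\\p{L}\\p{N}_])`,
    "iu"
  );
}

// Hypothetical check for the imperative "исправь" ("fix").
const hasFixVerb = (text) => wordRegex("исправь").test(text);
```

With this helper, `hasFixVerb("Исправь баг в utils.js")` matches, while a longer word that merely contains the verb does not.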
🛡 /cco-shield suggest|apply
Files wasted in 3+ sessions → ready .contextignore rules, applied with one command.
🎯 Self-calibrating estimates
Real transcript totals teach the token estimator each codebase's drift (EMA, clamped 0.5–2×).
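A clamped exponential moving average of this kind could look like the sketch below. The smoothing factor and the function names are assumptions; only the EMA idea and the 0.5–2× clamp come from the note above.

```javascript
const ALPHA = 0.2; // hypothetical smoothing factor

const clamp = (x, lo, hi) => Math.min(hi, Math.max(lo, x));

// Blend the newly observed actual/estimated ratio into the stored drift,
// then keep the result inside the 0.5x to 2x band.
function updateDrift(prevDrift, estimatedTokens, actualTokens, alpha = ALPHA) {
  if (estimatedTokens <= 0) return prevDrift; // nothing to learn from
  const observed = actualTokens / estimatedTokens;
  const next = alpha * observed + (1 - alpha) * prevDrift;
  return clamp(next, 0.5, 2);
}

// Apply the learned drift to a raw estimate.
function calibrate(rawEstimate, drift) {
  return Math.round(rawEstimate * drift);
}
```

The clamp keeps one unusual session (for example a huge generated file) from swinging later estimates too far.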
🤖 Delegation advisor
Long read-only exploration streaks in main context now trigger a one-time hint to fan the work out to a subagent instead.
Tests: 150 → 170. Full details in CHANGELOG.md.
Update
claude plugin update claude-context-optimizer@cco
v4.2.0 — ground truth
Real token counts (new)
The budget hook now reads exact API token usage from the Claude Code session transcript instead of estimating by character count. Budget %, warnings, and the /cco board show real context usage whenever the transcript is readable (real numbers drop the ~ prefix); estimation remains the fallback. New module: src/transcript-usage.js.
Fixed
- $NaN in every row of the /cco-digest cost table (tokens × MODEL_COSTS object), same class as the earlier roi/report fixes
- /cco-export (HTML) and /cco-claudemd overstated costs 3× (hardcoded legacy $15/M Opus price)
- Benchmark measured an empty file structure (contents passed where a path was expected), so the published savings were fabricated; results.json regenerated (honest total: 63%)
- Infinityx in /cco-roi at 100% waste
- Race: tracker + budget hooks ran in parallel on the same PostToolUse matcher and clobbered the shared notice ledger; now serialized in one command (same for SessionEnd tracker → dashboard)
- Race: two sessions finalizing at once lost global stats (last-writer-wins) — now guarded by an atomic mkdir-based file lock with stale-steal
- prompt-coach rewrote its whole log per prompt (O(n²), truncation risk) → true append
- Legacy read-cache entries could throw and silently disable caching for a file
- Session summaries showed UTC as wall-clock time
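The mkdir-based lock with stale-steal mentioned in the fixes above can be sketched like this. It is a minimal illustration under assumptions: the function names, the timeout, and the retry policy are hypothetical, and the steal step is not fully race-free (two processes could both decide the lock is stale).

```javascript
const fs = require("node:fs");

// Block the current thread for `ms` milliseconds without spinning.
function sleepSync(ms) {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

// mkdir is atomic: exactly one process can create the directory.
function tryAcquire(lockDir, staleMs) {
  try {
    fs.mkdirSync(lockDir);
    return true;
  } catch (err) {
    if (err.code !== "EEXIST") throw err;
  }
  // Lock is held. If it is older than staleMs, assume the holder died
  // and remove it so that the next attempt can succeed.
  try {
    const ageMs = Date.now() - fs.statSync(lockDir).mtimeMs;
    if (ageMs > staleMs) fs.rmdirSync(lockDir);
  } catch (err) {
    if (err.code !== "ENOENT") throw err; // released in the meantime
  }
  return false;
}

// Run fn while holding the lock; always release it afterwards.
function withLock(lockDir, fn, { staleMs = 10000, retries = 100 } = {}) {
  for (let attempt = 0; attempt < retries; attempt++) {
    if (tryAcquire(lockDir, staleMs)) {
      try {
        return fn();
      } finally {
        fs.rmdirSync(lockDir);
      }
    }
    sleepSync(20);
  }
  throw new Error(`could not acquire lock: ${lockDir}`);
}
```

A read-modify-write of the global stats file would go inside `fn`, so two sessions finalizing at once apply their updates one after the other.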
Performance
- No more synchronous multi-MB file reads on the hook hot path (>1MB files estimated from size)
- Read token estimates capped by real file size + extension-aware ratio (~4× more accurate)
- Per-session state bounded (budget filesLoaded 500, tracker search log 300)
Honesty
- "Session pulse" now goes through the notice ledger — respects the per-session noise cap and counts as overhead in NET savings
Docs & packaging
- Real CHANGELOG.md; README drift fixed (missing /cco-roi, file trees, stale versions here and in docs/index.html)
- sync-version.js now syncs marketplace.json + docs/index.html too
- POSIX-shell requirement for hooks documented (Git Bash / WSL on Windows)
Tests: 139 → 150, all green on Node 18/20/22.
Full changelog: https://github.com/egorfedorov/claude-context-optimizer/blob/main/CHANGELOG.md
v4.1.0 — honest NET savings + silent-by-default hooks
Answers "does it really save tokens, or just say it does?" — makes the optimizer honest and stops it from spending the context it protects.
Silent-by-default hooks
Per-tool-call hooks now stay quiet unless a message is actionable (critical budget → /compact always shown), gated by a per-session noise budget (src/notices.js), deduped per kind. Removed the "CCO makes your budget Nx effective" self-praise.
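One way to picture a per-session noise budget with per-kind dedup is the sketch below. All names, the cap of five, and the 4-characters-per-token overhead estimate are assumptions for illustration; the actual src/notices.js may be organized differently.

```javascript
// Per-session ledger of what the hooks have already said.
function createLedger(maxPerSession = 5) {
  return { maxPerSession, emitted: 0, seenKinds: new Set(), overheadTokens: 0 };
}

// Returns the text to show, or null when the notice should stay silent.
function offerNotice(ledger, { kind, text, critical = false }) {
  if (!critical) {
    if (ledger.seenKinds.has(kind)) return null; // already shown this kind
    if (ledger.emitted >= ledger.maxPerSession) return null; // budget spent
  }
  ledger.seenKinds.add(kind);
  ledger.emitted += 1;
  // Rough estimate of the context this message itself consumes.
  ledger.overheadTokens += Math.ceil(text.length / 4);
  return text;
}

// NET savings: what was saved minus what the notices cost, never negative.
function netSavings(savedTokens, ledger) {
  return Math.max(0, savedTokens - ledger.overheadTokens);
}
```

Critical notices bypass both checks here, matching the rule that a critical-budget /compact message is always shown.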
Honest NET savings
- Saved tokens counted by each file's real cached size — a re-read of a 30-line file no longer credits "18K saved".
- /cco shows NET = saved − the tokens CCO's own messages injected (net-0 if overhead exceeds savings).
Big-file map-then-load
First full read of a file >1500 lines returns its structural map once, so Claude reads the section it needs instead of ~14K+ tokens. Read again to load fully. One-shot, config-gated.
139/139 tests pass, clean exit. Full diff: v4.0.0...v4.1.0
v4.0.0 — Opus 4.8 + Context Control Center (CI fixed)
The v3.6.0 release was broken — CI ran to the 6-hour GitHub timeout and was cancelled. v4.0.0 fixes that, moves the plugin to Opus 4.8 with corrected pricing, and ships a flagship one-screen Control Center.
🔧 CI fix (the actual bug)
budget.js and prompt-coach.js called main() unconditionally on import, and main() reads process.stdin — so when the test suite imported them the process never exited (all assertions passed; it just hung). All six hook modules now guard main() with isMainModule(); added a regression test. node --test now exits cleanly — 128/128 pass, CI ~17s.
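A guard of this kind might look like the following sketch for a CommonJS hook. The release only names isMainModule(); the body shown here, the handle() helper, and the tool_name field are assumptions made for illustration.

```javascript
// True only when `mod` is the file Node was started with.
// `entry` is a parameter so the check can be exercised without spawning Node.
function isMainModule(mod, entry = require.main) {
  return entry === mod;
}

// Pure part of the hook: easy to import and test.
function handle(raw) {
  const event = raw.trim() ? JSON.parse(raw) : {};
  return JSON.stringify({ tool: event.tool_name ?? null });
}

// Impure part: waits on stdin until the caller closes it. If this ran on
// import, a test process that never closes stdin would never exit.
function main(input = process.stdin) {
  let raw = "";
  input.setEncoding("utf8");
  input.on("data", (chunk) => { raw += chunk; });
  input.on("end", () => { process.stdout.write(handle(raw)); });
}

// In the hook file itself, the last line would be:
//   if (isMainModule(module)) main();
```

Importing the file from a test then only defines functions, and the stdin listener is attached only when the hook runs as a script.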
💵 Opus 4.8 + corrected pricing
- Opus 4.7/4.8 are $5/$25 per MTok at a 1M context window, with no premium tier. Default model is now opus-4.8.
- Removed the fictional opus-4.7-1m $22.50/$112.50 "1M tier" (kept only as a back-compat alias → standard $5/$25 1M).
- Sonnet 4.6 is 1M ($3/$15); Haiku 4.5 stays 200K ($1/$5).
- Updated MODEL_COSTS, tests, README, skills, and the landing page.
🚀 Flagship — Context Control Center
- /cco: one screen with budget %, $ spent, tokens saved by cache (effectiveness multiplier), waste/cold files, last prompt grade, the active task's cost, and ready-to-run actions.
- /cco-task add | list | done: organize work by task with per-task token/$ attribution.
- Session-end Auto-Optimizer report ("CCO saved you X tokens this session").
Full diff: v3.6.0...v4.0.0
v3.6.0 — Opus 4.7 ready: Prompt Coach + Smart Pack
Headline
This release shifts focus from "save tokens on reads" to "save tokens AND make every prompt produce sharper output", and retunes the plugin for Claude Opus 4.7 including the 1M context tier.
New features
🎯 Prompt Coach — grade and improve every prompt
A UserPromptSubmit hook scores your prompt on specificity, scope, success criteria, and length. When the score is below 80, suggestions are silently injected so Claude works against a sharpened prompt — not a vague one.
[prompt-coach] Prompt quality: D (38/100).
Suggestions to make this prompt produce better results:
- Name the specific file(s), function(s), or module(s) you want changed.
- Bound the scope: instead of "all bugs / rewrite everything", pick one concrete failure.
- State the success condition: what tests pass? what error disappears?
Run /cco-coach to grade an arbitrary prompt or see prompt history.
Zero LLM calls. Pure local heuristics. <5ms latency.
📦 Smart Context Pack — optimal file set for your task
/cco-pack "refactor login flow to support OAuth"
Builds a ranked, token-budget-aware file list from mentioned paths + git diff + historical patterns + keyword match. Each file gets a suggested offset/limit based on structural landmarks. Stops at 25% of your effective context.
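A budget-aware selection of this shape can be sketched as a greedy loop. The scoring itself is out of scope here: each candidate is assumed to arrive with a precomputed score and token estimate, and the names are hypothetical.

```javascript
// Pick the highest-scoring files until the pack would exceed its share
// of the effective context (25% by default, per the description above).
function buildPack(candidates, effectiveContextTokens, share = 0.25) {
  const budget = Math.floor(effectiveContextTokens * share);
  const ranked = [...candidates].sort((a, b) => b.score - a.score);
  const pack = [];
  let used = 0;
  for (const file of ranked) {
    if (used + file.tokens > budget) break; // stop at the budget
    pack.push(file);
    used += file.tokens;
  }
  return { pack, used, budget };
}
```

Stopping at the first file that does not fit keeps the ranking strict. Skipping it and continuing would fill the budget more tightly at the cost of including lower-ranked files.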
💡 1M context tier
/cco-budget model opus-4.7-1m
Switching models retunes the entire plugin: Read Cache staleness thresholds scale (100K/40-files/10min vs 20K/8/10min), cost calculations use 1M-tier prices ($22.5/$112.5 per M), warnings fire at the right percentages.
💰 Real cost tracking
Budget monitor now tracks input AND output tokens separately and uses real per-model prices. Reported cost = actual cost.
🔌 MCP tool tracking
PostToolUse matchers now include mcp__.* — Linear, Slack, GitHub, Postgres et al. show up in token reports.
🩺 Doctor
/cco-doctor
One-second health check: version sync, hooks wiring, data dir, model config, node version.
Plumbing & fixes
- Atomic JSON writes: prevents patterns.json corruption when concurrent hooks finalize.
- Unified file classification: single source of truth in utils.js (was duplicated across 3 files).
- Extended TOKEN_RATIOS: added .svelte, .vue, .sh, .php, .lua, .ex, .clj, .scala, .swift, .dart and ~15 more.
- ContextShield false-positive fix: useful read-only files (schemas, type defs) no longer get spurious warnings.
- PPID list bounded to last 20 entries in Read Cache (memory safety).
- CCO_QUIET=1 / CI=true suppresses donation banner in machine outputs.
- Plugin / package version sync: scripts/sync-version.js + prepublishOnly hook prevents the recurring stale-manifest issue.
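The usual way to make a JSON write atomic is to write a temporary file next to the target and rename it into place. The sketch below shows that pattern; the function name and the temp-file naming scheme are assumptions, not the plugin's actual code.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Write the full payload to a sibling temp file, then rename it over the
// target. A rename within one filesystem replaces the file in a single
// step on POSIX, so readers see either the old content or the new content,
// never a half-written file.
function writeJsonAtomic(file, data) {
  const tmp = path.join(
    path.dirname(file),
    `.${path.basename(file)}.${process.pid}.${Date.now()}.tmp`
  );
  fs.writeFileSync(tmp, JSON.stringify(data, null, 2));
  fs.renameSync(tmp, file);
}
```

This protects against torn files. It does not merge concurrent updates, so the last rename still wins unless writers also take a lock.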
Models supported
| Model | Input/M | Output/M | Window |
|---|---|---|---|
| haiku-4.5 | $1 | $5 | 200K |
| sonnet-4.6 | $3 | $15 | 200K |
| opus-4.7 | $15 | $75 | 200K |
| opus-4.7-1m | $22.5 | $112.5 | 1M |
Tests
138 subtests, 0 failures. New coverage: model costs, effective budget, prompt-coach scoring, budget input/output split, unified classification, extended token ratios.
Upgrade
claude plugin update claude-context-optimizer@egorfedorov-plugins
# or in a manual clone:
git pull && npm test
Run /cco-doctor after upgrading to confirm everything is wired up.
v3.5.0 — Benchmark Proof, ROI Calculator, Live Stats & CI/CD
What's New
Benchmarked Proof of Savings
Run npm run benchmark — 63% fewer tokens across 7 reproducible scenarios:
| Scenario | Savings |
|---|---|
| Read Cache Dedup (JS) | 50% |
| Read Cache (5x reads) | 79% |
| File Digest vs Full Read (JS) | 99% |
| Read Cache Dedup (JSON) | 50% |
| Contextignore Block (lockfile) | 100% |
| File Digest vs Full Read (MD) | 99% |
| Typical Session (10 files, 45 min) | 43% |
/cco-roi — ROI Calculator
Personalized savings report based on your actual session data:
- Monthly/yearly dollar savings per model (Haiku, Sonnet, Opus)
- Effective context multiplier (e.g. "1.7x — your 200K context behaves like 340K")
- Team ROI projections
Live Session Pulse
Every 15 tool calls, a compact one-liner appears:
[cco] Session pulse: 45K used, 18K saved by CCO (28% efficiency)
Effective Budget Multiplier
At 50%+ budget usage:
[context-budget] CCO makes your budget 1.4x more effective (18K tokens saved)
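The numbers in the two sample lines above (45K used, 18K saved, 28%, 1.4x) are consistent with the formulas below. This reading is inferred from the samples, not stated by the release, so treat the definitions as an assumption.

```javascript
// Without CCO the session would have loaded used + saved tokens.
function effectiveness(usedTokens, savedTokens) {
  const wouldHaveUsed = usedTokens + savedTokens;
  return {
    multiplier: wouldHaveUsed / usedTokens, // how far the budget stretches
    efficiency: savedTokens / wouldHaveUsed, // share of load avoided
  };
}

const { multiplier, efficiency } = effectiveness(45_000, 18_000);
console.log(`${multiplier.toFixed(1)}x, ${Math.floor(efficiency * 100)}%`);
// prints "1.4x, 28%"
```

The same multiplier explains the ROI example: a 1.7x multiplier on a 200K window gives 340K of effective context.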
CI/CD
- GitHub Actions CI: tests on Node 18, 20, 22 on every push/PR
- npm publish workflow: auto-publishes on GitHub release
Redesigned Landing Page
Data-driven landing page with benchmark table, ROI calculations, token accumulation chart, and effective context multiplier. Inspired by best practices from tamp.dev.
Updated Model Prices
Prices now reflect Claude Haiku 4.5, Sonnet 4.6, Opus 4.6.
Full Changelog: v3.4.0...v3.5.0
v3.4.0: Read Cache v2.0 — Smart Context-Aware Blocking
🔧 Critical Fix: Blocked reads no longer leave AI blind
Read Cache v1.1 had a fundamental flaw — once a file was fully read, all subsequent reads were permanently blocked, including offset/limit requests. The block message suggested "use offset/limit" but that didn't work either since the full range already covered every sub-range. This forced the AI into blind Grep workarounds without surrounding context, degrading code quality.
What's New in Read Cache v2.0
📋 File Structure Digest
When a read is blocked, instead of a useless error message, you now get a navigational map of the file:
⛔ [read-cache] Already loaded utils.js (2000 lines, ~18.4K tokens). File unchanged.
📋 File map (2000 lines):
8: imports (8–10)
45: function estimateTokens()
63: function displayPath()
103: function loadConfig()
157: function computeUsefulness()
→ Example: Read with offset=102, limit=50 to see function loadConfig()
~100 tokens instead of ~18K for re-reading. The AI knows exactly where to jump.
⏰ Staleness Detection
Automatically allows re-reads when the file's context has likely been evicted:
- Token displacement: 20K+ tokens of other files loaded since this one
- File displacement: 8+ other files read since this one
- Time decay: 10+ minutes since last read
No more permanent blocks on files that were pushed out of active context.
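The three staleness triggers can be expressed as one predicate. The thresholds are the ones listed above; the shape of the cache entry and session counters is hypothetical.

```javascript
// Thresholds from the list above.
const STALE = { tokens: 20_000, files: 8, minutes: 10 };

// entry: snapshot of session counters taken when the file was last read.
// session: the same counters as they stand now.
function isStale(entry, session, now = Date.now(), limits = STALE) {
  const tokensSince = session.totalTokensLoaded - entry.tokensLoadedAtRead;
  const filesSince = session.filesReadCount - entry.filesReadAtRead;
  const minutesSince = (now - entry.readAt) / 60_000;
  return (
    tokensSince >= limits.tokens ||
    filesSince >= limits.files ||
    minutesSince >= limits.minutes
  );
}
```

Any single trigger is enough: a re-read is allowed as soon as the file has probably been pushed out of active context.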
🌐 Multi-Language Digest
File structure parser supports: JS/TS, Svelte/Vue, Python, Go, Rust, C++, JSON, YAML
Stats
- 85 tests, all passing (21 new)
- New module: src/file-digest.js, a reusable structural parser
- read-cache.js upgraded from v1.1 → v2.0
v3.3.0: .contextignore, Auto-Compact, Session Replay
Three New Features
1. .contextignore — block files you never need
Create a .contextignore in your project root (works like .gitignore) to permanently block wasteful reads:
package-lock.json
yarn.lock
*.min.js
dist/**
node_modules/**
🚫 [contextignore] package-lock.json matches pattern "package-lock.json" in .contextignore.
Use Grep to search inside, or remove the pattern from .contextignore to allow reading.
- Project-level (.contextignore) + global (~/.claude/.contextignore)
- Supports globs: *.lock, dist/**, *.min.js, *.generated.*
- Copy .contextignore.example from the plugin to get started
- Zero dependencies: pure Node.js pattern matching
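Dependency-free glob matching of the kind described above can be sketched in a few lines. This is a simplified illustration and not a full .gitignore implementation: it has no negation or anchoring rules, and the function names are hypothetical.

```javascript
// Translate a glob into an anchored RegExp:
//   **  matches across directories, *  stays within one path segment.
function globToRegExp(glob) {
  let out = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*" && glob[i + 1] === "*") {
      out += ".*";
      i++; // consume the second star
    } else if (ch === "*") {
      out += "[^/]*";
    } else if (ch === "?") {
      out += "[^/]";
    } else {
      out += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp(`^${out}$`);
}

// Patterns without a slash are tested against the file name only;
// patterns with a slash are tested against the project-relative path.
function isIgnored(relPath, patterns) {
  const baseName = relPath.split("/").pop();
  return patterns.some((pattern) => {
    const target = pattern.includes("/") ? relPath : baseName;
    return globToRegExp(pattern).test(target);
  });
}
```

For example, `isIgnored("src/app.min.js", ["*.min.js"])` is true because the pattern has no slash and is matched against `app.min.js`.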
2. Auto-Compact — automatic context cleanup
When your budget hits 80%, the plugin actively tells Claude to run /compact — not just a warning. At 90%, it becomes urgent.
[context-budget] ⚡ Auto-compact recommended — 80% budget used.
[context-budget] 🔴 Critical: 90% budget used. Run /compact immediately.
- Toggle: /cco-budget auto on or /cco-budget auto off
- Configurable thresholds in budget-config.json
- Smart rate-limiting: won't spam every tool call
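The threshold-plus-rate-limit behavior could be sketched as below. The 80 and 90 defaults come from the description above; the config key names, the cooldown of ten tool calls, and the escalation rule are assumptions, since the release does not document how the rate limit works.

```javascript
// Hypothetical config keys; the real budget-config.json may differ.
const DEFAULTS = { recommendAt: 80, criticalAt: 90, minCallsBetween: 10 };

// Call once per tool call. Returns "recommend", "critical", or null.
function compactNotice(percentUsed, state, cfg = DEFAULTS) {
  state.calls = (state.calls ?? 0) + 1;
  const level =
    percentUsed >= cfg.criticalAt ? "critical" :
    percentUsed >= cfg.recommendAt ? "recommend" :
    null;
  if (level === null) return null;

  // Always speak when first crossing into critical; otherwise wait for
  // the cooldown so the same message is not repeated on every call.
  const escalated = level === "critical" && state.lastLevel !== "critical";
  const cooledDown =
    state.lastAt === undefined ||
    state.calls - state.lastAt >= cfg.minCallsBetween;
  if (!escalated && !cooledDown) return null;

  state.lastLevel = level;
  state.lastAt = state.calls;
  return level;
}
```

The caller would map "recommend" and "critical" to the two [context-budget] messages shown above.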
3. Session Replay — pick up where you left off
Every session auto-generates a summary. Run /cco-replay to see recent sessions:
[1] Session Mar 24 14:30 (12 min)
Edited: src/read-cache.js, src/utils.js, README.md (3 files)
Context: 45K tokens, 12 files read, 28% waste
No need to re-read files or guess what happened in the last session.
Stats
- 12 files changed, +845 -20 lines
- 4 new files: src/contextignore.js, src/replay.js, skills/cco-replay/SKILL.md, .contextignore.example
- 66 tests passing (21 new)
- Zero breaking changes
Full Feature List (v3.3.0)
- Smart Read Cache (PPID-aware)
- .contextignore file blocking
- ContextShield waste prevention
- Auto-compact at 80% budget
- Session replay summaries
- Project Anatomy
- CLAUDE.md Analyzer
- HTML Dashboard
- Efficiency Digest (S/A/B/C/D/F)
- Token Budget tracking
- Git-aware suggestions
- Context Templates
- Smart Loader
- Confidence Learning
v3.2.0: Friendly UX + Read-Cache PPID Tracking
What's New
Friendly UX — all messages rewritten
Every user-facing message across 11 files has been rewritten to be friendlier and less intimidating:
- ⛔ scary block icon → 💾💤🔄💡🎯 friendly icons
- "was WASTED / tokens burned" → "went unused"
- "Consider: do you really need the full file?" → "Heads up: Try Grep to grab just what you need"
- Encouraging session-end messages, pro-tips, and warmer tone throughout
Read-Cache v1.1 — Agent subprocess fix
Bug fix: Agent subprocess reads no longer block reads in the main conversation.
When you use the Agent/Explore tool, the agent reads files into its own context — but previously, the read-cache would then block the main conversation from reading those same files. Now the cache tracks process.ppid to distinguish process contexts. Different context = different cache namespace.
Smart hint messages
- Full-file re-read blocked → "Tip: use offset/limit to read a different section"
- Partial re-read blocked → "This section is already loaded"
- No more misleading "use offset/limit" when the entire file is already cached
Files Changed
- src/read-cache.js: v1.1 with PPID tracking + smart hints
- src/context-shield.js: friendlier waste warnings
- src/budget.js: softer budget notifications
- src/tracker.js: warm heatmap output + session-end messages
- src/report.js: encouraging recommendations
- src/digest.js: supportive tips and feedback
- src/utils.js: updated donation message
- src/claudemd-analyzer.js: friendlier analysis output
- skills/cco/SKILL.md, skills/cco-shield/SKILL.md: warmer empty-state messages
- README.md: updated examples + agent tracking docs
Stats
- 11 files changed, +77 -52 lines
- 45 tests passing
- Zero breaking changes
v3.1.0 — Smart Read Cache & Project Anatomy
What's New
Smart Read Cache — the killer feature
PreToolUse hook that blocks redundant file reads automatically. If Claude already loaded a file and it hasn't changed (mtime check), the read is blocked — saving 100% of those tokens.
- First read: always allowed
- File modified since last read: allowed
- Different section (offset/limit): allowed if not already covered
- Same file, same range, unchanged: blocked
Typical savings: 30-60% fewer tokens per session.
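The four rules above reduce to a small decision function. The cache entry shape (an mtime plus a list of already-loaded line ranges) is an assumption made for this sketch.

```javascript
// entry: { mtimeMs, ranges: [[startLine, endLine], ...] } or undefined.
// A full-file read is modelled as offset 0 with an unbounded limit.
function decideRead(entry, mtimeMs, offset = 0, limit = Infinity) {
  if (!entry) return "allow"; // first read
  if (mtimeMs !== entry.mtimeMs) return "allow"; // file changed on disk
  const start = offset;
  const end = offset + limit;
  // Block only when one already-loaded range fully contains the request.
  const covered = entry.ranges.some(([s, e]) => s <= start && end <= e);
  return covered ? "block" : "allow";
}
```

A request that extends past every cached range is allowed even if most of it overlaps, which errs on the side of letting Claude see the lines it asked for.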
Project Anatomy
New /cco-anatomy command generates a compact codebase map — every file with line count, token estimate, and category. Claude reads one file instead of opening twenty to understand the project.
Honest README
- Renamed "Interactive Dashboard" → "HTML Dashboard Export" (it generates a static HTML file, not a live dashboard)
- Updated all feature descriptions to accurately reflect what they do
- Added new features to flow diagram, commands table, and plugin structure
Technical
- src/read-cache.js: new PreToolUse hook, runs before ContextShield
- src/anatomy.js: project anatomy generator using git ls-files
- skills/cco-anatomy/SKILL.md: new skill
- utils.js: added READ_CACHE_DIR export
- hooks.json: read-cache hook added to PreToolUse pipeline