|
1 | | -# agenttop |
| 1 | +<p align="center"> |
| 2 | + <img src="assets/logo.png" alt="agenttop" width="120"> |
| 3 | +</p> |
2 | 4 |
|
3 | | -`htop` for AI coding agents. |
| 5 | +<h1 align="center">agenttop</h1> |
4 | 6 |
|
5 | | -```bash |
6 | | -git clone https://github.com/vicarious11/agenttop && cd agenttop && ./setup.sh |
7 | | -./run.sh # localhost:8420 |
8 | | -``` |
| 7 | +<p align="center"> |
| 8 | + <b>See where your AI coding tokens and money actually go.</b> |
| 9 | + <br> |
| 10 | + <sub>htop for Claude Code, Cursor, Kiro, Codex, and Copilot.</sub> |
| 11 | +</p> |
9 | 12 |
|
10 | | - |
| 13 | +<p align="center"> |
| 14 | + <a href="#install">Install</a> · |
| 15 | + <a href="#what-you-see">Screenshots</a> · |
| 16 | + <a href="#features">Features</a> · |
| 17 | + <a href="#how-it-works">How it works</a> · |
| 18 | + <a href="#ai-analysis">AI Analysis</a> · |
| 19 | + <a href="#architecture">Architecture</a> |
| 20 | +</p> |
11 | 21 |
|
12 | | -Monitors **Claude Code**, **Cursor**, **Kiro**, **Codex**, **Copilot**. Reads the local files they already write (`~/.claude/`, `~/.cursor/`, etc). Read-only. Nothing leaves your machine. |
| 22 | +<p align="center"> |
| 23 | + <img src="https://img.shields.io/badge/python-3.10%2B-blue" alt="Python"> |
| 24 | + <img src="https://img.shields.io/github/license/vicarious11/agenttop" alt="License"> |
| 25 | + <img src="https://img.shields.io/badge/tools-5%20supported-green" alt="Tools"> |
| 26 | + <img src="https://img.shields.io/badge/telemetry-zero-brightgreen" alt="No telemetry"> |
| 27 | +</p> |
13 | 28 |
|
14 | | -## what it does |
| 29 | +--- |
15 | 30 |
|
16 | | -- unified dashboard across all your AI coding tools |
17 | | -- every session, every prompt, every token, every dollar — one place |
18 | | -- search sessions by project, sort by cost, view full prompt history |
19 | | -- AI analysis: scores you 0-100 on session hygiene, prompt quality, cost efficiency, cache usage, tool utilization |
20 | | -- cost forensics: spend by project, by model, estimated waste from marathon sessions |
21 | | -- detects anti-patterns: correction spirals, context blowup, repeated prompts, model overkill |
| 31 | + |
22 | 32 |
|
23 | | -## install |
| 33 | +> Every AI coding tool stores usage data locally — JSONL logs, SQLite databases, workspace state — but none of them show you the full picture. agenttop reads all of it, normalizes it, and gives you a real-time dashboard with AI-powered analysis. 5 tools. One view. Nothing leaves your machine. |
| 34 | +
|
| 35 | +--- |
| 36 | + |
| 37 | +## Install |
24 | 38 |
|
25 | 39 | ```bash |
26 | 40 | git clone https://github.com/vicarious11/agenttop && cd agenttop && ./setup.sh |
27 | 41 | ``` |
28 | 42 |
|
29 | | -or `pip install agenttop` |
30 | | - |
31 | | -## run |
| 43 | +That's it. `setup.sh` handles Python, the venv, and dependencies. Then: |
32 | 44 |
|
33 | 45 | ```bash |
34 | | -./run.sh # web dashboard |
35 | | -.venv/bin/agenttop # terminal dashboard |
36 | | -agenttop init # set up LLM for analysis (ollama/anthropic/openai) |
| 46 | +source .venv/bin/activate |
| 47 | +agenttop # terminal dashboard |
| 48 | +agenttop web # web dashboard at localhost:8420 |
| 49 | +agenttop stats # quick CLI summary |
| 50 | +agenttop init # configure LLM for AI analysis |
37 | 51 | ``` |
38 | 52 |
|
39 | | -## data sources |
| 53 | +Requirements: Python 3.10+. No Docker. No API keys needed. macOS, Linux, Windows. |
| 54 | + |
| 55 | +Keyboard: `d` dashboard · `s` sessions · `e` explorer · `a` analysis · `k` graph · `1-4` time range · `q` quit |
| 56 | + |
| 57 | +## What You See |
| 58 | + |
| 59 | +### Terminal Dashboard |
40 | 60 |
|
41 | 61 | ``` |
42 | | -~/.claude/projects/**/*.jsonl exact token counts per message |
43 | | -~/.cursor/ai-tracking/*.db conversations, models, AI vs human ratio |
44 | | -~/.codex/.codex-global-state.json prompts, automations |
45 | | -~/.config/github-copilot/ session state |
46 | | -~/Library/.../Kiro/state.vscdb workspace data |
| 62 | +All time 17.8M tok $687 cost 265 sess 5.6K msgs 5 tools 87% cache |
| 63 | +
|
| 64 | +COST BY PROJECT COST BY MODEL |
| 65 | +apex-trading-engine ████████ $284 opus-4-6 ████████████ $412 |
| 66 | +vaultkeeper █████ $148 sonnet-4-6 █████ $198 |
| 67 | +phantom-search ████ $97 haiku-4-5 ██ $76 |
| 68 | +neon-ui ██ $63 |
| 69 | +dataweave ██ $51 |
| 70 | +
|
| 71 | +DAILY COST (30d) ACTIVITY BREAKDOWN |
| 72 | +▁▃▅▇█▇▅▃▁▂▄▆█▇▅▃▁▂▅▇█▇▅▃▁▃▅▇ coding ████████ 42% |
| 73 | +total $687 avg $23/day peak $45 debugging ████ 21% |
| 74 | + testing ███ 15% |
| 75 | +TOOLS exploration ██ 9% |
| 76 | +● Claude Code 180 sess $469 |
| 77 | +● Cursor 45 sess $107 ONE-SHOT RATE |
| 78 | +● Kiro 20 sess $53 87% ██████████████████░░ |
| 79 | +● Codex 12 sess $53 edits that pass first try |
| 80 | +● Copilot 8 sess $5 higher = better prompting |
47 | 81 | ``` |
48 | 82 |
|
49 | | -## architecture |
| 83 | +Six panels. No plotext. Pure Rich text rendering. Data is computed from actual tool calls where the tool records them (Claude Code), with a prompt-keyword fallback for the other tools. |
50 | 84 |
|
| 85 | +### Web Dashboard |
| 86 | + |
| 87 | +Three tabs: **Overview** · **Sessions** · **Analyze** |
| 88 | + |
| 89 | +- **Overview** — force-directed knowledge graph (D3), model usage (input/output/cache), hourly activity, cost breakdown, workflow intelligence |
| 90 | +- **Sessions** — full-page browser with Google-style pagination. Search by project or prompt. Sort by cost, time, tokens. Click any session to see complete prompt history |
| 91 | +- **Analyze** — select sessions (All / Last 10 / Top Cost), run LLM analysis, get a deep-dive report with score, grades, cost forensics by project and model, anti-patterns, recommendations with estimated savings |
| 92 | + |
| 93 | +Keyboard: `o` overview · `s` sessions · `a` analyze. URL hash routing (`#sessions`, `#analyze`) for deep links. |
| 94 | + |
| 95 | +## Features |
| 96 | + |
| 97 | +### Data Extraction |
| 98 | + |
| 99 | +| Tool | Data Source | What agenttop extracts | |
| 100 | +|------|------------|----------------------| |
| 101 | +| **Claude Code** | `~/.claude/projects/**/*.jsonl` | Exact per-message token counts (input, output, cache read, cache create). Per-message model ID. **Every tool call name** (Edit, Bash, Read, Grep, Agent, Write — extracted from `tool_use` content blocks). Up to 50 user prompts per session. Project path from `cwd` field. Cost from per-model pricing. | |
| 102 | +| **Cursor** | `~/.cursor/ai-tracking/ai-code-tracking.db` | Conversations from SQLite. Source type (tab/composer/chat). AI vs human code ratio from `scored_commits`. Model per code hash. Project resolution via `ide_state.json` workspace mapping. | |
| 103 | +| **Kiro** | `~/Library/.../Kiro/User/globalStorage/state.vscdb` | Session data from VS Code state DB. Keys matching `kiro%`, `chat%`, `session%` patterns. Message counts and timestamps. | |
| 104 | +| **Codex** | `~/.codex/` | Prompt history from `.codex-global-state.json`. Session files from `sessions/` rollouts. Automation data from SQLite. Config (model, reasoning effort). | |
| 105 | +| **Copilot** | `~/.config/github-copilot/session-state/` | Per-session JSON with message content. Model extraction. Custom agent detection. Token estimation from content length. | |
| 106 | + |
| 107 | +All read-only. agenttop never modifies your tool data. |
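As a rough illustration of the Claude Code row above, the sketch below tallies tool names from `tool_use` content blocks in JSONL lines. The record layout (`message.content` as a list of blocks, each with `type` and `name`) and the helper name `count_tool_calls` are assumptions made for this example, not agenttop's actual collector code.

```python
import json
from collections import Counter
from typing import Iterable


def count_tool_calls(jsonl_lines: Iterable[str]) -> Counter:
    """Tally tool names found in `tool_use` content blocks (hypothetical helper)."""
    counts: Counter = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if not isinstance(record, dict):
            continue
        # Assumed layout: {"message": {"content": [ {...block...}, ... ]}}
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no tool calls
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                counts[block.get("name", "unknown")] += 1
    return counts


sample = [
    '{"message": {"content": [{"type": "tool_use", "name": "Edit"}]}}',
    '{"message": {"content": "plain text prompt"}}',
    '{"message": {"content": [{"type": "tool_use", "name": "Read"}, {"type": "tool_use", "name": "Edit"}]}}',
    "not json",
]
tool_breakdown = dict(count_tool_calls(sample))
```

The function only reads strings it is given, so it stays read-only with respect to the tool's files; malformed lines are skipped rather than raising.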
| 108 | + |
| 109 | +### Activity Classification |
| 110 | + |
| 111 | +Deterministic. No LLM. Classified from **actual tool call data** when available (Claude Code); falls back to prompt keywords for other tools. |
| 112 | + |
| 113 | +| Activity | How it's detected | |
| 114 | +|----------|------------------| |
| 115 | +| **coding** | Edit, Write, MultiEdit tool calls | |
| 116 | +| **debugging** | Bug/error/fix keywords in prompts + Edit/Bash patterns | |
| 117 | +| **testing** | Bash calls with pytest/jest/vitest/cargo test | |
| 118 | +| **exploration** | Read, Grep, Glob calls without edits | |
| 119 | +| **refactoring** | Refactor/rename/extract keywords + Edit patterns | |
| 120 | +| **git ops** | Bash calls with git commands | |
| 121 | +| **planning** | EnterPlanMode, TaskCreate, Agent tool calls | |
| 122 | +| **other** | Everything else | |
| 123 | + |
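The table above can be read as an ordered rule list. The sketch below is one possible reading, not agenttop's implementation: the rule order, the exact keyword patterns, and the `classify_turn` signature (a prompt plus `(tool_name, argument)` pairs) are all assumptions chosen for illustration.

```python
import re

EDIT_TOOLS = {"Edit", "Write", "MultiEdit"}
READ_TOOLS = {"Read", "Grep", "Glob"}
PLAN_TOOLS = {"EnterPlanMode", "TaskCreate", "Agent"}

# Keyword patterns are illustrative guesses based on the table.
TEST_CMD = re.compile(r"\b(pytest|jest|vitest|cargo test)\b")
GIT_CMD = re.compile(r"^\s*git\s")
DEBUG_WORDS = re.compile(r"\b(bug|error|fix)\b", re.IGNORECASE)
REFACTOR_WORDS = re.compile(r"\b(refactor|rename|extract)\b", re.IGNORECASE)


def classify_turn(prompt: str, tool_calls: list[tuple[str, str]]) -> str:
    """Classify one turn from its tool calls, using prompt keywords as a tiebreaker."""
    names = {name for name, _ in tool_calls}
    bash_cmds = [arg for name, arg in tool_calls if name == "Bash"]

    # Most specific signals first (assumed precedence).
    if any(TEST_CMD.search(cmd) for cmd in bash_cmds):
        return "testing"
    if any(GIT_CMD.search(cmd) for cmd in bash_cmds):
        return "git ops"
    if names & PLAN_TOOLS:
        return "planning"

    has_edit = bool(names & EDIT_TOOLS)
    if DEBUG_WORDS.search(prompt) and (has_edit or bash_cmds):
        return "debugging"
    if REFACTOR_WORDS.search(prompt) and has_edit:
        return "refactoring"
    if has_edit:
        return "coding"
    if names & READ_TOOLS:
        return "exploration"  # reads without any edits
    return "other"
```

Because every rule is a set lookup or a regex match, the same input always yields the same label, which is what makes the classification deterministic.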
| 124 | +### One-Shot Success Rate |
| 125 | + |
| 126 | +Percentage of edit turns that pass without retry. Detects `Edit -> correction prompt -> Edit` retry cycles in your prompt history. A higher percentage means better prompting and fewer wasted tokens. |
| 127 | + |
| 128 | +When `tool_breakdown` is available (Claude Code), uses actual Edit/Write call counts. Falls back to prompt analysis for other tools. |
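A minimal sketch of the `Edit -> correction prompt -> Edit` detection, assuming each turn is reduced to a `(prompt, made_edit)` pair. The correction keyword list and the `one_shot_rate` helper are hypothetical; the real heuristics may differ.

```python
import re

# Hypothetical markers of a correction prompt.
CORRECTION = re.compile(
    r"\b(no|wrong|still|again|didn't work|doesn't work|not what)\b",
    re.IGNORECASE,
)


def one_shot_rate(turns: list[tuple[str, bool]]) -> float:
    """Percentage of edit turns not followed by a correction prompt plus another edit."""
    edit_turns = 0
    retried = 0
    for i, (_, edited) in enumerate(turns):
        if not edited:
            continue
        edit_turns += 1
        if i + 1 < len(turns):
            next_prompt, next_edited = turns[i + 1]
            # Edit -> correction prompt -> Edit counts as a retry of turn i.
            if next_edited and CORRECTION.search(next_prompt):
                retried += 1
    if edit_turns == 0:
        return 0.0
    return 100.0 * (edit_turns - retried) / edit_turns


turns = [
    ("add retry logic", True),
    ("that's still failing, try again", True),
    ("now add tests", True),
    ("explain the module", False),
]
rate = one_shot_rate(turns)
```

Here the first edit is retried once, so two of the three edit turns count as one-shot.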
| 129 | + |
| 130 | +### Cost Analysis |
| 131 | + |
| 132 | +- **Cost by project** — which project burns the most money, with session count |
| 133 | +- **Cost by model** — opus vs sonnet vs haiku spend, computed from actual per-model pricing (input/output/cache rates) |
| 134 | +- **Daily cost sparkline** — 30-day unicode trend with total, average, and peak |
| 135 | +- **Cache hit rate** — from actual `cacheReadInputTokens` vs `inputTokens` in Claude Code data |
| 136 | + |
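The two calculations above can be sketched as follows. Two things here are assumptions: the cache hit rate is taken as cache-read tokens over cache-read plus fresh input tokens, and the rate table holds placeholder numbers for a made-up model, not real prices.

```python
def cache_hit_rate(cache_read_tokens: int, input_tokens: int) -> float:
    """Share of prompt tokens served from cache, in percent (assumed definition)."""
    total = cache_read_tokens + input_tokens
    return 100.0 * cache_read_tokens / total if total else 0.0


# Placeholder USD rates per million tokens for a hypothetical model.
PLACEHOLDER_RATES = {
    "model-a": {"input": 10.0, "output": 30.0, "cache_read": 1.0},
}


def message_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    cache_read_tokens: int,
    rates: dict = PLACEHOLDER_RATES,
) -> float:
    """Cost of one message in USD from per-model input/output/cache rates."""
    r = rates[model]
    return (
        input_tokens * r["input"]
        + output_tokens * r["output"]
        + cache_read_tokens * r["cache_read"]
    ) / 1_000_000


hit_rate = cache_hit_rate(cache_read_tokens=870, input_tokens=130)
cost = message_cost("model-a", input_tokens=1000, output_tokens=500, cache_read_tokens=10_000)
```

Summing `message_cost` per project or per model gives the two breakdowns; keeping rates in a table keyed by model ID is what lets a session that mixes models be priced message by message.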
| 137 | +### Session Data Model |
| 138 | + |
| 139 | +Each session stores: |
| 140 | + |
| 141 | +```python |
| 142 | +Session( |
| 143 | + tool_breakdown={"Edit": 5, "Bash": 3, "Read": 12, "Grep": 4}, # actual tool calls |
| 144 | + models_used={"claude-opus-4-6": 8, "claude-sonnet-4-6": 12}, # per-message model |
| 145 | + prompts=["fix the race condition in...", ...], # up to 50 |
| 146 | + total_tokens=48291, # exact for Claude, estimated for others |
| 147 | + estimated_cost_usd=12.47, # per-model pricing |
| 148 | + message_count=23, |
| 149 | + tool_call_count=24, |
| 150 | + # + id, tool, project, start_time, end_time |
| 151 | +) |
51 | 152 | ``` |
52 | | - ┌──────────────────────────────────────────────┐ |
53 | | - │ YOUR MACHINE (read-only) │ |
54 | | - │ │ |
55 | | - │ ~/.claude/ ~/.cursor/ ~/.codex/ ... │ |
56 | | - └──────┬───────────┬────────────┬──────────────┘ |
57 | | - │ │ │ |
58 | | - ▼ ▼ ▼ |
59 | | - ┌──────────────────────────────────────────────┐ |
60 | | - │ COLLECTORS │ |
61 | | - │ │ |
62 | | - │ ClaudeCodeCollector → JSONL parser │ |
63 | | - │ CursorCollector → SQLite + workspace │ |
64 | | - │ KiroCollector → VS Code state DB │ |
65 | | - │ CodexCollector → JSON + SQLite │ |
66 | | - │ CopilotCollector → session JSON │ |
67 | | - │ │ |
68 | | - │ Each: collect_sessions() → list[Session] │ |
69 | | - │ get_stats(days) → ToolStats │ |
70 | | - └──────────────────┬───────────────────────────┘ |
71 | | - │ |
72 | | - ┌────────────┴────────────┐ |
73 | | - ▼ ▼ |
74 | | - ┌─────────────┐ ┌─────────────┐ |
75 | | - │ WEB (D3) │ │ TUI (term) │ |
76 | | - │ port 8420 │ │ textual │ |
77 | | - │ │ │ │ |
78 | | - │ FastAPI │ │ 5 tabs: │ |
79 | | - │ WebSocket │ │ dashboard │ |
80 | | - │ 3 tabs: │ │ sessions │ |
81 | | - │ overview │ │ explorer │ |
82 | | - │ sessions │ │ analysis │ |
83 | | - │ analyze │ │ graph │ |
84 | | - └──────┬──────┘ └──────────────┘ |
85 | | - │ |
86 | | - ▼ |
87 | | - ┌──────────────────────────────────────┐ |
88 | | - │ OPTIMIZER (map-reduce-generate) │ |
89 | | - │ │ |
90 | | - │ MAP: per-session LLM calls │ |
91 | | - │ (cached, concurrent) │ |
92 | | - │ intent, spirals, quality │ |
93 | | - │ │ |
94 | | - │ REDUCE: pure python, deterministic │ |
95 | | - │ score 0-100, 5 dimensions │ |
96 | | - │ cost forensics, anti-pats │ |
97 | | - │ │ |
98 | | - │ GENERATE: single LLM call │ |
99 | | - │ profile, recs, insights │ |
100 | | - │ │ |
101 | | - │ LLM: ollama / anthropic / openai │ |
102 | | - └──────────────────────────────────────┘ |
103 | | -``` |
104 | 153 |
|
105 | | -**collectors** parse tool-specific local files into a unified `Session` model (id, tool, project, messages, tokens, cost, prompts, timestamps). each collector handles one tool's quirks — JSONL for Claude, SQLite for Cursor, JSON blobs for Codex. |
| 154 | +## AI Analysis |
| 155 | + |
| 156 | +Optional. Select sessions, run LLM analysis, get a report. |
106 | 157 |
|
107 | | -**web dashboard** is vanilla JS + D3, no frameworks. FastAPI serves the API and static files. WebSocket for live updates. three tabs: overview (knowledge graph + panels), sessions (paginated browser with detail pane), analyze (select sessions → LLM analysis → score + cost forensics + recommendations). |
| 158 | +**Three-phase pipeline (Map-Reduce-Generate):** |
108 | 159 |
|
109 | | -**TUI** is built on textual. plotext for charts. five tabs: dashboard (stats + charts), sessions (project aggregates + history), explorer (interactive search/select/analyze), analysis (model usage + intent distribution), graph (tree view). |
| 160 | +1. **MAP** — batches selected sessions into a single LLM call with full prompt history. Classifies each: intent, correction spirals, prompt quality, wasted effort. Results cached per session ID — sessions are immutable, never re-analyzed. |
110 | 161 |
|
111 | | -**optimizer** is the interesting part. three phases: |
| 162 | +2. **REDUCE** — pure Python, no LLM. Deterministic score from 5 dimensions (0-20 points each): |
112 | 163 |
|
113 | | -1. **MAP** — takes your top 30 sessions (by cost), sends each to an LLM with full prompt history. classifies: intent (debugging/greenfield/exploration/...), had correction spirals?, prompt quality, wasted effort. results cached per session ID at `~/.agenttop/session_cache.json` — sessions are immutable so they're never re-analyzed. max 10 new sessions per run. concurrent: 1 worker for ollama, 4 for cloud. |
| 164 | + | Dimension | Source | Formula | |
| 165 | + |-----------|--------|---------| |
| 166 | + | Session hygiene | MAP classifications | `spiral_free_sessions / total × 20` | |
| 167 | + | Prompt quality | MAP classifications | `no_waste_sessions / total × 20` | |
| 168 | + | Cost efficiency | Python cost forensics | `(1 - waste_pct / 100) × 20` | |
| 169 | + | Cache efficiency | Claude model_usage | `cache_hit_rate / 100 × 20` | |
| 170 | + | Tool utilization | Feature detection | `features_used / available × 20` | |
114 | 171 |
|
115 | | -2. **REDUCE** — pure python. no LLM. computes a deterministic score from 5 dimensions (0-20 points each): |
116 | | - - session hygiene: `sessions_without_spirals / total × 20` |
117 | | - - prompt quality: `sessions_without_waste / total × 20` |
118 | | - - cost efficiency: `(1 - waste_pct/100) × 20` |
119 | | - - cache efficiency: `cache_hit_rate/100 × 20` |
120 | | - - tool utilization: `features_used/features_available × 20` |
| 172 | +3. **GENERATE** — single LLM call with ~2K tokens of pre-computed metrics. LLM writes prose (developer profile, recommendations, project insights). Does NOT compute any numbers — those come from REDUCE. |
121 | 173 |
|
122 | | - also computes cost forensics (spend by project, by model, waste estimation from marathon sessions) and anti-pattern counts. |
| 174 | +Score is fully traceable. "Session hygiene: 15/20 (23 of 30 sessions had no correction spirals)." |
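The REDUCE formulas in the table can be written out directly. The sketch below assumes the inputs arrive as plain counts and percentages; the `ReduceInputs` dataclass, the field names, and the clamping to 0-20 are illustrative choices, not agenttop's code.

```python
from dataclasses import dataclass


@dataclass
class ReduceInputs:
    total_sessions: int
    spiral_free_sessions: int
    no_waste_sessions: int
    waste_pct: float        # 0-100, from cost forensics
    cache_hit_rate: float   # 0-100
    features_used: int
    features_available: int


def _ratio(numerator: float, denominator: float) -> float:
    return numerator / denominator if denominator else 0.0


def reduce_score(x: ReduceInputs) -> dict[str, float]:
    """Five dimensions worth 0-20 points each, plus their sum as `total`."""
    raw = {
        "session_hygiene": _ratio(x.spiral_free_sessions, x.total_sessions) * 20,
        "prompt_quality": _ratio(x.no_waste_sessions, x.total_sessions) * 20,
        "cost_efficiency": (1 - x.waste_pct / 100) * 20,
        "cache_efficiency": x.cache_hit_rate / 100 * 20,
        "tool_utilization": _ratio(x.features_used, x.features_available) * 20,
    }
    # Clamp each dimension to its 0-20 range (an added safety assumption).
    dims = {name: round(min(max(value, 0.0), 20.0), 1) for name, value in raw.items()}
    dims["total"] = round(sum(dims.values()), 1)
    return dims


result = reduce_score(
    ReduceInputs(
        total_sessions=30,
        spiral_free_sessions=24,
        no_waste_sessions=27,
        waste_pct=10.0,
        cache_hit_rate=87.0,
        features_used=3,
        features_available=5,
    )
)
```

With these example inputs the dimensions come out to 16.0, 18.0, 18.0, 17.4, and 12.0, for a total of 81.4. Since no LLM is involved, rerunning on the same inputs always gives the same score.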
123 | 175 |
|
124 | | -3. **GENERATE** — single LLM call with ~2K tokens of pre-computed metrics. LLM writes prose (developer profile, recommendations, project insights). it does NOT compute any numbers — those come from REDUCE. |
| 176 | +**LLM providers:** Ollama (free, local — nothing leaves your machine), Anthropic, OpenAI, OpenRouter. |
125 | 177 |
|
126 | | -the score is fully traceable. "session hygiene: 14/20 — 23/30 sessions had no correction spirals." not a vibe check. |
| 178 | +```bash |
| 179 | +agenttop init # interactive setup wizard |
| 180 | +``` |
| 181 | + |
| 182 | +## Demo Mode |
127 | 183 |
|
128 | | -## API |
| 184 | +Safe for recordings and screenshots. Generates realistic fake data — 10 projects, 265 sessions across 5 tools, with handwritten prompts that read like real engineering work. |
129 | 185 |
|
130 | | -| endpoint | what | |
131 | | -|----------|------| |
132 | | -| `GET /api/stats?days=N` | aggregated stats from all collectors | |
133 | | -| `GET /api/sessions?days=N` | all sessions (paginated client-side) | |
134 | | -| `GET /api/sessions/{id}` | full session detail with prompts | |
135 | | -| `GET /api/models` | claude model usage (input/output/cache) | |
136 | | -| `GET /api/hours` | hourly token distribution | |
137 | | -| `GET /api/graph` | D3-compatible knowledge graph | |
138 | | -| `POST /api/analyze-sessions` | LLM analysis on selected sessions | |
139 | | -| `POST /api/optimize` | full optimizer pipeline | |
140 | | -| `GET /api/optimize-stream` | SSE streaming progress + result | |
141 | | -| `WS /ws` | real-time stat updates | |
| 186 | +```bash |
| 187 | +agenttop --demo # terminal with fake data |
| 188 | +agenttop web --demo # web dashboard with fake data |
| 189 | +``` |
142 | 190 |
|
143 | | -## config |
| 191 | +Deterministic. Same screenshots every time. |
144 | 192 |
|
145 | | -zero config by default. `agenttop init` for interactive setup, or: |
| 193 | +## How It Works |
| 194 | + |
| 195 | +``` |
| 196 | +~/.claude/ ~/.cursor/ ~/.codex/ ~/.config/github-copilot/ ~/Library/.../Kiro/ |
| 197 | + | | | | | |
| 198 | + v v v v v |
| 199 | + COLLECTORS — parse tool-specific local files |
| 200 | + │ Claude: JSONL → exact tokens, tool names, model per message |
| 201 | + │ Cursor: SQLite → conversations, AI vs human ratio, models |
| 202 | + │ Codex: JSON + SQLite → prompts, automations, rollouts |
| 203 | + │ Copilot: JSON → session messages, model, agents |
| 204 | + │ Kiro: SQLite → VS Code state keys |
| 205 | + │ |
| 206 | + └──> unified Session model (tool_breakdown, models_used, prompts, tokens, cost) |
| 207 | + │ |
| 208 | + ├──> WEB DASHBOARD (FastAPI + D3 + vanilla JS, port 8420) |
| 209 | + │ overview (knowledge graph) | sessions (paginated) | analyze |
| 210 | + │ |
| 211 | + ├──> TERMINAL DASHBOARD (Textual + Rich) |
| 212 | + │ dashboard | sessions | explorer | analysis | graph |
| 213 | + │ |
| 214 | + └──> OPTIMIZER (Map-Reduce-Generate, optional) |
| 215 | + MAP: batch LLM call, cached per session |
| 216 | + REDUCE: deterministic score 0-100 |
| 217 | + GENERATE: prose recommendations |
| 218 | +``` |
| 219 | + |
| 220 | +## Configuration |
| 221 | + |
| 222 | +Zero config by default. For AI analysis: |
| 223 | + |
| 224 | +```bash |
| 225 | +agenttop init |
| 226 | +``` |
| 227 | + |
| 228 | +or manually: |
146 | 229 |
|
147 | 230 | ```toml |
148 | 231 | # ~/.agenttop/config.toml |
149 | 232 | [llm] |
150 | 233 | provider = "ollama" # ollama | anthropic | openai | openrouter |
151 | | -model = "ollama/gemma3:4b" |
152 | | -map_concurrency = 0 # 0 = auto |
| 234 | +model = "ollama/gemma3:4b" # any litellm-compatible model |
153 | 235 | ``` |
154 | 236 |
|
155 | | -## no telemetry |
| 237 | +Environment variable overrides: `AGENTTOP_LLM_PROVIDER`, `AGENTTOP_LLM_MODEL`, `ANTHROPIC_API_KEY`. |
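One way to picture the override order is: defaults first, then the file, then environment variables. In the sketch below the default values are copied from the example config above, and the `resolve_llm_config` helper and its precedence are assumptions rather than documented behavior.

```python
import os
from typing import Mapping

# Assumed defaults, taken from the example config.toml.
DEFAULTS = {"provider": "ollama", "model": "ollama/gemma3:4b"}
ENV_KEYS = {"provider": "AGENTTOP_LLM_PROVIDER", "model": "AGENTTOP_LLM_MODEL"}


def resolve_llm_config(
    file_cfg: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> dict[str, str]:
    """Merge defaults, the [llm] table from the file, and env overrides, in that order."""
    resolved = dict(DEFAULTS)
    resolved.update({k: v for k, v in file_cfg.items() if k in DEFAULTS})
    for key, env_name in ENV_KEYS.items():
        value = env.get(env_name)
        if value:  # unset or empty variables do not override
            resolved[key] = value
    return resolved


cfg = resolve_llm_config(
    {"provider": "anthropic"},
    env={"AGENTTOP_LLM_MODEL": "anthropic/some-model"},  # hypothetical model name
)
```

Passing `env` explicitly keeps the function easy to test; by default it reads the real process environment.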
| 238 | + |
| 239 | +## No Telemetry |
156 | 240 |
|
157 | | -zero. local only. ollama = nothing leaves your machine. |
| 241 | +Zero. No data collection. No cloud uploads. No analytics. Everything runs locally. With Ollama, nothing leaves your machine at all. |
158 | 242 |
|
159 | | -## license |
| 243 | +## License |
160 | 244 |
|
161 | 245 | Apache 2.0 |
| 246 | + |
| 247 | +## Contributors |
| 248 | + |
| 249 | +Built with [@AbhilashSri](https://github.com/AbhilashSri) (workflow intelligence, code reviews), @Mohit and @Akshit (testing, UX). |