Commit f20e73a

vicarious11 and claude committed
feat: TUI dashboard overhaul — real tool call data, activity classifier, one-shot rate
Session model gains two new fields:
- tool_breakdown: {Edit: 5, Bash: 3, Read: 12} (actual tool names extracted from Claude Code JSONL tool_use content blocks)
- models_used: {opus-4-6: 8, sonnet-4-6: 12} (per-message model ID)

Claude collector now extracts tool_use block names from JSONL (was counting but discarding the name field).

Activity classifier uses real tool call data when available:
- Edit/Write -> coding, Read/Grep -> exploration
- Bash with test commands -> testing, git commands -> git_ops
- Prompt keywords for debugging/refactoring
- Falls back to keyword-only for tools without tool_breakdown

Dashboard rewritten with Rich text panels:
- Cost by project, cost by model (horizontal bars)
- Daily cost sparkline (unicode block chars)
- Activity breakdown (8 categories, from real data)
- One-shot rate (% of edits passing without retry)
- Tool list with status, sessions, tokens, cost
- No plotext dependency on dashboard tab

README rewritten with full feature documentation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 7fbeb48 commit f20e73a

6 files changed

Lines changed: 881 additions & 303 deletions

README.md

Lines changed: 203 additions & 115 deletions
@@ -1,161 +1,249 @@
1-
# agenttop
1+
<p align="center">
2+
<img src="assets/logo.png" alt="agenttop" width="120">
3+
</p>
24

3-
`htop` for AI coding agents.
5+
<h1 align="center">agenttop</h1>
46

5-
```bash
6-
git clone https://github.com/vicarious11/agenttop && cd agenttop && ./setup.sh
7-
./run.sh # localhost:8420
8-
```
7+
<p align="center">
8+
<b>See where your AI coding tokens and money actually go.</b>
9+
<br>
10+
<sub>htop for Claude Code, Cursor, Kiro, Codex, and Copilot.</sub>
11+
</p>
912

10-
![agenttop dashboard](assets/screenshots/optimizer.png)
13+
<p align="center">
14+
<a href="#install">Install</a> ·
15+
<a href="#what-you-see">Screenshots</a> ·
16+
<a href="#features">Features</a> ·
17+
<a href="#how-it-works">How it works</a> ·
18+
<a href="#ai-analysis">AI Analysis</a> ·
19+
<a href="#architecture">Architecture</a>
20+
</p>
1121

12-
Monitors **Claude Code**, **Cursor**, **Kiro**, **Codex**, **Copilot**. Reads the local files they already write (`~/.claude/`, `~/.cursor/`, etc). Read-only. Nothing leaves your machine.
22+
<p align="center">
23+
<img src="https://img.shields.io/badge/python-3.10%2B-blue" alt="Python">
24+
<img src="https://img.shields.io/github/license/vicarious11/agenttop" alt="License">
25+
<img src="https://img.shields.io/badge/tools-5%20supported-green" alt="Tools">
26+
<img src="https://img.shields.io/badge/telemetry-zero-brightgreen" alt="No telemetry">
27+
</p>
1328

14-
## what it does
29+
---
1530

16-
- unified dashboard across all your AI coding tools
17-
- every session, every prompt, every token, every dollar — one place
18-
- search sessions by project, sort by cost, view full prompt history
19-
- AI analysis: scores you 0-100 on session hygiene, prompt quality, cost efficiency, cache usage, tool utilization
20-
- cost forensics: spend by project, by model, estimated waste from marathon sessions
21-
- detects anti-patterns: correction spirals, context blowup, repeated prompts, model overkill
31+
![agenttop web dashboard](assets/screenshots/optimizer.png)
2232

23-
## install
33+
> Every AI coding tool stores usage data locally — JSONL logs, SQLite databases, workspace state — but none of them show you the full picture. agenttop reads all of it, normalizes it, and gives you a real-time dashboard with AI-powered analysis. 5 tools. One view. Nothing leaves your machine.
34+
35+
---
36+
37+
## Install
2438

2539
```bash
2640
git clone https://github.com/vicarious11/agenttop && cd agenttop && ./setup.sh
2741
```
2842

29-
or `pip install agenttop`
30-
31-
## run
43+
That's it. Handles Python, venv, deps, everything. Then:
3244

3345
```bash
34-
./run.sh # web dashboard
35-
.venv/bin/agenttop # terminal dashboard
36-
agenttop init # set up LLM for analysis (ollama/anthropic/openai)
46+
source .venv/bin/activate
47+
agenttop # terminal dashboard
48+
agenttop web # web dashboard at localhost:8420
49+
agenttop stats # quick CLI summary
50+
agenttop init # configure LLM for AI analysis
3751
```
3852

39-
## data sources
53+
Requirements: Python 3.10+. No Docker. No API keys needed unless you pick a cloud LLM provider for AI analysis. macOS, Linux, Windows.
54+
55+
Keyboard: `d` dashboard · `s` sessions · `e` explorer · `a` analysis · `k` graph · `1-4` time range · `q` quit
56+
57+
## What You See
58+
59+
### Terminal Dashboard
4060

4161
```
42-
~/.claude/projects/**/*.jsonl exact token counts per message
43-
~/.cursor/ai-tracking/*.db conversations, models, AI vs human ratio
44-
~/.codex/.codex-global-state.json prompts, automations
45-
~/.config/github-copilot/ session state
46-
~/Library/.../Kiro/state.vscdb workspace data
62+
All time 17.8M tok $687 cost 265 sess 5.6K msgs 5 tools 87% cache
63+
64+
COST BY PROJECT COST BY MODEL
65+
apex-trading-engine ████████ $284 opus-4-6 ████████████ $412
66+
vaultkeeper █████ $148 sonnet-4-6 █████ $198
67+
phantom-search ████ $97 haiku-4-5 ██ $76
68+
neon-ui ██ $63
69+
dataweave ██ $51
70+
71+
DAILY COST (30d) ACTIVITY BREAKDOWN
72+
▁▃▅▇█▇▅▃▁▂▄▆█▇▅▃▁▂▅▇█▇▅▃▁▃▅▇ coding ████████ 42%
73+
total $687 avg $23/day peak $45 debugging ████ 21%
74+
testing ███ 15%
75+
TOOLS exploration ██ 9%
76+
● Claude Code 180 sess $469
77+
● Cursor 45 sess $107 ONE-SHOT RATE
78+
● Kiro 20 sess $53 87% ██████████████████░░
79+
● Codex 12 sess $53 edits that pass first try
80+
● Copilot 8 sess $5 higher = better prompting
4781
```
4882

49-
## architecture
83+
Six panels. No plotext. Pure Rich text rendering. Data is computed from actual tool calls where available (Claude Code), with a prompt-keyword fallback for other tools.
5084

85+
### Web Dashboard
86+
87+
Three tabs: **Overview** · **Sessions** · **Analyze**
88+
89+
- **Overview** — force-directed knowledge graph (D3), model usage (input/output/cache), hourly activity, cost breakdown, workflow intelligence
90+
- **Sessions** — full-page browser with Google-style pagination. Search by project or prompt. Sort by cost, time, tokens. Click any session to see complete prompt history
91+
- **Analyze** — select sessions (All / Last 10 / Top Cost), run LLM analysis, get a deep-dive report with score, grades, cost forensics by project and model, anti-patterns, recommendations with estimated savings
92+
93+
Keyboard: `o` overview · `s` sessions · `a` analyze. URL hash routing (`#sessions`, `#analyze`) for deep links.
94+
95+
## Features
96+
97+
### Data Extraction
98+
99+
| Tool | Data Source | What agenttop extracts |
100+
|------|------------|----------------------|
101+
| **Claude Code** | `~/.claude/projects/**/*.jsonl` | Exact per-message token counts (input, output, cache read, cache create). Per-message model ID. **Every tool call name** (Edit, Bash, Read, Grep, Agent, Write — extracted from `tool_use` content blocks). Up to 50 user prompts per session. Project path from `cwd` field. Cost from per-model pricing. |
102+
| **Cursor** | `~/.cursor/ai-tracking/ai-code-tracking.db` | Conversations from SQLite. Source type (tab/composer/chat). AI vs human code ratio from `scored_commits`. Model per code hash. Project resolution via `ide_state.json` workspace mapping. |
103+
| **Kiro** | `~/Library/.../Kiro/User/globalStorage/state.vscdb` | Session data from VS Code state DB. Keys matching `kiro%`, `chat%`, `session%` patterns. Message counts and timestamps. |
104+
| **Codex** | `~/.codex/` | Prompt history from `.codex-global-state.json`. Session files from `sessions/` rollouts. Automation data from SQLite. Config (model, reasoning effort). |
105+
| **Copilot** | `~/.config/github-copilot/session-state/` | Per-session JSON with message content. Model extraction. Custom agent detection. Token estimation from content length. |
106+
107+
All read-only. agenttop never modifies your tool data.
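
As a rough illustration of the Claude Code row above, the sketch below counts `tool_use` block names from JSONL lines. The record layout (`message` → `content` → blocks with `type` and `name`) and the function name `count_tool_calls` are assumptions made for this example, not agenttop's actual collector code.

```python
import json
from collections import Counter
from typing import Iterable


def count_tool_calls(jsonl_lines: Iterable[str]) -> Counter:
    """Count tool_use block names across the lines of one session log."""
    counts: Counter = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written or corrupt lines
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        # Assumed layout: {"message": {"content": [{...block...}, ...]}}
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # plain-string content carries no tool calls
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                counts[block.get("name", "unknown")] += 1
    return counts


# Hypothetical sample lines, shaped like the assumed layout above.
sample = [
    json.dumps({"message": {"content": [{"type": "tool_use", "name": "Edit"}]}}),
    json.dumps({"message": {"content": [{"type": "text", "text": "done"},
                                        {"type": "tool_use", "name": "Bash"}]}}),
    json.dumps({"message": {"content": "plain user prompt"}}),
    "{not valid json",
]
tool_breakdown = dict(count_tool_calls(sample))
print(tool_breakdown)
```

Because the function only reads strings it is handed, it stays consistent with the read-only guarantee: nothing is written back to the log files.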
108+
109+
### Activity Classification
110+
111+
Deterministic. No LLM. Classified from **actual tool call data** when available (Claude Code); falls back to prompt keywords for other tools.
112+
113+
| Activity | How it's detected |
114+
|----------|------------------|
115+
| **coding** | Edit, Write, MultiEdit tool calls |
116+
| **debugging** | Bug/error/fix keywords in prompts + Edit/Bash patterns |
117+
| **testing** | Bash calls with pytest/jest/vitest/cargo test |
118+
| **exploration** | Read, Grep, Glob calls without edits |
119+
| **refactoring** | Refactor/rename/extract keywords + Edit patterns |
120+
| **git ops** | Bash calls with git commands |
121+
| **planning** | EnterPlanMode, TaskCreate, Agent tool calls |
122+
| **other** | Everything else |
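
The table above can be sketched as a small rule-based function. This is a minimal sketch under stated assumptions: the precedence order of the rules, the exact keyword lists, and the `bash_commands` argument (a list of the command strings passed to Bash) are all hypothetical and chosen only to illustrate the idea.

```python
import re

CODING_TOOLS = {"Edit", "Write", "MultiEdit"}
EXPLORATION_TOOLS = {"Read", "Grep", "Glob"}
PLANNING_TOOLS = {"EnterPlanMode", "TaskCreate", "Agent"}
TEST_COMMAND = re.compile(r"\b(pytest|jest|vitest|cargo test)\b")
GIT_COMMAND = re.compile(r"^\s*git\s")
DEBUG_WORDS = re.compile(r"\b(bug|error|fix)", re.IGNORECASE)
REFACTOR_WORDS = re.compile(r"\b(refactor|rename|extract)", re.IGNORECASE)


def classify_activity(tool_breakdown, bash_commands=(), prompts=()):
    """Return one activity label. The rule order here is an assumption."""
    text = " ".join(prompts)
    has_edits = any(tool_breakdown.get(t, 0) for t in CODING_TOOLS)
    keyword_only = not tool_breakdown  # no tool data: fall back to prompts

    if any(TEST_COMMAND.search(c) for c in bash_commands):
        return "testing"
    if any(GIT_COMMAND.search(c) for c in bash_commands):
        return "git_ops"
    if (has_edits or keyword_only) and DEBUG_WORDS.search(text):
        return "debugging"
    if (has_edits or keyword_only) and REFACTOR_WORDS.search(text):
        return "refactoring"
    if has_edits:
        return "coding"
    if any(tool_breakdown.get(t, 0) for t in PLANNING_TOOLS):
        return "planning"
    if any(tool_breakdown.get(t, 0) for t in EXPLORATION_TOOLS):
        return "exploration"
    return "other"


print(classify_activity({"Edit": 5, "Read": 12}))
print(classify_activity({}, prompts=["rename this module"]))
```

Every branch is a plain dictionary lookup or regex match, so the same inputs always produce the same label, which is what makes the classification deterministic.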
123+
124+
### One-Shot Success Rate
125+
126+
Percentage of edit turns that pass without retry. Detects `Edit -> correction prompt -> Edit` retry cycles in your prompt history. Higher percentage = better prompting, fewer wasted tokens.
127+
128+
When `tool_breakdown` is available (Claude Code), uses actual Edit/Write call counts. Falls back to prompt analysis for other tools.
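
One way to picture the retry detection is the sketch below. It assumes a hypothetical event list of `("edit", "")` and `("prompt", text)` tuples in session order, and a made-up list of correction keywords; the real detection logic may differ.

```python
import re

# Hypothetical correction markers; the real keyword list is not documented here.
CORRECTION = re.compile(
    r"\b(no|wrong|still|again|didn't work|doesn't work|not working|revert|undo)\b",
    re.IGNORECASE,
)


def one_shot_rate(events):
    """Percentage of edit turns not followed by a correction prompt and a re-edit.

    `events` is an ordered list of ("edit", "") and ("prompt", text) tuples.
    Returns None when the session contains no edits.
    """
    edits = retried = 0
    for i, (kind, _) in enumerate(events):
        if kind != "edit":
            continue
        edits += 1
        following = events[i + 1 : i + 3]
        if (
            len(following) == 2
            and following[0][0] == "prompt"
            and CORRECTION.search(following[0][1])
            and following[1][0] == "edit"
        ):
            retried += 1  # Edit -> correction prompt -> Edit
    return None if edits == 0 else 100.0 * (edits - retried) / edits


events = [
    ("edit", ""),
    ("prompt", "that's still failing, try again"),
    ("edit", ""),
    ("prompt", "now add a docstring"),
    ("edit", ""),
    ("edit", ""),
]
print(one_shot_rate(events))
```

In this sample only the first of four edits is followed by a correction and a re-edit, so three of four edits count as one-shot.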
129+
130+
### Cost Analysis
131+
132+
- **Cost by project** — which project burns the most money, with session count
133+
- **Cost by model** — opus vs sonnet vs haiku spend, computed from actual per-model pricing (input/output/cache rates)
134+
- **Daily cost sparkline** — 30-day unicode trend with total, average, and peak
135+
- **Cache hit rate** — from actual `cacheReadInputTokens` vs `inputTokens` in Claude Code data
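
The sparkline and cache hit rate are simple enough to sketch. Both functions below are illustrative assumptions: the min-to-max scaling of the sparkline and the formula `cache_read / (cache_read + input)` are guesses at reasonable definitions, not confirmed agenttop behavior.

```python
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to one of eight block characters, scaled from min to max."""
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        return BLOCKS[0] * len(values)  # flat series: all lowest block
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[int((v - lo) * top / span)] for v in values)


def cache_hit_rate(cache_read_tokens, input_tokens):
    """Percentage of prompt-side tokens served from cache (assumed formula)."""
    total = cache_read_tokens + input_tokens
    return 0.0 if total == 0 else 100.0 * cache_read_tokens / total


daily_costs = [0, 1, 2, 3, 4, 5, 6, 7]  # made-up daily spend in USD
print(sparkline(daily_costs))
print(cache_hit_rate(cache_read_tokens=87, input_tokens=13))
```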
136+
137+
### Session Data Model
138+
139+
Each session stores:
140+
141+
```python
142+
Session(
143+
tool_breakdown={"Edit": 5, "Bash": 3, "Read": 12, "Grep": 4}, # actual tool calls
144+
models_used={"claude-opus-4-6": 8, "claude-sonnet-4-6": 12}, # per-message model
145+
prompts=["fix the race condition in...", ...], # up to 50
146+
total_tokens=48291, # exact for Claude, estimated for others
147+
estimated_cost_usd=12.47, # per-model pricing
148+
message_count=23,
149+
tool_call_count=24,
150+
# + id, tool, project, start_time, end_time
151+
)
51152
```
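
For readers who want something they can run, the fields above could be modeled as a dataclass along these lines. The field types, the defaults, and the `dominant_model` helper are assumptions for illustration; only the field names come from the listing above.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class Session:
    id: str
    tool: str                                   # e.g. "claude_code" (value format assumed)
    project: str
    start_time: datetime
    end_time: Optional[datetime] = None
    tool_breakdown: dict[str, int] = field(default_factory=dict)  # tool name -> call count
    models_used: dict[str, int] = field(default_factory=dict)     # model ID -> message count
    prompts: list[str] = field(default_factory=list)              # up to 50 per session
    total_tokens: int = 0
    estimated_cost_usd: float = 0.0
    message_count: int = 0
    tool_call_count: int = 0

    def dominant_model(self) -> Optional[str]:
        """Hypothetical helper: the model that produced the most messages."""
        if not self.models_used:
            return None
        return max(self.models_used, key=self.models_used.get)


session = Session(
    id="demo-1",
    tool="claude_code",
    project="vaultkeeper",
    start_time=datetime(2025, 1, 1, 9, 0),
    tool_breakdown={"Edit": 5, "Bash": 3, "Read": 12, "Grep": 4},
    models_used={"claude-opus-4-6": 8, "claude-sonnet-4-6": 12},
    total_tokens=48291,
    estimated_cost_usd=12.47,
    message_count=23,
    tool_call_count=24,
)
print(session.dominant_model())
```

In the README listing the per-tool counts (5 + 3 + 12 + 4) add up to the stated `tool_call_count` of 24, and the sample instance mirrors that.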
52-
┌──────────────────────────────────────────────┐
53-
│ YOUR MACHINE (read-only) │
54-
│ │
55-
│ ~/.claude/ ~/.cursor/ ~/.codex/ ... │
56-
└──────┬───────────┬────────────┬──────────────┘
57-
│ │ │
58-
▼ ▼ ▼
59-
┌──────────────────────────────────────────────┐
60-
│ COLLECTORS │
61-
│ │
62-
│ ClaudeCodeCollector → JSONL parser │
63-
│ CursorCollector → SQLite + workspace │
64-
│ KiroCollector → VS Code state DB │
65-
│ CodexCollector → JSON + SQLite │
66-
│ CopilotCollector → session JSON │
67-
│ │
68-
│ Each: collect_sessions() → list[Session] │
69-
│ get_stats(days) → ToolStats │
70-
└──────────────────┬───────────────────────────┘
71-
72-
┌────────────┴────────────┐
73-
▼ ▼
74-
┌─────────────┐ ┌─────────────┐
75-
│ WEB (D3) │ │ TUI (term) │
76-
│ port 8420 │ │ textual │
77-
│ │ │ │
78-
│ FastAPI │ │ 5 tabs: │
79-
│ WebSocket │ │ dashboard │
80-
│ 3 tabs: │ │ sessions │
81-
│ overview │ │ explorer │
82-
│ sessions │ │ analysis │
83-
│ analyze │ │ graph │
84-
└──────┬──────┘ └──────────────┘
85-
86-
87-
┌──────────────────────────────────────┐
88-
│ OPTIMIZER (map-reduce-generate) │
89-
│ │
90-
│ MAP: per-session LLM calls │
91-
│ (cached, concurrent) │
92-
│ intent, spirals, quality │
93-
│ │
94-
│ REDUCE: pure python, deterministic │
95-
│ score 0-100, 5 dimensions │
96-
│ cost forensics, anti-pats │
97-
│ │
98-
│ GENERATE: single LLM call │
99-
│ profile, recs, insights │
100-
│ │
101-
│ LLM: ollama / anthropic / openai │
102-
└──────────────────────────────────────┘
103-
```
104153

105-
**collectors** parse tool-specific local files into a unified `Session` model (id, tool, project, messages, tokens, cost, prompts, timestamps). each collector handles one tool's quirks — JSONL for Claude, SQLite for Cursor, JSON blobs for Codex.
154+
## AI Analysis
155+
156+
Optional. Select sessions, run LLM analysis, get a report.
106157

107-
**web dashboard** is vanilla JS + D3, no frameworks. FastAPI serves the API and static files. WebSocket for live updates. three tabs: overview (knowledge graph + panels), sessions (paginated browser with detail pane), analyze (select sessions → LLM analysis → score + cost forensics + recommendations).
158+
**Three-phase pipeline (Map-Reduce-Generate):**
108159

109-
**TUI** is built on textual. plotext for charts. five tabs: dashboard (stats + charts), sessions (project aggregates + history), explorer (interactive search/select/analyze), analysis (model usage + intent distribution), graph (tree view).
160+
1. **MAP** — batches selected sessions into a single LLM call with full prompt history. Classifies each: intent, correction spirals, prompt quality, wasted effort. Results cached per session ID — sessions are immutable, never re-analyzed.
110161

111-
**optimizer** is the interesting part. three phases:
162+
2. **REDUCE** — pure Python, no LLM. Deterministic score from 5 dimensions (0-20 points each):
112163

113-
1. **MAP** — takes your top 30 sessions (by cost), sends each to an LLM with full prompt history. classifies: intent (debugging/greenfield/exploration/...), had correction spirals?, prompt quality, wasted effort. results cached per session ID at `~/.agenttop/session_cache.json` — sessions are immutable so they're never re-analyzed. max 10 new sessions per run. concurrent: 1 worker for ollama, 4 for cloud.
164+
| Dimension | Source | Formula |
165+
|-----------|--------|---------|
166+
| Session hygiene | MAP classifications | `spiral_free_sessions / total × 20` |
167+
| Prompt quality | MAP classifications | `no_waste_sessions / total × 20` |
168+
| Cost efficiency | Python cost forensics | `(1 - waste_pct / 100) × 20` |
169+
| Cache efficiency | Claude model_usage | `cache_hit_rate / 100 × 20` |
170+
| Tool utilization | Feature detection | `features_used / available × 20` |
114171

115-
2. **REDUCE** — pure python. no LLM. computes a deterministic score from 5 dimensions (0-20 points each):
116-
- session hygiene: `sessions_without_spirals / total × 20`
117-
- prompt quality: `sessions_without_waste / total × 20`
118-
- cost efficiency: `(1 - waste_pct/100) × 20`
119-
- cache efficiency: `cache_hit_rate/100 × 20`
120-
- tool utilization: `features_used/features_available × 20`
172+
3. **GENERATE** — single LLM call with ~2K tokens of pre-computed metrics. LLM writes prose (developer profile, recommendations, project insights). Does NOT compute any numbers — those come from REDUCE.
121173

122-
also computes cost forensics (spend by project, by model, waste estimation from marathon sessions) and anti-pattern counts.
174+
Score is fully traceable. "Session hygiene: 15/20 (23/30 sessions had no correction spirals)."
123175

124-
3. **GENERATE** — single LLM call with ~2K tokens of pre-computed metrics. LLM writes prose (developer profile, recommendations, project insights). it does NOT compute any numbers — those come from REDUCE.
176+
**LLM providers:** Ollama (free, local — nothing leaves your machine), Anthropic, OpenAI, OpenRouter.
125177

126-
the score is fully traceable. "session hygiene: 14/20 — 23/30 sessions had no correction spirals." not a vibe check.
178+
```bash
179+
agenttop init # interactive setup wizard
180+
```
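
The five REDUCE formulas from the dimension table in this section can be written out directly. The sketch below is an assumption-laden illustration: the function name, the clamping to 0-20, the zero-division handling, and the final rounding are choices made here, and the input numbers are invented.

```python
def reduce_score(spiral_free, no_waste, total_sessions, waste_pct,
                 cache_hit_rate, features_used, features_available):
    """Return (points per dimension, total 0-100) using the table's formulas."""

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    dims = {
        "session_hygiene": ratio(spiral_free, total_sessions) * 20,
        "prompt_quality": ratio(no_waste, total_sessions) * 20,
        "cost_efficiency": (1 - waste_pct / 100) * 20,
        "cache_efficiency": cache_hit_rate / 100 * 20,
        "tool_utilization": ratio(features_used, features_available) * 20,
    }
    # Clamp each dimension to its 0-20 range (an assumption, for safety).
    dims = {name: max(0.0, min(20.0, pts)) for name, pts in dims.items()}
    return dims, round(sum(dims.values()))


dims, total = reduce_score(
    spiral_free=23, no_waste=27, total_sessions=30,
    waste_pct=10, cache_hit_rate=87,
    features_used=3, features_available=5,
)
for name, pts in dims.items():
    print(f"{name}: {pts:.1f}/20")
print("total:", total)
```

With 23 of 30 sessions free of correction spirals, the session hygiene formula gives 23 / 30 × 20 ≈ 15.3 points. No LLM is involved at any step, so the same inputs always give the same score.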
181+
182+
## Demo Mode
127183

128-
## API
184+
Safe for recordings and screenshots. Generates realistic fake data — 10 projects, 265 sessions across 5 tools, with handwritten prompts that read like real engineering work.
129185

130-
| endpoint | what |
131-
|----------|------|
132-
| `GET /api/stats?days=N` | aggregated stats from all collectors |
133-
| `GET /api/sessions?days=N` | all sessions (paginated client-side) |
134-
| `GET /api/sessions/{id}` | full session detail with prompts |
135-
| `GET /api/models` | claude model usage (input/output/cache) |
136-
| `GET /api/hours` | hourly token distribution |
137-
| `GET /api/graph` | D3-compatible knowledge graph |
138-
| `POST /api/analyze-sessions` | LLM analysis on selected sessions |
139-
| `POST /api/optimize` | full optimizer pipeline |
140-
| `GET /api/optimize-stream` | SSE streaming progress + result |
141-
| `WS /ws` | real-time stat updates |
186+
```bash
187+
agenttop --demo # terminal with fake data
188+
agenttop web --demo # web dashboard with fake data
189+
```
142190

143-
## config
191+
Deterministic. Same screenshots every time.
144192

145-
zero config by default. `agenttop init` for interactive setup, or:
193+
## How It Works
194+
195+
```
196+
~/.claude/ ~/.cursor/ ~/.codex/ ~/.config/github-copilot/ ~/Library/.../Kiro/
197+
| | | | |
198+
v v v v v
199+
COLLECTORS — parse tool-specific local files
200+
│ Claude: JSONL → exact tokens, tool names, model per message
201+
│ Cursor: SQLite → conversations, AI vs human ratio, models
202+
│ Codex: JSON + SQLite → prompts, automations, rollouts
203+
│ Copilot: JSON → session messages, model, agents
204+
│ Kiro: SQLite → VS Code state keys
205+
206+
└──> unified Session model (tool_breakdown, models_used, prompts, tokens, cost)
207+
208+
├──> WEB DASHBOARD (FastAPI + D3 + vanilla JS, port 8420)
209+
│ overview (knowledge graph) | sessions (paginated) | analyze
210+
211+
├──> TERMINAL DASHBOARD (Textual + Rich)
212+
│ dashboard | sessions | explorer | analysis | graph
213+
214+
└──> OPTIMIZER (Map-Reduce-Generate, optional)
215+
MAP: batch LLM call, cached per session
216+
REDUCE: deterministic score 0-100
217+
GENERATE: prose recommendations
218+
```
219+
220+
## Configuration
221+
222+
Zero config by default. For AI analysis:
223+
224+
```bash
225+
agenttop init
226+
```
227+
228+
or manually:
146229

147230
```toml
148231
# ~/.agenttop/config.toml
149232
[llm]
150233
provider = "ollama" # ollama | anthropic | openai | openrouter
151-
model = "ollama/gemma3:4b"
152-
map_concurrency = 0 # 0 = auto
234+
model = "ollama/gemma3:4b" # any litellm-compatible model
153235
```
154236

155-
## no telemetry
237+
Environment variable overrides: `AGENTTOP_LLM_PROVIDER`, `AGENTTOP_LLM_MODEL`, `ANTHROPIC_API_KEY`.
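
A plausible way to layer these settings is defaults, then the `[llm]` table from the config file, then environment variables. The sketch below assumes that precedence and uses the README's example values as defaults; `resolve_llm_config` is a hypothetical name, not part of agenttop's API.

```python
import os

# Assumed defaults, copied from the example config above.
DEFAULTS = {"provider": "ollama", "model": "ollama/gemma3:4b"}
ENV_OVERRIDES = {
    "provider": "AGENTTOP_LLM_PROVIDER",
    "model": "AGENTTOP_LLM_MODEL",
}


def resolve_llm_config(file_config=None, environ=None):
    """Merge defaults, the parsed [llm] table, and environment overrides."""
    environ = os.environ if environ is None else environ
    config = {**DEFAULTS, **(file_config or {})}
    for key, variable in ENV_OVERRIDES.items():
        value = environ.get(variable)
        if value:  # unset or empty variables do not override
            config[key] = value
    return config


print(resolve_llm_config(
    file_config={"provider": "anthropic"},
    environ={"AGENTTOP_LLM_MODEL": "example-model"},  # made-up model name
))
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.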
238+
239+
## No Telemetry
156240

157-
zero. local only. ollama = nothing leaves your machine.
241+
Zero. No data collection. No cloud uploads. No analytics. Everything runs locally. With Ollama, nothing leaves your machine at all.
158242

159-
## license
243+
## License
160244

161245
Apache 2.0
246+
247+
## Contributors
248+
249+
Built with [@AbhilashSri](https://github.com/AbhilashSri) (workflow intelligence, code reviews), @Mohit and @Akshit (testing, UX).
