Commit fbe3cf4

feat: add hooks for auto-update prompts and change classification logic
- Introduced hooks in `hooks.json` for PostToolUse and SessionStart events to prompt users about knowledge graph updates based on commit and session changes.
- Implemented `change-classifier.ts` to classify updates based on structural changes, including SKIP, PARTIAL_UPDATE, ARCHITECTURE_UPDATE, and FULL_UPDATE actions.
- Added comprehensive tests for change classification in `change-classifier.test.ts` to ensure correct behavior across various scenarios.
- Created `fingerprint.ts` to manage file fingerprints, including content hashing, structural analysis, and comparison of fingerprints to detect changes.
- Developed tests for fingerprint extraction and comparison in `fingerprint.test.ts` to validate functionality and ensure accurate change detection.
1 parent d539fd4 commit fbe3cf4

10 files changed: +1362 -1 lines changed

Lines changed: 226 additions & 0 deletions
@@ -0,0 +1,226 @@
# Auto-Update Knowledge Graph (Internal — Hook-Triggered)

Incrementally update the knowledge graph using deterministic structural fingerprinting to minimize token usage. This prompt is triggered automatically by the plugin hooks (`PostToolUse` after a commit, `SessionStart` when the graph is stale) when `autoUpdate` is enabled. It is NOT a user-facing skill.

**Key principle:** Spend zero LLM tokens when changes are cosmetic (formatting, internal logic). Only invoke LLM agents when structural changes (new/removed functions, classes, imports, exports) are detected.

---

## Phase 0 — Pre-flight (Zero Token Cost)

1. Set `PROJECT_ROOT` to the current working directory.

2. Check that `$PROJECT_ROOT/.understand-anything/knowledge-graph.json` exists.
   - If not: report "No existing knowledge graph found. Run `/understand` first to create one." and **STOP**.

3. Check that `$PROJECT_ROOT/.understand-anything/meta.json` exists and read `gitCommitHash`.
   - If not: report "No analysis metadata found. Run `/understand` to create a baseline." and **STOP**.

4. Get current commit hash:
   ```bash
   git rev-parse HEAD
   ```

5. If the current commit hash equals the stored `gitCommitHash` and `--force` is NOT in `$ARGUMENTS`: report "Knowledge graph is already up to date." and **STOP**.

6. Get changed files, where `<lastCommitHash>` is the `gitCommitHash` read in step 3:
   ```bash
   git diff <lastCommitHash>..HEAD --name-only
   ```
   If no files changed: update `meta.json` with the new commit hash and **STOP**.

7. Filter to source files only (`.ts`, `.tsx`, `.js`, `.jsx`, `.py`, `.go`, `.rs`, `.java`, `.rb`, `.cpp`, `.c`, `.h`, `.cs`, `.swift`, `.kt`, `.php`).
   If no source files changed: update `meta.json` with the new commit hash, report "Only non-source files changed. Metadata updated." and **STOP**.

8. Create intermediate directory:
   ```bash
   mkdir -p "$PROJECT_ROOT/.understand-anything/intermediate"
   ```

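The source-file filter in step 7 could look roughly like the sketch below. The helper name `filterSourceFiles` is hypothetical and not part of the commit; only the extension list comes from the step itself. Files without an extension and dotfiles are assumed to be non-source.

```javascript
// Extensions listed in Phase 0, step 7.
const SOURCE_EXTENSIONS = new Set([
  ".ts", ".tsx", ".js", ".jsx", ".py", ".go", ".rs", ".java",
  ".rb", ".cpp", ".c", ".h", ".cs", ".swift", ".kt", ".php",
]);

// Hypothetical helper: keep only paths whose extension is in the list.
function filterSourceFiles(changedFiles) {
  return changedFiles.filter((file) => {
    const base = file.slice(file.lastIndexOf("/") + 1);
    const dot = base.lastIndexOf(".");
    // dot <= 0 covers "no extension" and dotfiles such as ".gitignore".
    return dot > 0 && SOURCE_EXTENSIONS.has(base.slice(dot));
  });
}

const changed = ["src/app.ts", "README.md", "scripts/build.py", ".gitignore"];
console.log(filterSourceFiles(changed)); // [ 'src/app.ts', 'scripts/build.py' ]
```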

---

## Phase 1 — Structural Fingerprint Check (Zero LLM Tokens)

This phase runs a deterministic Node.js script that compares file structures against stored fingerprints. It costs **zero LLM tokens**; the only cost is running the script.

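As a rough illustration of the per-file comparison, the sketch below hashes a file's content, pulls out function, class, and import names with simplified regexes, and classifies the change as NONE, COSMETIC, or STRUCTURAL. The function names, the fingerprint shape, and the regexes are assumptions made for this example (exports are omitted for brevity); they are not the commit's `fingerprint.ts`. It is written as CommonJS, so in an `.mjs` script the `require` line would become an `import`.

```javascript
const { createHash } = require("node:crypto");

// Collect unique captured names for a global regex, sorted for stable comparison.
function collect(source, regex) {
  const names = new Set();
  for (const match of source.matchAll(regex)) {
    names.add(match[1] ?? match[2]);
  }
  return [...names].sort();
}

// Hypothetical fingerprint shape: content hash plus signature-level elements.
function extractFingerprint(source) {
  return {
    contentHash: createHash("sha256").update(source).digest("hex"),
    functions: collect(source, /\bfunction\s+(\w+)\s*\(|\bconst\s+(\w+)\s*=\s*\(/g),
    classes: collect(source, /\bclass\s+(\w+)/g),
    imports: collect(source, /\bimport\s+(?:[^'"]*\sfrom\s+)?['"]([^'"]+)['"]/g),
  };
}

function classifyChange(stored, current) {
  if (!stored) return "STRUCTURAL"; // new file, no stored fingerprint
  if (stored.contentHash === current.contentHash) return "NONE";
  const sameStructure = ["functions", "classes", "imports"].every(
    (key) => JSON.stringify(stored[key]) === JSON.stringify(current[key])
  );
  return sameStructure ? "COSMETIC" : "STRUCTURAL";
}

const before = extractFingerprint("import x from './x.js';\nfunction run() { return 1; }\n");
const edited = extractFingerprint("import x from './x.js';\nfunction run() { return 2; }\n");
const grown = extractFingerprint("import x from './x.js';\nfunction run() { return 2; }\nfunction stop() {}\n");

console.log(classifyChange(before, before)); // NONE
console.log(classifyChange(before, edited)); // COSMETIC
console.log(classifyChange(before, grown));  // STRUCTURAL
```

Sorting the extracted names keeps the comparison independent of declaration order, so moving a function within a file counts as cosmetic in this sketch.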
1. Write and execute a Node.js script (`$PROJECT_ROOT/.understand-anything/intermediate/fingerprint-check.mjs`):

   ```javascript
   // The script should:
   // 1. Read fingerprints.json from .understand-anything/fingerprints.json
   // 2. For each changed source file:
   //    a. Read the file content
   //    b. Compute SHA-256 content hash
   //    c. If content hash matches stored hash → NONE (skip)
   //    d. Extract structural elements via regex:
   //       - Functions: match patterns like `function NAME(`, `const NAME = (`, `export function NAME(`
   //       - Classes: match `class NAME`, `export class NAME`
   //       - Imports: match `import ... from '...'`, `import '...'`
   //       - Exports: match `export { ... }`, `export default`, `export function`, `export class`, `export const`
   //    e. Compare extracted elements against stored fingerprint
   //    f. Classify as NONE, COSMETIC (hash differs, elements unchanged), or STRUCTURAL (elements differ)
   // 3. For new files (not in fingerprints.json): classify as STRUCTURAL
   // 4. For deleted files (in fingerprints.json but not on disk): classify as STRUCTURAL
   // 5. Determine overall decision (check FULL_UPDATE before ARCHITECTURE_UPDATE,
   //    since >30 structural files also satisfies >10):
   //    - All NONE/COSMETIC → action: "SKIP"
   //    - Some STRUCTURAL, ≤10 files, same directories → action: "PARTIAL_UPDATE"
   //    - New/deleted directories or >10 structural files → action: "ARCHITECTURE_UPDATE"
   //    - >30 structural files or >50% of graph → action: "FULL_UPDATE"
   // 6. Write result to .understand-anything/intermediate/change-analysis.json
   ```

   The output JSON should have this shape:
   ```json
   {
     "action": "SKIP | PARTIAL_UPDATE | ARCHITECTURE_UPDATE | FULL_UPDATE",
     "filesToReanalyze": ["src/new-feature.ts"],
     "rerunArchitecture": false,
     "rerunTour": false,
     "reason": "1 file has structural changes (new function added)",
     "fileChanges": [
       { "filePath": "src/utils.ts", "changeLevel": "COSMETIC", "details": ["internal logic changed"] },
       { "filePath": "src/new-feature.ts", "changeLevel": "STRUCTURAL", "details": ["new function: handleRequest"] }
     ]
   }
   ```

2. Read `.understand-anything/intermediate/change-analysis.json`.

3. **Decision gate:**

   | Action | What to do |
   |---|---|
   | `SKIP` | Update `meta.json` with new commit hash. Report: "No structural changes detected. Graph metadata updated. Zero tokens spent." **STOP.** |
   | `FULL_UPDATE` | Report: "Major structural changes detected (reason). Recommend running `/understand --full` for a complete rebuild." **STOP.** |
   | `PARTIAL_UPDATE` | Proceed to Phase 2 with `filesToReanalyze` |
   | `ARCHITECTURE_UPDATE` | Proceed to Phase 2 with `filesToReanalyze`, and flag the architecture re-run for Phase 3a |

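The overall decision in step 5 of the script outline might be computed along these lines. This is a sketch under stated assumptions, not the commit's `change-classifier.ts`: `decideAction`, `knownDirs` (directories already present in the graph), and `totalGraphFiles` are hypothetical names; ">50% of graph" is read here as "structural files exceed half of the files in the graph"; and only new directories are detected, not deleted ones.

```javascript
function dirOf(filePath) {
  const i = filePath.lastIndexOf("/");
  return i === -1 ? "." : filePath.slice(0, i);
}

// Hypothetical helper: thresholds are the ones listed in the script outline.
function decideAction(fileChanges, knownDirs, totalGraphFiles) {
  const structural = fileChanges.filter((c) => c.changeLevel === "STRUCTURAL");
  const count = structural.length;

  let action;
  if (count === 0) {
    action = "SKIP";
  } else if (count > 30 || count > totalGraphFiles * 0.5) {
    action = "FULL_UPDATE"; // checked first because >30 also satisfies >10
  } else if (count > 10 || structural.some((c) => !knownDirs.has(dirOf(c.filePath)))) {
    action = "ARCHITECTURE_UPDATE";
  } else {
    action = "PARTIAL_UPDATE";
  }

  return {
    action,
    filesToReanalyze: structural.map((c) => c.filePath),
    rerunArchitecture: action === "ARCHITECTURE_UPDATE",
    rerunTour: false, // assumption: this sketch never requests a tour re-run
    reason: `${count} file(s) have structural changes`,
    fileChanges,
  };
}

const result = decideAction(
  [
    { filePath: "src/utils.ts", changeLevel: "COSMETIC", details: ["internal logic changed"] },
    { filePath: "src/new-feature.ts", changeLevel: "STRUCTURAL", details: ["new function: handleRequest"] },
  ],
  new Set(["src"]),
  40
);
console.log(result.action); // PARTIAL_UPDATE
```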

---

## Phase 2 — Targeted Re-Analysis (Minimal Token Cost)

Only re-analyze files with structural changes. This is the **only** phase that costs LLM tokens.

1. Read the existing knowledge graph from `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`.

2. Batch the files from `filesToReanalyze` (from Phase 1). Use a single batch if ≤10 files, otherwise batch into groups of 5-10.

3. For each batch, dispatch a subagent using the prompt template at `../skills/understand/file-analyzer-prompt.md`. Read the template file and pass the full content as the subagent's prompt, appending:

   > **Additional context from main session:**
   >
   > Project: `<projectName from existing graph>` - `<projectDescription>`
   > Frameworks detected: `<frameworks from existing graph>`
   > Languages: `<languages from existing graph>`
   >
   > **IMPORTANT:** This is an incremental update. Only the files listed below have structural changes. Analyze them thoroughly but do not invent nodes for files not in this batch.

   Fill in batch-specific parameters (replace `<N>` with the batch index, starting at 1):

   > Analyze these source files and produce GraphNode and GraphEdge objects.
   > Project root: `$PROJECT_ROOT`
   > Project: `<projectName>`
   > Languages: `<languages>`
   > Batch index: `<N>`
   > Write output to: `$PROJECT_ROOT/.understand-anything/intermediate/batch-<N>.json`
   >
   > All project files (for import resolution):
   > `<file list from existing graph nodes>`
   >
   > Files to analyze in this batch:
   > 1. `<path>` (`<sizeLines>` lines)
   > ...

4. After batch(es) complete, read each `batch-<N>.json` and merge results.

5. **Merge with existing graph:**
   - Remove old nodes whose `filePath` matches any file in `filesToReanalyze` or in the deleted files list (files that Phase 1 found in `fingerprints.json` but not on disk)
   - Remove old edges whose `source` or `target` references a removed node
   - Add new nodes and edges from the fresh analysis
   - Deduplicate nodes by ID (keep latest), edges by `source + target + type`
   - Remove any edge with dangling `source` or `target` references

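The merge rules in step 5 can be sketched as a single function. The node and edge shapes (`{ id, filePath }` and `{ source, target, type }`) and the name `mergeGraph` are assumptions for illustration; the real GraphNode and GraphEdge types may carry more fields.

```javascript
// Hypothetical helper implementing the five merge rules in order.
function mergeGraph(existing, fresh, filesToReanalyze, deletedFiles) {
  const stale = new Set([...filesToReanalyze, ...deletedFiles]);

  // Rule 1: drop old nodes for re-analyzed or deleted files.
  const removedIds = new Set(
    existing.nodes.filter((n) => stale.has(n.filePath)).map((n) => n.id)
  );
  const keptNodes = existing.nodes.filter((n) => !removedIds.has(n.id));

  // Rule 2: drop old edges that touch a removed node.
  const keptEdges = existing.edges.filter(
    (e) => !removedIds.has(e.source) && !removedIds.has(e.target)
  );

  // Rules 3 and 4: add fresh results; later entries win on duplicate keys.
  const nodesById = new Map();
  for (const node of [...keptNodes, ...fresh.nodes]) nodesById.set(node.id, node);
  const edgesByKey = new Map();
  for (const edge of [...keptEdges, ...fresh.edges]) {
    edgesByKey.set(`${edge.source}|${edge.target}|${edge.type}`, edge);
  }

  // Rule 5: drop edges whose endpoints are missing after the merge.
  const edges = [...edgesByKey.values()].filter(
    (e) => nodesById.has(e.source) && nodesById.has(e.target)
  );

  return { ...existing, nodes: [...nodesById.values()], edges };
}

const existing = {
  nodes: [
    { id: "file:src/a.ts", filePath: "src/a.ts" },
    { id: "file:src/b.ts", filePath: "src/b.ts" },
  ],
  edges: [{ source: "file:src/a.ts", target: "file:src/b.ts", type: "imports" }],
};
const fresh = {
  nodes: [{ id: "file:src/a.ts", filePath: "src/a.ts", summary: "updated" }],
  edges: [], // src/a.ts no longer imports src/b.ts
};
const merged = mergeGraph(existing, fresh, ["src/a.ts"], []);
console.log(merged.nodes.length, merged.edges.length); // 2 0
```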

---

## Phase 3 — Conditional Architecture/Tour + Save

### 3a. Architecture update (only if `rerunArchitecture === true`)

If the change analysis flagged `ARCHITECTURE_UPDATE`:

1. Dispatch a subagent using the prompt template at `../skills/understand/architecture-analyzer-prompt.md`, passing the full merged node set and import edges. Include previous layer definitions for naming consistency:

   > Previous layer definitions (for naming consistency):
   > ```json
   > [previous layers from existing graph]
   > ```
   > Maintain the same layer names and IDs where possible. Only add/remove layers if the file structure has materially changed.

2. After completion, read and normalize layers (same normalization as `/understand` Phase 4).

3. Optionally re-run tour builder if layers changed significantly.

### 3b. Lite layer update (if `rerunArchitecture === false`)

If only a partial update:
1. For **new files**: assign them to the most likely existing layer based on directory path matching
2. For **deleted files**: remove their IDs from layer `nodeIds` arrays
3. Remove any layer that ends up with zero `nodeIds`

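One possible reading of "directory path matching" is to place each new file in the layer whose existing members share the most leading directory segments with it. The sketch below takes that approach; the heuristic, the layer shape `{ id, name, nodeIds }`, and the name `liteLayerUpdate` are assumptions for illustration, not something the commit specifies.

```javascript
function dirOf(filePath) {
  const i = filePath.lastIndexOf("/");
  return i === -1 ? "" : filePath.slice(0, i);
}

// Number of leading path segments two directories have in common.
function commonPrefixLength(a, b) {
  const as = a.split("/");
  const bs = b.split("/");
  let n = 0;
  while (n < as.length && n < bs.length && as[n] === bs[n]) n++;
  return n;
}

// Hypothetical helper covering the three steps of the lite layer update.
function liteLayerUpdate(layers, nodes, newFileIds, deletedFileIds) {
  const pathById = new Map(nodes.map((n) => [n.id, n.filePath]));
  const deleted = new Set(deletedFileIds);

  // Step 2: remove deleted file IDs from every layer.
  const updated = layers.map((layer) => ({
    ...layer,
    nodeIds: layer.nodeIds.filter((id) => !deleted.has(id)),
  }));

  // Step 1: assign each new file to the best-matching existing layer.
  for (const id of newFileIds) {
    const dir = dirOf(pathById.get(id));
    let best = null;
    let bestScore = 0;
    for (const layer of updated) {
      for (const memberId of layer.nodeIds) {
        const memberPath = pathById.get(memberId);
        if (!memberPath) continue;
        const score = commonPrefixLength(dir, dirOf(memberPath));
        if (score > bestScore) {
          best = layer;
          bestScore = score;
        }
      }
    }
    // Files with no match stay unassigned; validation can add them to a catch-all layer.
    if (best) best.nodeIds.push(id);
  }

  // Step 3: drop layers that ended up empty.
  return updated.filter((layer) => layer.nodeIds.length > 0);
}

const nodes = [
  { id: "file:src/api/users.ts", filePath: "src/api/users.ts" },
  { id: "file:src/ui/App.tsx", filePath: "src/ui/App.tsx" },
  { id: "file:src/api/orders.ts", filePath: "src/api/orders.ts" }, // new file
];
const layers = [
  { id: "layer:api", name: "API", nodeIds: ["file:src/api/users.ts"] },
  { id: "layer:ui", name: "UI", nodeIds: ["file:src/ui/App.tsx", "file:src/ui/Old.tsx"] },
];
const result = liteLayerUpdate(layers, nodes, ["file:src/api/orders.ts"], ["file:src/ui/Old.tsx"]);
console.log(result.map((l) => `${l.name}: ${l.nodeIds.length}`)); // [ 'API: 2', 'UI: 1' ]
```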
### 3c. Lite validation

Perform lightweight validation (no graph-reviewer agent):
1. Remove any edge with dangling `source` or `target`
2. Remove any layer `nodeIds` entry that doesn't exist in the node set
3. Ensure every file node appears in exactly one layer (add to a catch-all layer if missing)

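The three validation steps could be sketched as below. The check `type === "file"` for identifying file nodes and the catch-all layer's ID and name (`layer:other`, `Other`) are assumptions for this example; the commit does not name them.

```javascript
// Hypothetical helper: nodes { id, type }, edges { source, target }, layers { id, name, nodeIds }.
function liteValidate(graph) {
  const nodeIds = new Set(graph.nodes.map((n) => n.id));

  // Step 1: drop edges whose source or target is missing.
  const edges = graph.edges.filter((e) => nodeIds.has(e.source) && nodeIds.has(e.target));

  // Step 2: drop layer entries not in the node set; keep only the first layer per node.
  const assigned = new Set();
  const layers = graph.layers.map((layer) => ({
    ...layer,
    nodeIds: layer.nodeIds.filter((id) => {
      if (!nodeIds.has(id) || assigned.has(id)) return false;
      assigned.add(id);
      return true;
    }),
  }));

  // Step 3: put unassigned file nodes into a catch-all layer.
  const orphans = graph.nodes
    .filter((n) => n.type === "file" && !assigned.has(n.id))
    .map((n) => n.id);
  if (orphans.length > 0) {
    layers.push({ id: "layer:other", name: "Other", nodeIds: orphans });
  }

  return { ...graph, edges, layers };
}

const validated = liteValidate({
  nodes: [
    { id: "file:a.ts", type: "file" },
    { id: "file:b.ts", type: "file" },
  ],
  edges: [
    { source: "file:a.ts", target: "file:b.ts", type: "imports" },
    { source: "file:a.ts", target: "file:gone.ts", type: "imports" },
  ],
  layers: [{ id: "layer:core", name: "Core", nodeIds: ["file:a.ts", "file:gone.ts"] }],
});
console.log(validated.edges.length); // 1
console.log(validated.layers.map((l) => l.nodeIds)); // [ [ 'file:a.ts' ], [ 'file:b.ts' ] ]
```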
### 3d. Save

1. Write the final knowledge graph to `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`.

2. Write updated metadata to `$PROJECT_ROOT/.understand-anything/meta.json`:
   ```json
   {
     "lastAnalyzedAt": "<ISO 8601 timestamp>",
     "gitCommitHash": "<current commit hash>",
     "version": "1.0.0",
     "analyzedFiles": <total file count in graph>
   }
   ```

3. **Update fingerprints:** Write and execute a Node.js script that:
   - Reads the existing `fingerprints.json`
   - For each re-analyzed file: computes new content hash and extracts structural elements via regex
   - For deleted files: removes their entries
   - Merges with existing fingerprints (keep unchanged files as-is)
   - Writes updated `fingerprints.json`

4. Clean up intermediate files:
   ```bash
   rm -rf "$PROJECT_ROOT/.understand-anything/intermediate"
   ```

5. Report a summary:
   - Files checked: N (total changed)
   - Structural changes found: N files
   - Cosmetic-only changes: N files (skipped)
   - Nodes updated: N
   - Action taken: PARTIAL_UPDATE / ARCHITECTURE_UPDATE
   - Path to output: `$PROJECT_ROOT/.understand-anything/knowledge-graph.json`

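The fingerprint update in step 3 amounts to a keyed merge. The sketch below assumes `fingerprints.json` is an object keyed by file path, which the commit does not confirm; `updateFingerprints` is a hypothetical name, and `fingerprintOf` stands in for the hash and regex extraction step.

```javascript
// Hypothetical helper: returns a new fingerprint map without mutating the stored one.
function updateFingerprints(existing, reanalyzedFiles, deletedFiles, fingerprintOf) {
  const updated = { ...existing }; // unchanged files keep their stored entry
  for (const file of reanalyzedFiles) updated[file] = fingerprintOf(file);
  for (const file of deletedFiles) delete updated[file];
  return updated;
}

const stored = {
  "src/a.ts": { contentHash: "old-a", functions: ["run"] },
  "src/b.ts": { contentHash: "old-b", functions: [] },
  "src/c.ts": { contentHash: "old-c", functions: [] },
};
const next = updateFingerprints(stored, ["src/a.ts"], ["src/c.ts"], () => ({
  contentHash: "new-a",
  functions: ["run", "stop"],
}));
console.log(Object.keys(next)); // [ 'src/a.ts', 'src/b.ts' ]
console.log(next["src/a.ts"].contentHash); // new-a
```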

---

## Error Handling

- If the fingerprint check script fails: fall back to treating all changed files as STRUCTURAL (conservative approach).
- If `fingerprints.json` doesn't exist: treat all changed files as STRUCTURAL and regenerate fingerprints after the update.
- If a subagent dispatch fails: retry once. If it fails again, save partial results and report the error.
- ALWAYS save partial results: a partially updated graph is better than no update.

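The retry-once rule with partial results might be structured as follows. `runBatches` and `dispatch` are hypothetical names; how a subagent is actually dispatched depends on the host environment and is not shown in the commit.

```javascript
// Hypothetical dispatcher contract: dispatch(batch) resolves with a result or rejects.
async function runBatches(batches, dispatch) {
  const results = [];
  const errors = [];
  for (const batch of batches) {
    try {
      results.push(await dispatch(batch));
    } catch {
      try {
        results.push(await dispatch(batch)); // single retry
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        errors.push({ batch, message });
      }
    }
  }
  // Successful batches are returned even when others failed, so they can still be saved.
  return { results, errors };
}

let calls = 0;
const flaky = async (batch) => {
  calls += 1;
  if (batch === "b") throw new Error("agent unavailable"); // always fails
  if (calls === 1) throw new Error("transient"); // fails on the first attempt only
  return `${batch}-ok`;
};

runBatches(["a", "b"], flaky).then(({ results, errors }) => {
  console.log(results); // [ 'a-ok' ]
  console.log(errors.length); // 1
});
```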

---

## Notes

- This prompt reuses the same `file-analyzer-prompt.md` and `architecture-analyzer-prompt.md` as `/understand`, so no separate agent prompts are needed.
- The fingerprint comparison in Phase 1 uses regex-based extraction (not tree-sitter) because it runs as a temporary Node.js script and needs only signature-level detection, not full AST accuracy.
- The authoritative fingerprints stored in `fingerprints.json` are generated by `/understand` Phase 7 using the core `fingerprint.ts` module (which uses tree-sitter for precise extraction).
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "echo \"$TOOL_INPUT\" | grep -qE 'git\\s+(commit|merge|cherry-pick|rebase)' && [ -f .understand-anything/config.json ] && grep -q '\"autoUpdate\".*true' .understand-anything/config.json && [ -f .understand-anything/knowledge-graph.json ] && echo '[understand-anything] Commit detected with auto-update enabled. You MUST read the file at ${PLUGIN_DIR}/hooks/auto-update-prompt.md and execute its instructions to incrementally update the knowledge graph. Do not ask the user for confirmation — just do it.' || true"
          }
        ]
      }
    ],
    "SessionStart": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "[ -f .understand-anything/config.json ] && grep -q '\"autoUpdate\".*true' .understand-anything/config.json && [ -f .understand-anything/meta.json ] && [ -f .understand-anything/knowledge-graph.json ] && [ \"$(sed -n 's/.*\"gitCommitHash\".*\"\\([^\"]*\\)\".*/\\1/p' .understand-anything/meta.json)\" != \"$(git rev-parse HEAD 2>/dev/null)\" ] && echo '[understand-anything] Knowledge graph is stale. You MUST read the file at ${PLUGIN_DIR}/hooks/auto-update-prompt.md and execute its instructions to check for structural changes and update the graph. Do not ask the user for confirmation — just do it.' || true"
          }
        ]
      }
    ]
  }
}
