| 1 | +# Semantic diff feasibility for Hunk |
| 2 | + |
| 3 | +## Recommendation |
| 4 | + |
| 5 | +**Do not replace Hunk's default Pierre-based diff pipeline.** If we pursue semantic diffs, the safest path is an **optional backend** behind a flag such as `--semantic`, with normal Pierre line diffs remaining the default. |
| 6 | + |
| 7 | +Recommended order: |
| 8 | + |
| 9 | +1. **Prototype an adapter around difftastic** for supported files only. |
| 10 | +2. Keep clear fallback to Hunk's existing Pierre/text diff path. |
| 11 | +3. Only consider an in-process semantic engine if the prototype proves the UX win is worth the added startup, packaging, and maintenance cost. |
| 12 | + |
| 13 | +## Why this should be optional, not default |
| 14 | + |
| 15 | +Hunk is optimized for: |
| 16 | + |
| 17 | +- fast startup |
| 18 | +- multi-file review streams |
| 19 | +- responsive keyboard/mouse navigation |
| 20 | +- agent-note anchoring by file/hunk/line |
| 21 | +- predictable split/stack rendering from one normalized diff model |
| 22 | + |
| 23 | +Semantic diffing helps most when a reviewer is dealing with: |
| 24 | + |
| 25 | +- heavy reformatting |
| 26 | +- moved expressions within a line or nested structure |
| 27 | +- syntax where line diffs are especially noisy |
| 28 | + |
| 29 | +But it also adds real costs: |
| 30 | + |
| 31 | +- slower startup on supported languages |
| 32 | +- more parser/runtime dependencies |
| 33 | +- more fallback cases |
| 34 | +- more work to preserve Hunk's current review-stream semantics |
| 35 | + |
| 36 | +That makes semantic diff a better **mode** than a new universal default. |
| 37 | + |
| 38 | +## How difftastic works today |
| 39 | + |
| 40 | +From the difftastic README and source: |
| 41 | + |
| 42 | +- It detects language from file path/source hints. |
| 43 | +- It parses supported files with **tree-sitter**. |
| 44 | +- It converts each parse tree into its own simplified syntax model made of **lists** and **atoms**. |
| 45 | +- It computes a structural diff as a **graph problem** and uses **Dijkstra's algorithm** to find a low-cost path through the two syntax trees. |
| 46 | +- It applies language-aware cleanup passes such as its **slider** logic so the final diff matches human intuition better. |
| 47 | +- It falls back to **line-oriented diffing** when: |
| 48 | + - the language is unsupported |
| 49 | + - a file exceeds `--byte-limit` |
| 50 | + - parse errors exceed `--parse-error-limit` |
| 51 | + - the structural diff graph exceeds `--graph-limit` |
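
The fallback conditions above could be mirrored on the Hunk side as a small decision function. This is a minimal sketch, not difftastic's implementation: the `FileFacts`, `SemanticLimits`, and `chooseBackend` names are hypothetical, and the sketch assumes the parse-error count and graph size are already known, whereas in practice the graph size only becomes known while the search runs.

```typescript
// Hypothetical names throughout; only the four fallback conditions come from the text above.
type FallbackReason =
  | "unsupported-language"
  | "byte-limit"
  | "parse-error-limit"
  | "graph-limit";

interface SemanticLimits {
  byteLimit: number;
  parseErrorLimit: number;
  graphLimit: number;
}

interface FileFacts {
  language: string | null; // null when no supported language was detected
  byteLength: number;
  parseErrors: number;
  graphNodes: number; // assumed to be known up front for this sketch
}

type BackendChoice =
  | { backend: "semantic" }
  | { backend: "line"; reason: FallbackReason };

// Checks the cheapest conditions first and reports why a file fell back.
function chooseBackend(file: FileFacts, limits: SemanticLimits): BackendChoice {
  if (file.language === null) {
    return { backend: "line", reason: "unsupported-language" };
  }
  if (file.byteLength > limits.byteLimit) {
    return { backend: "line", reason: "byte-limit" };
  }
  if (file.parseErrors > limits.parseErrorLimit) {
    return { backend: "line", reason: "parse-error-limit" };
  }
  if (file.graphNodes > limits.graphLimit) {
    return { backend: "line", reason: "graph-limit" };
  }
  return { backend: "semantic" };
}
```

Returning the reason alongside the choice keeps the fallback visible to the UI instead of silently switching modes.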
| 52 | + |
| 53 | +Relevant implementation details from difftastic's source: |
| 54 | + |
| 55 | +- `src/parse/tree_sitter_parser.rs` |
| 56 | + - defines per-language parser config |
| 57 | + - maintains delimiter tokens and forced atom nodes |
| 58 | + - supports limited embedded sub-languages (for example HTML subtrees) |
| 59 | +- `src/diff/graph.rs` |
| 60 | + - builds the tree-diff graph |
| 61 | +- `src/diff/dijkstra.rs` |
| 62 | + - runs shortest-path search with a graph-size cap |
| 63 | +- `src/diff/sliders.rs` |
| 64 | + - adjusts valid-but-ugly structural matches into more readable ones |
| 65 | +- `src/line_parser.rs` |
| 66 | + - supplies the text fallback path |
| 67 | +- `src/main.rs` |
| 68 | + - wires together language detection, parse/fallback decisions, and rendering |
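
To make the "diff as shortest path with a graph-size cap" idea concrete, here is a deliberately simplified sketch. It diffs two flat token lists rather than nested syntax trees, so it is not difftastic's algorithm; the edge costs, the `shortestPathDiff` name, and the way the cap is counted are all assumptions made for illustration.

```typescript
// Simplified illustration only: flat token lists instead of syntax trees.
type Edit =
  | { kind: "unchanged"; token: string }
  | { kind: "removed"; token: string }
  | { kind: "added"; token: string };

// Hypothetical costs: keeping a token is cheap, adding or removing one is expensive.
const COST_UNCHANGED = 1;
const COST_NOVEL = 10;

// Vertex (i, j) means "i left tokens and j right tokens consumed", encoded as i * width + j.
// Returns null when more than `graphLimit` vertices get expanded, so the caller can fall back.
function shortestPathDiff(left: string[], right: string[], graphLimit: number): Edit[] | null {
  const width = right.length + 1;
  const goal = left.length * width + right.length;
  const dist = new Map<number, number>([[0, 0]]);
  const prev = new Map<number, { from: number; edit: Edit }>();
  const done = new Set<number>();
  const frontier: number[] = [0];

  while (frontier.length > 0) {
    // Linear scan for the cheapest pending vertex; real code would use a heap.
    let best = 0;
    for (let k = 1; k < frontier.length; k++) {
      if (dist.get(frontier[k])! < dist.get(frontier[best])!) best = k;
    }
    const node = frontier.splice(best, 1)[0];
    if (done.has(node)) continue; // stale duplicate entry
    done.add(node);
    if (node === goal) break;
    if (done.size > graphLimit) return null;

    const i = Math.floor(node / width);
    const j = node % width;
    const steps: { to: number; cost: number; edit: Edit }[] = [];
    if (i < left.length && j < right.length && left[i] === right[j]) {
      steps.push({ to: node + width + 1, cost: COST_UNCHANGED, edit: { kind: "unchanged", token: left[i] } });
    }
    if (i < left.length) {
      steps.push({ to: node + width, cost: COST_NOVEL, edit: { kind: "removed", token: left[i] } });
    }
    if (j < right.length) {
      steps.push({ to: node + 1, cost: COST_NOVEL, edit: { kind: "added", token: right[j] } });
    }
    for (const step of steps) {
      const candidate = dist.get(node)! + step.cost;
      if (candidate < (dist.get(step.to) ?? Infinity)) {
        dist.set(step.to, candidate);
        prev.set(step.to, { from: node, edit: step.edit });
        frontier.push(step.to);
      }
    }
  }

  // Walk back from the goal to recover the edit sequence.
  const edits: Edit[] = [];
  let cursor = goal;
  while (cursor !== 0) {
    const step = prev.get(cursor);
    if (!step) return null;
    edits.push(step.edit);
    cursor = step.from;
  }
  return edits.reverse();
}
```

Even in this toy form, the number of vertices grows with the product of the two input sizes, which is why a hard cap with a line-diff fallback matters.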
| 69 | + |
| 70 | +## What seems reusable vs not reusable |
| 71 | + |
| 72 | +### Reusable ideas |
| 73 | + |
| 74 | +These are the parts Hunk can learn from directly: |
| 75 | + |
| 76 | +- **Tree-sitter as the parser boundary** |
| 77 | +- **Per-language delimiter/atom tuning** |
| 78 | +- **Hard fallback limits** for byte size, parse errors, and graph size |
| 79 | +- **Optional semantic mode** instead of forcing structural diff everywhere |
| 80 | +- **Language-specific post-processing** to make structural diffs match reviewer intuition |
| 81 | +- **Sub-language parsing** for embedded code blocks where practical |
| 82 | + |
| 83 | +### Potentially reusable implementation surface |
| 84 | + |
| 85 | +There are two practical ways to reuse difftastic itself: |
| 86 | + |
| 87 | +#### 1. Shell out to the `difft` binary |
| 88 | + |
| 89 | +Pros: |
| 90 | + |
| 91 | +- fastest prototype |
| 92 | +- immediately benefits from difftastic's mature parser and matcher |
| 93 | +- no need to port graph/AST logic into Bun/TypeScript first |
| 94 | + |
| 95 | +Cons: |
| 96 | + |
| 97 | +- adds an external runtime dependency |
| 98 | +- requires process spawning per file or per diff |
| 99 | +- machine-readable output is **explicitly unstable** today: JSON output is only available when `DFT_UNSTABLE=yes` is set
| 100 | +- hard to make this feel like a first-class built-in Hunk capability if the binary is missing |
| 101 | + |
| 102 | +#### 2. Build or vendor a dedicated in-process semantic backend |
| 103 | + |
| 104 | +Pros: |
| 105 | + |
| 106 | +- full control over data model and caching |
| 107 | +- easier to integrate tightly with Hunk's renderer and note model |
| 108 | +- no external binary requirement |
| 109 | + |
| 110 | +Cons: |
| 111 | + |
| 112 | +- effectively a new diff engine project |
| 113 | +- large parser/dependency footprint |
| 114 | +- higher startup and maintenance burden |
| 115 | +- we'd be rebuilding a lot of difftastic's hard-won language heuristics |
| 116 | + |
| 117 | +### What is **not** directly reusable |
| 118 | + |
| 119 | +Difftastic's current terminal output is not the right abstraction for Hunk. Hunk needs structured data, not rendered text. |
| 120 | + |
| 121 | +Even difftastic's JSON mode is currently best treated as a **prototype input**, not a stable long-term contract. |
| 122 | + |
| 123 | +So the reusable asset is mainly: |
| 124 | + |
| 125 | +- the semantic engine behavior |
| 126 | +- the parser heuristics |
| 127 | +- the fallback strategy |
| 128 | +- the idea of a structured semantic row/chunk format |
| 129 | + |
| 130 | +not its current CLI presentation layer. |
| 131 | + |
| 132 | +## What this means for Hunk's architecture |
| 133 | + |
| 134 | +Hunk currently normalizes everything into a `DiffFile` with: |
| 135 | + |
| 136 | +- `patch` |
| 137 | +- `stats` |
| 138 | +- `language` |
| 139 | +- Pierre `FileDiffMetadata` |
| 140 | +- optional agent file context |
| 141 | + |
| 142 | +That works well because both split and stack rendering derive from one line/hunk-oriented model. |
| 143 | + |
| 144 | +Semantic diff support would need **one more abstraction layer** before rendering. |
| 145 | + |
| 146 | +### Likely new model shape |
| 147 | + |
| 148 | +Instead of making the UI talk directly to Pierre metadata forever, Hunk would likely need something like: |
| 149 | + |
| 150 | +- `backend: "pierre" | "semantic"` |
| 151 | +- per-file normalized sections/chunks |
| 152 | +- row-level display tokens for split/stack |
| 153 | +- stable line mappings back to old/new line numbers |
| 154 | +- hunk/chunk ids that agent notes and navigation can target |
| 155 | + |
| 156 | +In other words, semantic diff support is not just a loader change. It pushes Hunk toward a more explicit **render-model layer** above the raw diff engine. |
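
One possible shape for that render-model layer is sketched below. Every type and field name here is hypothetical and not part of Hunk today; the sketch only illustrates how a backend tag, chunk ids, and old/new line mappings could sit together so that notes and navigation have something stable to target.

```typescript
// Hypothetical render model; none of these names exist in Hunk today.
type DiffBackend = "pierre" | "semantic";

interface RenderToken {
  text: string;
  change: "unchanged" | "added" | "removed";
}

interface RenderRow {
  oldLine: number | null; // null when the row has no old-side line
  newLine: number | null; // null when the row has no new-side line
  tokens: RenderToken[];
}

interface RenderChunk {
  id: string; // target for navigation and agent notes
  rows: RenderRow[];
}

interface RenderFile {
  path: string;
  backend: DiffBackend;
  chunks: RenderChunk[];
}

// Resolves a note anchored to a new-side line number to the chunk that displays it.
function findChunkForNewLine(file: RenderFile, newLine: number): RenderChunk | undefined {
  return file.chunks.find((chunk) => chunk.rows.some((row) => row.newLine === newLine));
}
```

Because both backends would produce the same `RenderFile` shape, split and stack rendering would not need to know which engine generated a given file.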
| 157 | + |
| 158 | +## UX fit with Hunk |
| 159 | + |
| 160 | +### What fits well |
| 161 | + |
| 162 | +Semantic diff could fit Hunk well when used as: |
| 163 | + |
| 164 | +- an optional review mode |
| 165 | +- a per-file backend for supported languages |
| 166 | +- a way to reduce noise in refactors/reformatting-heavy reviews |
| 167 | + |
| 168 | +Hunk's current strengths still map cleanly: |
| 169 | + |
| 170 | +- sidebar stays file-oriented |
| 171 | +- main pane stays a top-to-bottom multi-file review stream |
| 172 | +- split and stack layouts can remain terminal-native |
| 173 | +- agent notes can still live beside the code if we preserve line/chunk anchors |
| 174 | + |
| 175 | +### What gets harder |
| 176 | + |
| 177 | +These parts become materially harder: |
| 178 | + |
| 179 | +- line-number accuracy when structure and display rows diverge |
| 180 | +- stable hunk ids for `[` and `]` navigation |
| 181 | +- agent-note anchoring when semantic chunks do not correspond 1:1 to patch hunks |
| 182 | +- caching and lazy loading without hurting startup |
| 183 | +- preserving pager-mode simplicity |
| 184 | + |
| 185 | +## Performance and startup implications |
| 186 | + |
| 187 | +This is the biggest product risk. |
| 188 | + |
| 189 | +Difftastic itself documents performance as a known weakness on files with many changes. Its source also shows why: |
| 190 | + |
| 191 | +- tree-sitter parse work on both sides |
| 192 | +- conversion into a custom syntax tree |
| 193 | +- graph construction |
| 194 | +- shortest-path search with explicit graph limits |
| 195 | +- post-processing passes for readability |
| 196 | + |
| 197 | +For Hunk, that means: |
| 198 | + |
| 199 | +- worse cold start than today's Pierre path on supported files |
| 200 | +- more variance based on language/parser quality |
| 201 | +- possible worst-case cliffs on large, churn-heavy diffs |
| 202 | +- more work to keep the review stream interactive while semantic results load |
| 203 | + |
| 204 | +A semantic mode therefore probably needs: |
| 205 | + |
| 206 | +- hard per-file size/change limits |
| 207 | +- async/lazy loading per file |
| 208 | +- visible fallback behavior |
| 209 | +- benchmarking against real Hunk review workloads, not just one file in isolation |
| 210 | + |
| 211 | +## Dependency, packaging, and maintenance tradeoffs |
| 212 | + |
| 213 | +### External difftastic backend |
| 214 | + |
| 215 | +- Hunk stays relatively small |
| 216 | +- install story gets worse unless `difft` is optional |
| 217 | +- compiled Hunk binary would no longer be self-contained for semantic mode |
| 218 | + |
| 219 | +### In-process tree-sitter backend |
| 220 | + |
| 221 | +- much larger dependency surface |
| 222 | +- likely more bundled grammars or parser assets |
| 223 | +- more binary size and startup cost |
| 224 | +- more parser breakage to own over time |
| 225 | + |
| 226 | +### Licensing |
| 227 | + |
| 228 | +- Hunk is MIT |
| 229 | +- difftastic is MIT |
| 230 | +- difftastic's vendored parsers include a mix of MIT and Apache licenses |
| 231 | + |
| 232 | +That does not look like a blocker, but it does mean vendoring/parser packaging would need care. |
| 233 | + |
| 234 | +## Best product shape for Hunk |
| 235 | + |
| 236 | +The best product shape appears to be: |
| 237 | + |
| 238 | +### Phase 1: optional experimental mode |
| 239 | + |
| 240 | +- `hunk git --semantic` |
| 241 | +- enabled only for supported text languages |
| 242 | +- falls back per file to today's Pierre path when unsupported/too large/too slow |
| 243 | +- start with **split view only** if that keeps the prototype small, then add stack view once the model settles
| 244 | + |
| 245 | +### Phase 2: per-file backend selection in the review stream |
| 246 | + |
| 247 | +- preserve one review stream |
| 248 | +- some files can be semantic, others Pierre/text fallback |
| 249 | +- keep sidebar ordering and navigation unchanged |
| 250 | + |
| 251 | +### Phase 3: smarter defaults |
| 252 | + |
| 253 | +Only if performance and correctness are good enough: |
| 254 | + |
| 255 | +- auto-enable semantic mode for specific languages or small files |
| 256 | +- keep an easy global off switch |
| 257 | + |
| 258 | +## Practical implementation options |
| 259 | + |
| 260 | +### Option A: prototype via `difft --display json` (recommended first) |
| 261 | + |
| 262 | +Use difftastic as an external engine and translate its JSON output into a Hunk-specific semantic render model. |
| 263 | + |
| 264 | +Pros: |
| 265 | + |
| 266 | +- smallest implementation investment |
| 267 | +- best way to validate UX value quickly |
| 268 | +- lets us answer whether semantic diffs are worth deeper investment |
| 269 | + |
| 270 | +Cons: |
| 271 | + |
| 272 | +- unstable upstream JSON contract today |
| 273 | +- external binary dependency |
| 274 | +- likely awkward error/fallback handling |
| 275 | + |
| 276 | +This is still the best first step. |
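
A thin adapter for Option A might look like the sketch below. The `difft --display json` invocation and the `DFT_UNSTABLE=yes` variable come from the notes above; everything else is an assumption. In particular, the `Runner` callback is a hypothetical seam standing in for whatever process API Hunk would use, and the parsed output is kept as `unknown` because the upstream JSON shape is not a stable contract.

```typescript
// Hypothetical adapter; the process-spawning call is injected so the sketch stays runtime-neutral.
interface RunResult {
  ok: boolean; // true when the process exited successfully
  stdout: string;
}

type Runner = (command: string, args: string[], env: Record<string, string>) => RunResult;

type SemanticLoad =
  | { backend: "semantic"; raw: unknown } // raw difftastic JSON, schema intentionally not assumed
  | { backend: "pierre"; reason: string };

// Tries difftastic first and reports a reason whenever Hunk should use its Pierre path instead.
function loadSemanticDiff(oldPath: string, newPath: string, run: Runner): SemanticLoad {
  let result: RunResult;
  try {
    result = run("difft", ["--display", "json", oldPath, newPath], { DFT_UNSTABLE: "yes" });
  } catch {
    return { backend: "pierre", reason: "difft could not be started" };
  }
  if (!result.ok) {
    return { backend: "pierre", reason: "difft exited with an error" };
  }
  try {
    return { backend: "semantic", raw: JSON.parse(result.stdout) };
  } catch {
    return { backend: "pierre", reason: "difft output was not valid JSON" };
  }
}
```

Keeping the upstream payload opaque at this boundary means only one translation step has to change if difftastic's JSON format moves.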
| 277 | + |
| 278 | +### Option B: upstream/stabilize a machine-readable difftastic API |
| 279 | + |
| 280 | +Instead of treating difftastic JSON as a private interface, push toward a stable output contract or library boundary. |
| 281 | + |
| 282 | +Pros: |
| 283 | + |
| 284 | +- better long-term reuse story |
| 285 | +- less risk of Hunk chasing CLI-output changes |
| 286 | + |
| 287 | +Cons: |
| 288 | + |
| 289 | +- depends on upstream collaboration |
| 290 | +- slower path to a prototype |
| 291 | + |
| 292 | +### Option C: build Hunk's own semantic backend |
| 293 | + |
| 294 | +Likely only worth considering after Phase 1 proves semantic diff is important to the product. |
| 295 | + |
| 296 | +Pros: |
| 297 | + |
| 298 | +- best long-term integration |
| 299 | +- full control over model and performance tradeoffs |
| 300 | + |
| 301 | +Cons: |
| 302 | + |
| 303 | +- highest cost by far |
| 304 | +- likely months of work if language coverage matters |
| 305 | + |
| 306 | +## Suggested incremental path |
| 307 | + |
| 308 | +1. **Add a design-only abstraction** in Hunk for multiple diff backends at the file level. |
| 309 | +2. **Prototype a hidden adapter** that shells out to `difft --display json` for `hunk diff <left> <right>` on one supported file. |
| 310 | +3. Normalize that into a Hunk-owned semantic row/chunk model. |
| 311 | +4. Render it in **split view first**. |
| 312 | +5. Measure: |
| 313 | + - startup latency |
| 314 | + - per-file load latency |
| 315 | + - huge-file fallback behavior |
| 316 | + - navigation correctness |
| 317 | + - note anchoring quality |
| 318 | +6. If the UX win is real, extend to stack view and multi-file review streams. |
| 319 | +7. Only then decide whether to keep the external dependency, push on upstream integration, or invest in an in-process engine. |
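
The latency measurements in step 5 could be compared across backends with a small summary helper. This is a sketch under assumptions: the `percentile` and `summarize` names are hypothetical, the samples are assumed to be in milliseconds, and nearest-rank percentiles are just one reasonable choice.

```typescript
// Hypothetical helper for comparing latency samples between the Pierre and semantic paths.
interface LatencySummary {
  median: number;
  p95: number;
}

// Nearest-rank percentile over a copy of the samples; throws on empty input.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p / 100) * sorted.length));
  return sorted[rank - 1];
}

function summarize(samplesMs: number[]): LatencySummary {
  return { median: percentile(samplesMs, 50), p95: percentile(samplesMs, 95) };
}
```

Reporting a tail percentile next to the median helps surface the worst-case cliffs on large, churn-heavy diffs that an average would hide.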
| 320 | + |
| 321 | +## Bottom line |
| 322 | + |
| 323 | +Semantic diff support looks **worth exploring**, but not as a default replacement for Hunk's current diff engine. |
| 324 | + |
| 325 | +The best next step is a **small experimental PR or branch** that treats difftastic as an optional backend and proves three things: |
| 326 | + |
| 327 | +1. the review UX is materially better on real refactor-heavy diffs |
| 328 | +2. line/chunk anchoring still works for Hunk's navigation and notes |
| 329 | +3. startup and per-file latency stay acceptable with strict fallbacks |
| 330 | + |
| 331 | +If those three do not hold, Hunk should keep semantic diff as a research path rather than a shipped core feature. |