Speed up snapshot JSON generation #325

Merged
justrach merged 1 commit into release/0.2.579 from fix/snapshot-json-performance on Apr 26, 2026

@justrach
Owner

Summary

  • avoid sorting outline paths twice by reusing one sorted path list for tree + outlines
  • serialize the maintained Explorer.symbol_index directly instead of rebuilding a temporary symbol map for every snapshot call
  • write the tree directly as escaped JSON instead of allocating a standalone tree string first
  • speed up JSON string escaping by appending unescaped spans in chunks
  • strengthen the snapshot JSON test to assert tree and symbol-index content
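The chunked-escaping idea from the list above can be sketched as follows. This is an illustrative Python sketch, not the Zig code in src/snapshot_json.zig; the function and table names are hypothetical. It assumes the escaper emits the short escapes for quote, backslash, newline, carriage return, and tab, and `\u00XX` for other control characters, and it appends each run of characters that needs no escaping as a single slice.

```python
# Illustrative sketch only: the PR's real escaper is written in Zig.
# All names here are hypothetical.

_SHORT_ESCAPES = {
    '"': '\\"',
    "\\": "\\\\",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
}


def _needs_escape(ch: str) -> bool:
    """True for characters JSON requires to be escaped inside a string."""
    return ch in _SHORT_ESCAPES or ord(ch) < 0x20


def escape_json_chunked(text: str) -> str:
    out = []        # output pieces, joined once at the end
    span_start = 0  # start of the current run that needs no escaping
    for i, ch in enumerate(text):
        if not _needs_escape(ch):
            continue
        # Flush the whole unescaped span with one append,
        # instead of appending character by character.
        if span_start < i:
            out.append(text[span_start:i])
        out.append(_SHORT_ESCAPES.get(ch) or f"\\u{ord(ch):04x}")
        span_start = i + 1
    # Flush whatever follows the last escaped character.
    if span_start < len(text):
        out.append(text[span_start:])
    return "".join(out)
```

For typical source text, where most characters need no escaping, this turns many per-character appends into a few slice appends, which is presumably where the escaping speedup comes from.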

Local benchmark

Compared this branch against origin/release/0.2.579 with scripts/compare-bench.py:

  • codedb_snapshot: 938750 ns -> 575900 ns (-38.65%, -362850 ns)

The benchmark corpus includes src/snapshot_json.zig, so the snapshot payload grew slightly from the code change itself, but latency still dropped substantially.

Validation

  • zig build test
  • zig build
  • python3 scripts/run-bench-json.py /tmp/codedb-head-bench.json
  • base comparison via /tmp/codedb-release-base at origin/release/0.2.579

@github-actions

Benchmark Regression Report

Thresholds: 10.00% and 50,000 ns absolute delta

NOISE means the percentage threshold was exceeded, but the absolute delta was too small to fail CI.

Tool             Base (ns)  Head (ns)  Delta     Abs Delta (ns)  Status
codedb_bundle    479609     497116     +3.65%    +17507          OK
codedb_changes   53038      56938      +7.35%    +3900           OK
codedb_deps      9422       9470       +0.51%    +48             OK
codedb_edit      6383       6307       -1.19%    -76             OK
codedb_find      61150      69012      +12.86%   +7862           NOISE
codedb_hot       102501     98519      -3.88%    -3982           OK
codedb_outline   245858     251344     +2.23%    +5486           OK
codedb_read      87445      94395      +7.95%    +6950           OK
codedb_search    184574     176955     -4.13%    -7619           OK
codedb_snapshot  2835681    1351605    -52.34%   -1484076        OK
codedb_status    210677     212565     +0.90%    +1888           OK
codedb_symbol    59665      59284      -0.64%    -381            OK
codedb_tree      70447      79217      +12.45%   +8770           NOISE
codedb_word      71140      69147      -2.80%    -1993           OK
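
The thresholds and statuses in the report above can be reproduced with a small sketch. This is not the repo's scripts/compare-bench.py; the names are hypothetical, and two points are assumptions: that only slowdowns (positive deltas) can trip the thresholds, and that a row exceeding both thresholds is labeled `FAIL` (no such row appears in this report).

```python
from dataclasses import dataclass

PCT_THRESHOLD = 10.0       # percent, from the report header
ABS_THRESHOLD_NS = 50_000  # nanoseconds, from the report header


@dataclass
class Delta:
    pct: float   # relative change in percent, rounded to 2 decimals
    abs_ns: int  # head minus base, in nanoseconds
    status: str  # "OK", "NOISE", or the hypothetical "FAIL"


def classify(base_ns: int, head_ns: int) -> Delta:
    abs_ns = head_ns - base_ns
    pct = abs_ns / base_ns * 100.0
    # Assumption: improvements (negative deltas) never fail the check.
    if pct > PCT_THRESHOLD and abs_ns > ABS_THRESHOLD_NS:
        status = "FAIL"   # hypothetical label, not shown in the report
    elif pct > PCT_THRESHOLD:
        status = "NOISE"  # percentage exceeded, absolute delta too small
    else:
        status = "OK"
    return Delta(round(pct, 2), abs_ns, status)


# Rows taken from the report above.
print(classify(61150, 69012))      # codedb_find
print(classify(2835681, 1351605))  # codedb_snapshot
```

Under these assumptions, codedb_find and codedb_tree come out as NOISE because their absolute deltas stay well under 50,000 ns, and codedb_snapshot is OK because its delta is an improvement.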

@justrach justrach merged commit 47a2f1a into release/0.2.579 Apr 26, 2026
1 check passed
@justrach justrach deleted the fix/snapshot-json-performance branch April 26, 2026 03:32

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f6f12b6c40

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/snapshot_json.zig
Comment on lines +77 to 78
var ski = explorer.symbol_index.iterator();
while (ski.next()) |e| try sym_keys.append(alloc, e.key_ptr.*);

[P1] Rebuild symbol index from outlines before snapshot emit

This change serializes symbol_index directly from explorer.symbol_index, but that map is not guaranteed complete after fast snapshot restore: insertRestoredFile (src/snapshot.zig) restores outlines/content without calling rebuildSymbolIndexFor, and findAllSymbols explicitly documents that symbol_index can be incomplete after this path (src/explore.zig, comment around lines 1325-1328). In warm-start loads, many unchanged files follow that restore path, so buildSnapshot now returns a partial or empty symbol_index even when outlines are present. Please retain an outline-based fallback (as before) when emitting snapshot symbol data.
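
One possible shape for such a fallback, sketched in Python rather than Zig. The data model is an assumption made for illustration: `outlines` maps a path to a list of `(symbol_name, line)` pairs, and `symbol_index` maps a symbol name to a list of `(path, line)` pairs. Neither is claimed to match the real `Explorer` fields, and the function name is hypothetical.

```python
from collections import defaultdict


def symbol_index_for_snapshot(symbol_index, outlines):
    """Return an index covering every outlined path (hypothetical helper)."""
    # Paths that the maintained index already knows about.
    indexed_paths = {path for locs in symbol_index.values() for path, _ in locs}
    # Paths that have outline symbols but no index entries, e.g. files
    # restored without a symbol-index rebuild.
    missing = [p for p, syms in outlines.items() if syms and p not in indexed_paths]
    if not missing:
        return symbol_index  # fast path: serialize the maintained index as-is

    # Fallback: copy the maintained entries, then fill gaps from outlines.
    merged = defaultdict(list)
    for name, locs in symbol_index.items():
        merged[name].extend(locs)
    for path in sorted(missing):
        for name, line in outlines[path]:
            merged[name].append((path, line))
    return dict(merged)
```

The completeness scan here walks the whole index on every call, which would give back part of the speedup. A real implementation would more likely track a per-file "indexed" flag or a counter set during restore, so the check stays cheap.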

Useful? React with 👍 / 👎.
