Reduce indexer memory consumption by ~75% and startup time by ~20% #4016
Closed
shauns wants to merge 1 commit into
A comprehensive set of memory and startup optimizations targeting the Ruby indexer, measured against the Shopify core monorepo (~64K Ruby files, ~889K index entries). RSS after indexing dropped from ~2.2 GB to ~550 MB.

## Biggest wins

### Replace character-level PrefixTree with sorted-array binary search

The old trie created one node per character of every fully qualified name, resulting in 4.1M Node objects, each with a Hash for children, consuming nearly 1 GB. The new implementation uses a sorted array with binary search for prefix matching, reducing the data structure from ~955 MB to ~20 MB.

### Pack Location data as integers, create objects lazily

Location data (start/end line/column) is packed into a single 62-bit fixnum stored directly on entries. The Location object is only created when actually accessed. This eliminates 1.5M Location objects that were allocated during indexing but rarely read.

### Remove @configuration from Entry instances

Every entry stored a reference to the shared Configuration. Moving this to a module-level accessor eliminated one instance variable per entry. This pushed Method and Class entries below Ruby's shape capacity threshold, dropping them from 160 bytes to 80 bytes each and halving memory for the two largest entry types (388K methods, 76K classes).

### Minimize Entry object shapes

By deferring initialization of @comments (nil during indexing) and @visibility (:public for 83% of entries), the common-case Entry shape has fewer instance variables, letting Ruby allocate smaller objects.
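To illustrate the sorted-array approach described under "Replace character-level PrefixTree with sorted-array binary search", here is a minimal sketch. It is not the code from this PR: the class name `SortedPrefixIndex` and its `search` method are hypothetical, and it assumes the names fit in a plain in-memory array of strings. It relies only on `Array#bsearch_index` and `String#start_with?` from core Ruby.

```ruby
# Hypothetical sketch of prefix matching over a sorted array.
# The class and method names are illustrative, not the indexer's API.
class SortedPrefixIndex
  def initialize(names)
    # One flat, sorted array replaces one trie node per character.
    @names = names.uniq.sort.freeze
  end

  # Returns every stored name that starts with +prefix+, in sorted order.
  def search(prefix)
    # Find-minimum binary search: index of the first name >= prefix.
    # All names sharing the prefix sit contiguously from that index on.
    start = @names.bsearch_index { |name| name >= prefix }
    return [] unless start

    matches = []
    i = start
    while i < @names.length && @names[i].start_with?(prefix)
      matches << @names[i]
      i += 1
    end
    matches
  end
end

index = SortedPrefixIndex.new(["Foo::Bar", "Foo", "Foo::Baz", "Qux", "Fob"])
p index.search("Foo::")
```

Lookup costs one binary search plus a linear scan over the matches. The trade-off is that inserting into a sorted array is more expensive than inserting into a trie, which is presumably why a single build pass after indexing pairs well with this structure.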
## Other optimizations

- **String deduplication**: Use String#-@ to intern entry names, nesting components, module operation names, parent class names, and URI strings
- **Shared Signature singletons**: 221K parameterless methods share a single frozen Signature instance instead of each creating their own
- **Parameter interning**: Cache Parameter objects by (type, name) so methods with the same parameter names share objects
- **Deferred prefix tree build**: Skip incremental tree insertion during initial indexing; build in one pass at the end
- **Use Prism.parse_file**: Avoid creating a Ruby string for file contents during initial indexing
- **External store for entries PrefixTree**: The entries prefix tree references the @entries hash directly instead of duplicating it
- **Aggressive post-indexing GC**: 3 rounds of GC.start + GC.compact after indexing to minimize heap fragmentation

## Results (Shopify core monorepo)

| Metric | Before | After | Change |
|--------|--------|-------|--------|
| RSS after indexing | 2,217 MB | ~548 MB | **-75%** |
| Startup time | 49.3s | ~39s | **-21%** |
| Index entries | 889K | 889K | unchanged |
| Files indexed | 64K | 64K | unchanged |

All 18,333 existing tests continue to pass.
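The string deduplication, shared Signature, and parameter interning items listed under "Other optimizations" all follow the same pattern: return an existing frozen object instead of allocating an equal one. The sketch below shows that pattern under stated assumptions. `Signature`, `Parameter`, `EMPTY_SIGNATURE`, `signature_for`, and `intern_parameter` are hypothetical stand-ins written for this example, not the indexer's real classes or methods; only `String#-@` is an actual Ruby core method.

```ruby
# Hypothetical stand-ins for illustration only.
Signature = Struct.new(:parameters)
Parameter = Struct.new(:type, :name)

# One frozen instance shared by every parameterless method.
EMPTY_SIGNATURE = Signature.new([].freeze).freeze

def signature_for(parameters)
  parameters.empty? ? EMPTY_SIGNATURE : Signature.new(parameters)
end

# Cache keyed by (type, name) so equal parameters share one object.
PARAMETER_CACHE = {}

def intern_parameter(type, name)
  PARAMETER_CACHE[[type, name]] ||= Parameter.new(type, -name.to_s).freeze
end

# String#-@ returns a frozen, deduplicated copy of the string.
a = -String.new("Foo::Bar")
b = -String.new("Foo::Bar")

p a.equal?(b)
p signature_for([]).equal?(signature_for([]))
p intern_parameter(:required, "x").equal?(intern_parameter(:required, "x"))
```

Sharing is only safe because the shared objects are frozen. Any caller that needs to mutate a name or signature has to duplicate it first.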