
replace position vectors with bitset iterators (-23.58%) #3

Merged
friendlymatthew merged 1 commit into main from bit-iter
Mar 26, 2026


@friendlymatthew
Owner

This PR eliminates the intermediate Vec<u32> position arrays by iterating the comma and newline bitsets directly during field extraction.

Previously, decode extracted every set bit from the comma and newline bitsets into Vec<u32> position arrays, and extract_from_cache() then indexed into those arrays. That extraction step, extract_positions, took 20% of decode time.

Profiling showed Vec::push inside extract_positions as a hot spot, because every extracted position pays for a capacity check and a write.
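For context, the previous approach can be sketched roughly as follows. This is an illustration, not the code removed by this PR: the function name extract_positions_sketch, the word layout (one bit per input byte, 64 bytes per u64, least-significant bit first), and the loop shape are all assumptions.

```rust
/// Hypothetical sketch of the old approach: materialize the absolute
/// position of every set bit into a Vec<u32>.
/// Assumes bit `i` of `words[w]` marks input byte `w * 64 + i`.
fn extract_positions_sketch(words: &[u64]) -> Vec<u32> {
    let mut positions = Vec::new();
    for (word_idx, &word) in words.iter().enumerate() {
        let base = (word_idx as u32) * 64;
        let mut bits = word;
        while bits != 0 {
            // Every push checks capacity (and may reallocate) before writing.
            positions.push(base + bits.trailing_zeros());
            // Clear the lowest set bit so the next iteration finds the next one.
            bits &= bits - 1;
        }
    }
    positions
}
```

The per-delimiter push in the inner loop is the cost the profile points at; the positions themselves are already implied by the bitset.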

This PR adds a bitset iterator stored directly in the decoder. When fields are extracted, the bitsets are walked on the fly: trailing_zeros finds the next set bit, the bit is cleared, and the position is computed from the u64 itself.
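A minimal sketch of such an iterator is shown below. It is not the implementation merged here: the type name BitIter, its fields, and the choice to borrow the words as a slice are assumptions made for illustration, and the decoder in this PR may own or lay out its bitsets differently.

```rust
/// Hypothetical bitset iterator: yields the absolute position of each set
/// bit without building an intermediate Vec.
/// Assumes bit `i` of `words[w]` marks input byte `w * 64 + i`.
struct BitIter<'a> {
    words: &'a [u64],
    word_idx: usize,
    /// Remaining (not yet yielded) bits of the current word.
    current: u64,
}

impl<'a> BitIter<'a> {
    fn new(words: &'a [u64]) -> Self {
        let current = words.first().copied().unwrap_or(0);
        BitIter { words, word_idx: 0, current }
    }
}

impl<'a> Iterator for BitIter<'a> {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        // Skip exhausted words; stop once we run past the end of the slice.
        while self.current == 0 {
            self.word_idx += 1;
            self.current = *self.words.get(self.word_idx)?;
        }
        // Index of the lowest set bit within the current word.
        let bit = self.current.trailing_zeros();
        // Clear that bit so the next call moves on.
        self.current &= self.current - 1;
        Some(self.word_idx as u32 * 64 + bit)
    }
}
```

Under these assumptions, a consumer can pull the next comma or newline position with next() only when it needs one, so no position is written to memory before it is used.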

[Two screenshots, captured 2026-03-26 at 7:45 PM]

@friendlymatthew merged commit 8d90a39 into main on Mar 26, 2026
3 checks passed