[pull] trunk from spiceai:trunk #856
Merged
* perf(cayenne): reduce allocation overheads in hot paths

* refactor(cayenne): optimize SQL statement construction for insert and delete operations

* perf(cayenne): optimize hashmap usage for insert-records and improve statistics retrieval

* perf(cayenne): algorithmic wins across compaction picker, deletion writer, scan, and metastore

- compaction_picker_pick_candidates: replaced full O(N log N) sort with O(N) select_nth_unstable_by_key + size_hint-based bucket pre-sizing. Measured 2.1-3.6x speedup at 100/1000/10000 file counts.
- position_delete writer: added pre_sorted flag + new_position_based_sorted constructor so RoaringBitmap-derived row_ids skip the writer's redundant sort+dedup. UInt64Array::from_iter_values replaces row_ids.to_vec() for one less full O(K+N) copy per commit.
- deletion filter exec (Int64 + KeyBased): per-batch keep_mask now uses BooleanBufferBuilder (1 bit/row packed) instead of Vec<bool> (1 byte/row) and skips the BooleanArray repack pass.
- protected_snapshots field: Arc<RwLock<HashMap>> -> Arc<ArcSwap<HashMap>>. Scan-side reads are now wait-free Arc::clone with no HashMap clone; writes use rcu() for atomic CoW publish. Touches 12 sites across provider/table.rs + provider/delete/sink/file_based.rs.
- DeletionIndex / KeyDeletionIndex extend_max: pre-size new_keys/new_hashes from iterator size_hint. Measured 1.04-1.09x on small-batch CDC ingest.
- scan_file_for_key_matches (position-based deletion): cache key_indices on first chunk instead of re-resolving per chunk (large files paid index_of K times per chunk).
- metastore conversion (sqlite + turso): convert_*_value and to_*_value now consume MetastoreValue/TursoValue instead of borrowing, eliminating per-param + per-row String/Vec<u8> clone in execute/query hot paths.

All wins independently verified by cargo nextest -p cayenne --lib (273/273 passing through every iteration).

* Improve

---------

Co-authored-by: Sergei Grebnov <sergei.grebnov@gmail.com>
Co-authored-by: Jeadie <jeadie@users.noreply.github.com>
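
For context on the compaction picker change, here is a minimal sketch of swapping a full sort for a selection, assuming the picker only needs the k smallest files and does not care about their order. `FileMeta` and `pick_smallest` are hypothetical names for illustration, not the actual cayenne types; only `select_nth_unstable_by_key` is the real standard-library call named in the commit.

```rust
/// Hypothetical stand-in for a data file tracked by a compaction picker.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileMeta {
    pub id: u32,
    pub size_bytes: u64,
}

/// Returns the `k` smallest files by size, in unspecified order.
///
/// A full sort costs O(N log N); selection partitions the slice around the
/// k-th smallest element in O(N) on average and leaves the rest unsorted.
pub fn pick_smallest(mut files: Vec<FileMeta>, k: usize) -> Vec<FileMeta> {
    if k == 0 {
        return Vec::new();
    }
    if k < files.len() {
        // After this call, files[..k] hold the k smallest sizes
        // (unordered) and files[k..] hold everything at least as large.
        files.select_nth_unstable_by_key(k - 1, |f| f.size_bytes);
        files.truncate(k);
    }
    // If k >= len, every file is a candidate and no work is needed.
    files
}
```

If the caller does need the candidates ordered, sorting only the returned k elements is still cheaper than sorting all N when k is much smaller than N.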
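
The extend_max pre-sizing bullet can be illustrated roughly as follows. This is a sketch under assumptions: `collect_keys_and_hashes` and `hash_key` are hypothetical helpers, and the real DeletionIndex code may build its vectors differently. The idea shown is only that reading the iterator's `size_hint` lower bound up front avoids repeated reallocation while pushing.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical hashing helper used only for this illustration.
pub fn hash_key(key: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Builds parallel `keys` / `hashes` vectors from an iterator, reserving
/// capacity once from the iterator's lower size bound.
pub fn collect_keys_and_hashes<I>(new_keys: I) -> (Vec<u64>, Vec<u64>)
where
    I: IntoIterator<Item = u64>,
{
    let iter = new_keys.into_iter();
    // The lower bound is always safe to reserve; the upper bound may be
    // None or far too large, so it is ignored here.
    let (lower, _upper) = iter.size_hint();
    let mut keys = Vec::with_capacity(lower);
    let mut hashes = Vec::with_capacity(lower);
    for key in iter {
        keys.push(key);
        hashes.push(hash_key(key));
    }
    (keys, hashes)
}
```

For iterators with an exact size hint (slices, vectors, ranges) this results in a single allocation per vector; for iterators that report a lower bound of 0 it degrades gracefully to ordinary growth.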
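
The metastore conversion bullet describes taking values by ownership rather than by reference. A minimal sketch of that difference, assuming a simple value enum: `Value`, `DriverParam`, and both conversion functions below are hypothetical and do not reproduce the actual MetastoreValue/TursoValue definitions or the sqlite/turso driver types.

```rust
/// Hypothetical metastore-side value type.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Integer(i64),
    Text(String),
    Blob(Vec<u8>),
}

/// Hypothetical driver-side parameter type.
#[derive(Debug, PartialEq)]
pub enum DriverParam {
    Null,
    Integer(i64),
    Text(String),
    Blob(Vec<u8>),
}

/// Borrowing conversion: every Text/Blob must be cloned, which allocates
/// and copies the heap buffer once per parameter.
pub fn to_param_borrowed(value: &Value) -> DriverParam {
    match value {
        Value::Null => DriverParam::Null,
        Value::Integer(i) => DriverParam::Integer(*i),
        Value::Text(s) => DriverParam::Text(s.clone()),
        Value::Blob(b) => DriverParam::Blob(b.clone()),
    }
}

/// Consuming conversion: the heap buffer is moved, so no allocation or
/// copy happens for Text/Blob.
pub fn to_param_owned(value: Value) -> DriverParam {
    match value {
        Value::Null => DriverParam::Null,
        Value::Integer(i) => DriverParam::Integer(i),
        Value::Text(s) => DriverParam::Text(s),
        Value::Blob(b) => DriverParam::Blob(b),
    }
}
```

The consuming form only pays off when the caller no longer needs the original value after the call, which is presumably the case in the execute/query paths the commit mentions.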
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.4)