[ENH] For local: use subquery for FTS, unions for int & float metadata expr, is true -> is not null #4556
base: main
This stack of pull requests is managed by Graphite.
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
- Testing, Bugs, Errors, Logs, Documentation
- System Compatibility
- Quality
…true -> is not null
e2795bc to bdf96ce
Performance Improvements for Local Chroma Queries
This PR implements several query optimization techniques for local Chroma databases. It moves full-text search (FTS) to a subquery approach, implements union queries for int and float metadata to better utilize indexes, and replaces 'is true' conditions with 'is not null' to take advantage of database indexing. These changes result in significant performance improvements as shown in the before/after benchmarks provided in the PR description.
This summary was automatically generated by @propel-code-bot
@@ -778,6 +858,8 @@ impl SqliteMetadataReader {

#[cfg(test)]
mod tests {
Suggest adding a few coverage tests for common edge cases:
- Metadata key is null for one record but exists for another
- FTS match but no metadata match
- etc.
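The two edge cases listed above could be exercised along these lines. This is only a sketch in Python with the standard `sqlite3` module: the `docs` and `metadata` tables, the `search` helper, and the `LIKE`-based text match are all hypothetical stand-ins, not the schema or queries used by `SqliteMetadataReader`.

```python
import sqlite3

# Hypothetical, simplified schema; not the tables used by SqliteMetadataReader.
SCHEMA = """
CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT);
CREATE TABLE metadata (id INTEGER, key TEXT, int_value INTEGER);
"""


def make_db(docs, metadata):
    """Build an in-memory database holding the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO docs VALUES (?, ?)", docs)
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)", metadata)
    return conn


def search(conn, needle, key, minimum):
    """Ids whose body contains `needle` and whose `key` value exceeds `minimum`."""
    rows = conn.execute(
        """
        SELECT d.id FROM docs d
        WHERE d.body LIKE '%' || ? || '%'
          AND d.id IN (
              SELECT id FROM metadata WHERE key = ? AND int_value > ?
          )
        ORDER BY d.id
        """,
        (needle, key, minimum),
    ).fetchall()
    return [row[0] for row in rows]


def test_key_null_for_one_record():
    # Record 2 has the key but a NULL value, so only record 1 should match.
    conn = make_db(
        docs=[(1, "quick fox"), (2, "quick hare")],
        metadata=[(1, "score", 5), (2, "score", None)],
    )
    assert search(conn, "quick", "score", 2) == [1]


def test_text_match_without_metadata_match():
    # Record 1 matches the text only; record 2 matches the metadata only.
    conn = make_db(
        docs=[(1, "quick fox"), (2, "slow turtle")],
        metadata=[(1, "score", 1), (2, "score", 9)],
    )
    assert search(conn, "quick", "score", 2) == []
```

The same cases would presumably be written as Rust tests inside the `mod tests` block in the actual change; the Python form here only illustrates the expected behaviour.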
Description of changes
Move FTS to a subquery for improved performance, use unions for int and float metadata expressions, and replace uses of 'is true' with 'is not null' so that the database can use its indexes.
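As a rough illustration of the first two query shapes, the sketch below filters ids through a text-search subquery and then through a union of one lookup per numeric column, so that each branch can be served by its own index. The table names, column names, indexes, and the plain `fulltext` table (standing in for the real full-text index) are assumptions made for this example, not the actual local Chroma schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE embeddings (id INTEGER PRIMARY KEY);
-- Hypothetical metadata table: one typed column is set per row, the rest are NULL.
CREATE TABLE embedding_metadata (
    id INTEGER,
    key TEXT NOT NULL,
    int_value INTEGER,
    float_value REAL,
    PRIMARY KEY (id, key)
);
CREATE INDEX idx_meta_int ON embedding_metadata (key, int_value);
CREATE INDEX idx_meta_float ON embedding_metadata (key, float_value);
-- Plain table standing in for the full-text index.
CREATE TABLE fulltext (rowid INTEGER PRIMARY KEY, string_value TEXT);

INSERT INTO embeddings VALUES (1), (2), (3), (4);
INSERT INTO fulltext VALUES
    (1, 'the quick brown fox'), (2, 'quick silver'),
    (3, 'slow turtle'), (4, 'quick start');
INSERT INTO embedding_metadata VALUES
    (1, 'score', 5, NULL), (2, 'score', NULL, 7.5), (3, 'score', 1, NULL);
""")


def matching_ids(conn, needle, key, minimum):
    """Ids whose text contains `needle` and whose numeric `key` exceeds `minimum`."""
    rows = conn.execute(
        """
        SELECT e.id FROM embeddings e
        WHERE e.id IN (
                  -- text search isolated in its own subquery
                  SELECT rowid FROM fulltext
                  WHERE string_value LIKE '%' || ? || '%'
              )
          AND e.id IN (
                  -- one branch per numeric column instead of a single OR
                  SELECT id FROM embedding_metadata
                  WHERE key = ? AND int_value > ?
                  UNION
                  SELECT id FROM embedding_metadata
                  WHERE key = ? AND float_value > ?
              )
        ORDER BY e.id
        """,
        (needle, key, minimum, key, minimum),
    ).fetchall()
    return [row[0] for row in rows]


result = matching_ids(conn, "quick", "score", 2)
print(result)  # [1, 2]
```

Record 4 matches the text but has no `score` entry, and record 3 has a `score` that is too low, so neither is returned. Whether SQLite actually picks the per-column indexes can be checked with `EXPLAIN QUERY PLAN`, whose output depends on the SQLite version.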


before (0.6.3):
after:


Test plan
How are these changes tested?
pytest for python, yarn test for js, cargo test for rust
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?