[DOC] Add notebook to show using query with id filter #4521
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
" min_price, max_price = price_ranges[category_id]\n",
" \n",
" for i, product_name in enumerate(product_names[category_id]):\n",
" description = descriptions[category_id][i] if i < len(descriptions[category_id]) else \"Product description not available.\"\n",
[CriticalError]
Product descriptions do not match their products in the demo output. For example, in the scenario 2 results, the CloudRest Memory Foam Mattress is shown with a description about denim jeans, and the CrispWave Air Fryer with a description about t-shirts. This happens because the setup logic assigns descriptions from one category to products from another.
In the setup_databases function, when fetching the description for each product, make sure it is taken from that product's own category and product index.
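One way to keep descriptions aligned is to build a single record per product and reuse that record for every store. The following is a minimal sketch, not the notebook's actual code: `build_products` and the `product_<category>_<index>` ID scheme are hypothetical, and the sample descriptions in the usage note are illustrative. It assumes `product_names`, `descriptions`, and `price_ranges` are dicts keyed by `category_id`, as the snippet above suggests.

```python
def build_products(product_names, descriptions, price_ranges):
    """Pair each product with the description at the same category and index.

    All three arguments are assumed to be dicts keyed by category_id.
    """
    rows = []
    for category_id, names in product_names.items():
        min_price, max_price = price_ranges[category_id]
        # Look up descriptions by the same category_id used for the names.
        category_descriptions = descriptions.get(category_id, [])
        for i, product_name in enumerate(names):
            if i < len(category_descriptions):
                description = category_descriptions[i]
            else:
                description = "Product description not available."
            rows.append({
                # Hypothetical ID scheme: derived from (category_id, index)
                # so the SQL row and the stored document share one key.
                "id": f"product_{category_id}_{i}",
                "category_id": category_id,
                "name": product_name,
                "description": description,
                "min_price": min_price,
                "max_price": max_price,
            })
    return rows
```

For example, calling it with `{1: ["CloudRest Memory Foam Mattress"], 2: ["CrispWave Air Fryer"]}` as names and a matching descriptions dict returns two rows whose `description` fields come from categories 1 and 2 respectively. Inserting both the SQLite row and the Chroma document from the same record means a category mix-up cannot creep in between the two writes.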
Add Example Notebook: Using Query with ID Filter in Chroma
This PR introduces a new Python Jupyter notebook that demonstrates how to use Chroma's query functionality with an ID filter to reduce search space, using a legal cases dataset as an illustrative example. The notebook walks through creating and populating SQLite tables, ingesting text data into Chroma, and querying using filtered IDs derived from structured SQL queries, thereby showcasing best practices for combining structured and semantic search.
This summary was automatically generated by @propel-code-bot
234a249 to 5f8d886
5f8d886 to e5da83c
Description of changes
This PR adds a Python notebook showing how to use the ID filter in query to shrink the search space. The notebook walks through an e-commerce example in which categories and products are stored in a SQLite (sqlite3) database and only the text data is stored in Chroma. It uses ordinary SQL queries to select the subset of IDs to search semantically, then passes those IDs to Chroma.
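The flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's code: the `products` table schema, the sample rows, and the helper names `build_demo_db`, `candidate_ids`, and `search_within` are hypothetical. The `ids` argument to `collection.query` is assumed to be available as this PR describes; check it against the installed Chroma version.

```python
import sqlite3


def build_demo_db():
    """Create a small in-memory table (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id TEXT PRIMARY KEY, category TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [("p1", "bedding", 499.0), ("p2", "kitchen", 89.0), ("p3", "kitchen", 249.0)],
    )
    return conn


def candidate_ids(conn, category, max_price):
    """Use plain SQL to pick the IDs worth searching semantically."""
    rows = conn.execute(
        "SELECT id FROM products WHERE category = ? AND price <= ? ORDER BY id",
        (category, max_price),
    ).fetchall()
    return [row[0] for row in rows]


def search_within(collection, ids, text, n_results=5):
    """Run a semantic query restricted to the given IDs.

    `collection` is expected to be a Chroma collection; the `ids` keyword
    is assumed from the PR description.
    """
    if not ids:
        return None  # nothing matched the SQL filter, so skip the query
    return collection.query(
        query_texts=[text],
        ids=ids,
        n_results=min(n_results, len(ids)),
    )
```

With a real collection, usage would look like `search_within(collection, candidate_ids(conn, "kitchen", 300.0), "crispy food without oil")`. Keeping the structured filter in SQL and passing only IDs across keeps prices and categories out of the vector store.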
Test plan
How are these changes tested?
pytest for python, yarn test for js, cargo test for rust
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?