Add pgvector and hybrid text search skills #65
base: main
Comprehensive guide covering:
- Golden path default setup with halfvec and HNSW
- Type rules for explicit casting and avoiding implicit-cast failures
- Binary quantization with stored generated columns for large scale
- Filtering best practices including iterative scan
- Performance guidance by dataset size
- Common issues and fixes

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Add intro paragraph explaining semantic search concepts
- Rename title to "pgvector for Semantic Search"
- Add core rule about using same embedding model for data and queries
- Improve query comment in standard pattern example

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Replace verbose Performance by Dataset Size section with concise table
- Remove redundant Large Scale Setup example (covered in Binary Quantization)
- Remove redundant Multi-Model Embeddings example
- Remove Examples section (covered by Standard Pattern)
- Net reduction of ~150 lines while preserving all key guidance

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Add ef_search tuning table with recall % and relative speed
- Clarify ef_search must be >= LIMIT in HNSW Parameters
- Add comment explaining ef_search for binary quantization re-ranking
- Add scope note: guide covers pgvector setup, not model selection/chunking
- Simplify Performance table (remove ef_search, not scale-dependent)
- Standardize examples to use ef_search=100 (match Golden Path)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Remove outdated "half precision" comment in HNSW section
- Improve Bulk Loading: show text + binary options, add maintenance_work_mem
- Fix line break mid-sentence in Filtering section
- Add Maintenance section covering VACUUM and REINDEX
- Clarify RAM capacity table assumes RAM available for caching

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Add pgvector 0.8.0+ version requirement for halfvec, binary_quantize, iterative scan
- Add NOT NULL constraint to embedding column in Standard Pattern
- Add sql language tag to IVFFlat code block
- Remove extra blank line before Binary Quantization section
- Change binary quantization example from ALTER TABLE to CREATE TABLE with generated column

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Remove specific recall percentages that vary by dataset
- Change ef_search "must be >= LIMIT" to "should be"
- Use qualitative terms (lower/higher/very-high) instead of exact %
- Remove "~2x faster" claim for binary COPY format
- Fix grammar: "vectors are" → "vectors is"
- Clarify filter selectivity: "under ~10k rows"
- Simplify VACUUM description

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Rename skill to pgvector-semantic-search
- Add explanation of 80x oversampling ratio for binary quantization re-ranking
- Update partial index example category_id

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

Combines pg_textsearch (BM25) with pgvector for hybrid search using RRF fusion:
- Parallel queries from client for lower latency
- Client-side RRF fusion with Python and TypeScript examples
- Weighting and reranking with cross-encoders
- BM25 notes: negative scores, k1/b tuning, partitioned tables caveat

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- pgvector → pgvector-semantic-search
- hybrid-search → hybrid-text-search

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Clarify SQL queries are separate (fix parameter confusion)
- Add Data Preparation section explaining chunking
- Add pg_textsearch prerelease note
- Remove duplicated SQL in reranking section
- Update TypeScript reranking to use Cohere SDK
- Add Monitoring & Debugging section with EXPLAIN examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Expand pg_textsearch description (self-managed, PG 17/18, production warning)
- Move TypeScript import to top of code block
- Add enable_seqscan = off for debugging to force index usage
- Use literal vector in EXPLAIN example for consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
tjgreen42 left a comment
I had a quick read and the whole thing looks pretty nice. Couple of questions:
- How do you test / eval these skills?
- This skill is very long because it really covers three topics: keyword search, vector search, and fusion. Is it possible to delegate to sub-skills to better organize things? (So: keyword search sub-skill, vector search sub-skill, then hybrid search as this skill)
### BM25 Notes

- **Negative scores**: The `<@>` operator returns negative values where lower = better match. RRF uses rank position, so this doesn't affect fusion.
As does <-> for vector search
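To make the point in this thread concrete, here is a minimal Python sketch of why the sign of the raw scores does not matter for rank-based fusion. The document IDs and score values are invented for illustration, and `to_ranks` is a hypothetical helper, not part of either extension. It assumes both result sets use "lower = better", as discussed above.

```python
def to_ranks(scored):
    """Map doc id -> 1-based rank, sorting ascending (lower score = better)."""
    ordered = sorted(scored, key=lambda pair: pair[1])
    return {doc_id: rank for rank, (doc_id, _) in enumerate(ordered, start=1)}

# Hypothetical BM25 scores in the style of `<@>` (negative, lower = better).
bm25 = [("a", -7.2), ("b", -3.1), ("c", -5.4)]
# Hypothetical vector distances in the style of `<->` (lower = better).
vec = [("a", 0.42), ("b", 0.18), ("c", 0.77)]

bm25_ranks = to_ranks(bm25)
vec_ranks = to_ranks(vec)

# Shifting every BM25 score into positive territory leaves the ranks unchanged,
# which is why RRF (which only sees ranks) is unaffected by negative values.
shifted_ranks = to_ranks([(doc_id, score + 100) for doc_id, score in bm25])
```

Only the ordering survives the conversion to ranks, so the two result sets can be fused even though their raw scores live on unrelated scales.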
CREATE TABLE items (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  contents TEXT NOT NULL,
  embedding halfvec(1536) NOT NULL
NOT NULL on the embedding only works if embeddings are generated prior to insert rather than asynchronously. I wonder if it's worth noting.
Default to HNSW. Use IVFFlat only when HNSW’s operational costs matter more than peak recall.

Choose IVFFlat if:
- Write-heavy or constantly changing data (frequent inserts, backfills, or re-embeds)
It only works for write-heavy/constantly changing data iff you are willing to rebuild the index frequently, right?
FROM (
  SELECT i.id, i.contents, i.embedding
  FROM items i, q
  ORDER BY i.embedding_bq <~> q.qb
Suggested change:
-   ORDER BY i.embedding_bq <~> q.qb
+   ORDER BY i.embedding_bq <~> q.qb -- uses index
  ORDER BY i.embedding_bq <~> q.qb
  LIMIT 800
) candidates
ORDER BY candidates.embedding <=> $1::halfvec(1536)
Suggested change:
- ORDER BY candidates.embedding <=> $1::halfvec(1536)
+ ORDER BY candidates.embedding <=> $1::halfvec(1536) -- computes actual dist; not index
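The two suggested comments describe a two-stage pattern: a coarse pass over binary-quantized vectors (index-backed in the SQL), then an exact re-rank of the surviving candidates. The following is a rough in-memory Python sketch of that idea, not the skill's actual implementation. The helper names are hypothetical, the quantization rule (bit set when a component is positive) is an assumption mirroring how `binary_quantize` is commonly described, and the default `limit=10` is inferred from the 800-candidate limit and the 80x oversampling ratio mentioned in the commits.

```python
import math

def binary_quantize(vec):
    """One bit per dimension: 1 if the component is positive, else 0 (assumed rule)."""
    return [1 if x > 0 else 0 for x in vec]

def hamming(a, b):
    """Number of differing bits, analogous to the `<~>` ordering in the SQL."""
    return sum(x != y for x, y in zip(a, b))

def cosine_distance(a, b):
    """1 - cosine similarity, analogous to the `<=>` ordering in the SQL."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def two_stage_search(items, query, candidates=800, limit=10):
    """items: list of (id, vector). Coarse Hamming pass, then exact re-rank."""
    qb = binary_quantize(query)
    # Stage 1: cheap, approximate ordering on quantized vectors (the indexed step).
    coarse = sorted(items, key=lambda it: hamming(binary_quantize(it[1]), qb))[:candidates]
    # Stage 2: exact distance on the full vectors of the candidates only (no index).
    exact = sorted(coarse, key=lambda it: cosine_distance(it[1], query))
    return [item_id for item_id, _ in exact[:limit]]

# Tiny made-up data set to exercise both stages.
items = [("x", [1.0, 0.2, -0.5]), ("y", [-1.0, 0.3, 0.4]), ("z", [0.9, 0.1, -0.4])]
result = two_stage_search(items, [1.0, 0.1, -0.5], candidates=2, limit=1)
```

In this toy run, `x` and `z` tie in the coarse pass and `y` is dropped; the exact pass then breaks the tie, which is the reason for oversampling before re-ranking.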
## Core Rules

- **Enable the extension** in each database: `CREATE EXTENSION vector;`
Suggested change:
- - **Enable the extension** in each database: `CREATE EXTENSION vector;`
+ - **Enable the extension** in each database: `CREATE EXTENSION IF NOT EXISTS vector;`
### BM25 Notes

- **Negative scores**: The `<@>` operator returns negative values where lower = better match. RRF uses rank position, so this doesn't affect fusion.
- **Language config**: Change `text_config` to match your content language (e.g., `'french'`, `'german'`). See PostgreSQL text search configurations.
Link to the Postgres docs for text search config?
Hybrid search combines keyword search (BM25) with semantic search (vector embeddings) to get the best of both: exact keyword matching and meaning-based retrieval. Use Reciprocal Rank Fusion (RRF) to merge results from both methods into a single ranked list.

This guide covers combining [pg_textsearch](https://github.com/timescale/pg_textsearch) (BM25) with [pgvector](https://github.com/pgvector/pgvector). Requires both extensions. For high-volume setups, filtering, or advanced pgvector tuning (binary quantization, HNSW parameters), see the **pgvector-semantic-search** skill.
Is there a reason why we're using pgvector vs. pgvectorscale?
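For readers following the thread, the RRF merge mentioned in the quoted introduction can be sketched in a few lines of Python. This is an illustrative version only, not the code from the skill: `rrf_fuse` is a hypothetical name, the constant `k=60` is the value commonly used for RRF and is an assumption here, and the optional `weights` argument is one possible way to express the weighting the commits mention.

```python
from collections import defaultdict

def rrf_fuse(ranked_lists, k=60, weights=None):
    """Fuse ranked id lists (best first) with Reciprocal Rank Fusion.

    Each document scores sum(weight / (k + rank)) over the lists it appears in.
    """
    weights = weights or [1.0] * len(ranked_lists)
    scores = defaultdict(float)
    for ids, weight in zip(ranked_lists, weights):
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] += weight / (k + rank)
    # Higher fused score = better; return ids only.
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)

# Made-up result ids from the two separate queries (keyword and vector).
keyword_ids = [3, 1, 7]
vector_ids = [1, 9, 3]
fused = rrf_fuse([keyword_ids, vector_ids])
```

Documents found by both queries (ids 1 and 3 here) rise to the top, while documents found by only one still appear lower in the fused list. Because only rank positions are used, the same function would apply to results from any vector index, whichever extension produced them.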
Summary
Skills Added
pgvector-semantic-search (323 lines)
hybrid-text-search (241 lines)
🤖 Generated with Claude Code