@cevian cevian commented Jan 6, 2026

Summary

  • Add pgvector-semantic-search skill: comprehensive guide for pgvector setup, HNSW indexes, quantization strategies, filtering, and troubleshooting
  • Add hybrid-text-search skill: combining pg_textsearch (BM25) with pgvector using RRF fusion, with Python and TypeScript examples

Skills Added

pgvector-semantic-search (323 lines)

  • Golden path defaults (halfvec, HNSW, cosine distance)
  • HNSW and IVFFlat index configuration
  • Binary quantization for large datasets
  • Filtering best practices (iterative scan, partial indexes, partitioning)
  • Monitoring, debugging, and common issues

hybrid-text-search (241 lines)

  • When to use hybrid vs semantic-only vs keyword-only
  • Parallel query pattern with client-side RRF fusion
  • Weighting and reranking with cross-encoders
  • Python and TypeScript code examples
  • BM25 configuration notes (k1/b tuning, partitioned tables)

🤖 Generated with Claude Code

cevian and others added 12 commits January 6, 2026 12:58
Comprehensive guide covering:
- Golden path default setup with halfvec and HNSW
- Type rules for explicit casting and avoiding implicit-cast failures
- Binary quantization with stored generated columns for large scale
- Filtering best practices including iterative scan
- Performance guidance by dataset size
- Common issues and fixes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add intro paragraph explaining semantic search concepts
- Rename title to "pgvector for Semantic Search"
- Add core rule about using same embedding model for data and queries
- Improve query comment in standard pattern example

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Replace verbose Performance by Dataset Size section with concise table
- Remove redundant Large Scale Setup example (covered in Binary Quantization)
- Remove redundant Multi-Model Embeddings example
- Remove Examples section (covered by Standard Pattern)
- Net reduction of ~150 lines while preserving all key guidance

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add ef_search tuning table with recall % and relative speed
- Clarify ef_search must be >= LIMIT in HNSW Parameters
- Add comment explaining ef_search for binary quantization re-ranking
- Add scope note: guide covers pgvector setup, not model selection/chunking
- Simplify Performance table (remove ef_search, not scale-dependent)
- Standardize examples to use ef_search=100 (match Golden Path)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove outdated "half precision" comment in HNSW section
- Improve Bulk Loading: show text + binary options, add maintenance_work_mem
- Fix line break mid-sentence in Filtering section
- Add Maintenance section covering VACUUM and REINDEX
- Clarify RAM capacity table assumes RAM available for caching

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Add pgvector 0.8.0+ version requirement for halfvec, binary_quantize, iterative scan
- Add NOT NULL constraint to embedding column in Standard Pattern
- Add sql language tag to IVFFlat code block
- Remove extra blank line before Binary Quantization section
- Change binary quantization example from ALTER TABLE to CREATE TABLE with generated column

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Remove specific recall percentages that vary by dataset
- Change ef_search "must be >= LIMIT" to "should be"
- Use qualitative terms (lower/higher/very-high) instead of exact %
- Remove "~2x faster" claim for binary COPY format
- Fix grammar: "vectors are" → "vectors is"
- Clarify filter selectivity: "under ~10k rows"
- Simplify VACUUM description

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Rename skill to pgvector-semantic-search
- Add explanation of 80x oversampling ratio for binary quantization re-ranking
- Update partial index example category_id

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Combines pg_textsearch (BM25) with pgvector for hybrid search using RRF fusion:
- Parallel queries from client for lower latency
- Client-side RRF fusion with Python and TypeScript examples
- Weighting and reranking with cross-encoders
- BM25 notes: negative scores, k1/b tuning, partitioned tables caveat

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- pgvector → pgvector-semantic-search
- hybrid-search → hybrid-text-search

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Clarify SQL queries are separate (fix parameter confusion)
- Add Data Preparation section explaining chunking
- Add pg_textsearch prerelease note
- Remove duplicated SQL in reranking section
- Update TypeScript reranking to use Cohere SDK
- Add Monitoring & Debugging section with EXPLAIN examples

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Expand pg_textsearch description (self-managed, PG 17/18, production warning)
- Move TypeScript import to top of code block
- Add enable_seqscan = off for debugging to force index usage
- Use literal vector in EXPLAIN example for consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
CLAassistant commented Jan 6, 2026

CLA assistant check
All committers have signed the CLA.

@tjgreen42 tjgreen42 left a comment

I had a quick read and the whole thing looks pretty nice. Couple of questions:

  • How do you test / eval these skills?
  • This skill is very long because it really covers three topics: keyword search, vector search, and fusion. Is it possible to delegate to sub-skills to better organize things? (So: keyword search sub-skill, vector search sub-skill, then hybrid search as this skill)


### BM25 Notes

- **Negative scores**: The `<@>` operator returns negative values where lower = better match. RRF uses rank position, so this doesn't affect fusion.

As does <-> for vector search
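The quoted note says that RRF depends only on rank position, so the sign of the raw scores does not matter. A minimal Python sketch of that idea follows. The function names, the sample rows, and the constant `k=60` are illustrative assumptions, not taken from the skill files; both inputs are sorted ascending because lower means a better match for the BM25 score and for a vector distance alike.

```python
def ranked_ids(rows):
    """Order (doc_id, score) rows ascending by score: lower = better match."""
    return [doc_id for doc_id, _ in sorted(rows, key=lambda r: r[1])]


def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each list adds 1 / (k + rank) to a document's score."""
    scores = {}
    for ids in rankings:
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical rows: negative BM25 scores and non-negative vector distances.
bm25_rows = [("a", -7.2), ("b", -3.1), ("c", -0.4)]
vector_rows = [("b", 0.12), ("d", 0.30), ("a", 0.55)]

fused = rrf_fuse([ranked_ids(bm25_rows), ranked_ids(vector_rows)])

# Shifting every BM25 score by a constant keeps the order, so fusion is unchanged.
shifted_rows = [(doc_id, score + 100.0) for doc_id, score in bm25_rows]
fused_shifted = rrf_fuse([ranked_ids(shifted_rows), ranked_ids(vector_rows)])
```

Because only the ordering within each list feeds the fusion, `fused` and `fused_shifted` come out identical here.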

CREATE TABLE items (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
contents TEXT NOT NULL,
embedding halfvec(1536) NOT NULL

NOT NULL on the embedding only works if embeddings are generated prior to insert rather than async. Wonder if it's worth noting

Default to HNSW. Use IVFFlat only when HNSW’s operational costs matter more than peak recall.

Choose IVFFlat if:
- Write-heavy or constantly changing data (frequent inserts, backfills, or re-embeds)

It only works for write-heavy/constantly changing data if you are willing to rebuild the index frequently, right?

FROM (
SELECT i.id, i.contents, i.embedding
FROM items i, q
ORDER BY i.embedding_bq <~> q.qb

Suggested change
ORDER BY i.embedding_bq <~> q.qb
ORDER BY i.embedding_bq <~> q.qb -- uses index

ORDER BY i.embedding_bq <~> q.qb
LIMIT 800
) candidates
ORDER BY candidates.embedding <=> $1::halfvec(1536)

Suggested change
ORDER BY candidates.embedding <=> $1::halfvec(1536)
ORDER BY candidates.embedding <=> $1::halfvec(1536) -- computes actual dist; not index
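
The two suggested comments describe a two-stage search: a coarse pass over binary codes (the part an index can serve), then an exact cosine re-rank of the surviving candidates. The pure-Python sketch below mimics that flow outside the database, purely for illustration. The helper names and sample vectors are hypothetical, and `limit=10` is an assumption chosen to match the 80x oversampling mentioned in the commit notes (800 candidates for 10 results).

```python
import math


def binary_quantize(vec):
    """One bit per dimension: 1 where the component is positive, else 0."""
    return [1 if x > 0 else 0 for x in vec]


def hamming(a, b):
    """Number of positions where two bit lists differ (the coarse distance)."""
    return sum(x != y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity (the exact distance used for re-ranking)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def search(items, query, candidates=800, limit=10):
    """items: list of (id, vector). Coarse Hamming pass, then exact re-rank."""
    qb = binary_quantize(query)
    # Stage 1: cheap ordering on binary codes, keeping an oversampled pool.
    coarse = sorted(items, key=lambda it: hamming(binary_quantize(it[1]), qb))
    pool = coarse[:candidates]
    # Stage 2: exact distance on full-precision vectors, computed per row.
    return sorted(pool, key=lambda it: cosine_distance(it[1], query))[:limit]


# Hypothetical 4-dimensional data; real embeddings would be far wider.
items = [
    (1, [0.9, 0.1, -0.2, 0.3]),
    (2, [0.2, 0.8, -0.1, 0.1]),
    (3, [-0.5, -0.4, 0.9, -0.3]),
    (4, [0.8, 0.2, -0.3, 0.2]),
]
query = [1.0, 0.1, -0.2, 0.3]

top = search(items, query, candidates=3, limit=2)
top_ids = [item_id for item_id, _ in top]
```

In this toy data, items 1, 2, and 4 share the query's binary code, so the coarse pass cannot separate them; the exact re-rank is what places item 1 ahead of item 4 and drops item 2.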


## Core Rules

- **Enable the extension** in each database: `CREATE EXTENSION vector;`

Suggested change
- **Enable the extension** in each database: `CREATE EXTENSION vector;`
- **Enable the extension** in each database: `CREATE EXTENSION IF NOT EXISTS vector;`

### BM25 Notes

- **Negative scores**: The `<@>` operator returns negative values where lower = better match. RRF uses rank position, so this doesn't affect fusion.
- **Language config**: Change `text_config` to match your content language (e.g., `'french'`, `'german'`). See PostgreSQL text search configurations.

Link to the PostgreSQL docs for text search configurations?


Hybrid search combines keyword search (BM25) with semantic search (vector embeddings) to get the best of both: exact keyword matching and meaning-based retrieval. Use Reciprocal Rank Fusion (RRF) to merge results from both methods into a single ranked list.

This guide covers combining [pg_textsearch](https://github.com/timescale/pg_textsearch) (BM25) with [pgvector](https://github.com/pgvector/pgvector). Requires both extensions. For high-volume setups, filtering, or advanced pgvector tuning (binary quantization, HNSW parameters), see the **pgvector-semantic-search** skill.

Is there a reason why we're using pgvector vs. pgvectorscale?
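
For readers following the thread, the pattern named in the PR summary (parallel queries from the client, then client-side RRF with optional weighting) might look roughly like the sketch below. It is not the code from the skill: `keyword_search` and `semantic_search` are hypothetical stubs that stand in for the separate BM25 and pgvector SQL queries, and the weights and `k=60` are assumed values.

```python
import asyncio


async def keyword_search(query):
    """Hypothetical stub for the BM25 query; returns ids, best match first."""
    await asyncio.sleep(0.01)  # stands in for a database round trip
    return ["doc3", "doc1", "doc7"]


async def semantic_search(query):
    """Hypothetical stub for the pgvector query; returns ids, best match first."""
    await asyncio.sleep(0.01)
    return ["doc1", "doc5", "doc3"]


def weighted_rrf(rankings, weights, k=60):
    """RRF where each list's contribution is scaled by its weight."""
    scores = {}
    for ids, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ids, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


async def hybrid_search(query, weights=(1.0, 1.0), limit=3):
    # Both queries run concurrently, so latency is roughly the slower of the two.
    keyword_ids, semantic_ids = await asyncio.gather(
        keyword_search(query), semantic_search(query)
    )
    return weighted_rrf([keyword_ids, semantic_ids], weights)[:limit]


equal = asyncio.run(hybrid_search("example query"))
keyword_heavy = asyncio.run(hybrid_search("example query", weights=(2.0, 1.0)))
```

With equal weights the document ranked well by both lists leads; doubling the keyword weight lets the top BM25 hit overtake it, which is the kind of tuning the weighting section of the skill refers to.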

6 participants