Releases: SpillwaveSolutions/agent-brain
v6.0.0 — PostgreSQL Backend
Agent Brain now supports PostgreSQL as a configurable storage backend alongside ChromaDB, with pgvector for vector similarity search and tsvector for full-text search.
Highlights
- Dual-backend architecture: Choose between ChromaDB (default, zero-config) or PostgreSQL (production-scale) via a single YAML config change
- pgvector integration: HNSW-indexed vector search with cosine, L2, and inner product distance metrics
- tsvector full-text search: GIN-indexed keyword search with weighted relevance (title > summary > content) and configurable language stemming
- Hybrid retrieval: RRF fusion combines vector + keyword results consistently across both backends
- Async connection pooling: SQLAlchemy async engine with configurable pool sizing, overflow, and retry with exponential backoff
- Health monitoring: /health/postgres endpoint with live pool metrics (size, checked-in, checked-out, overflow)
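As a rough illustration of the RRF fusion named in the highlights, the sketch below merges a vector ranking and a keyword ranking by summing 1 / (k + rank) per document. The function name rrf_fuse, the sample document IDs, and k = 60 are assumptions made for illustration (60 is the default listed for GRAPH_RRF_K in the v4.0.0 notes); the backends' actual fusion code is not shown in these notes.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. Hypothetical helper, not the project's API.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["doc_a", "doc_b", "doc_c"]   # illustrative vector-search order
keyword_hits = ["doc_b", "doc_d", "doc_a"]  # illustrative keyword-search order
fused = rrf_fuse([vector_hits, keyword_hits])
```

Because RRF only looks at ranks, it does not need the two result sets to share a score scale, which is one reason it suits combining vector and keyword results.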
What's New
Storage Abstraction Layer
- StorageBackendProtocol: 11-method async protocol for backend-agnostic services
- StorageBackendFactory: selects backend from env var > YAML config > default ("chroma")
- ChromaBackend: wraps existing vector_store/bm25_manager with full backward compatibility
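The selection precedence described for StorageBackendFactory (env var, then YAML, then the "chroma" default) could look roughly like the sketch below. The function resolve_backend and its argument shapes are hypothetical; only the AGENT_BRAIN_STORAGE_BACKEND variable, the storage.backend YAML key, and the default value come from these notes.

```python
import os

DEFAULT_BACKEND = "chroma"


def resolve_backend(env=None, yaml_config=None):
    """Pick a backend name: env var > YAML storage.backend > default.

    `env` is a mapping (defaults to os.environ) and `yaml_config` is the
    already-parsed YAML document as a dict. Both shapes are assumptions.
    """
    env = os.environ if env is None else env
    yaml_config = yaml_config or {}

    from_env = env.get("AGENT_BRAIN_STORAGE_BACKEND")
    if from_env:
        return from_env.strip().lower()

    from_yaml = (yaml_config.get("storage") or {}).get("backend")
    if from_yaml:
        return str(from_yaml).strip().lower()

    return DEFAULT_BACKEND
```

Passing the environment and config in as arguments keeps the precedence logic easy to unit-test without touching real process state.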
PostgreSQL Backend (storage.postgres)
- PostgresConfig: Pydantic config (host, port, pool sizing, HNSW params, language)
- PostgresConnectionManager: async SQLAlchemy engine with retry and pool metrics
- PostgresSchemaManager: auto-creates tables + HNSW/GIN indexes on startup
- VectorOps: pgvector search with 0-1 score normalization
- KeywordOps: tsvector FTS with websearch_to_tsquery and weighted relevance
- PostgresBackend: full protocol implementation with RRF hybrid fusion
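The retry behavior attributed to the connection manager can be pictured with a minimal exponential-backoff helper like the one below. Everything here is an assumption for illustration: the name retry_with_backoff, the delay values, and the choice to retry on ConnectionError (a real SQLAlchemy/asyncpg setup would likely catch driver-specific exceptions instead).

```python
import asyncio


async def retry_with_backoff(operation, max_attempts=4, base_delay=0.5,
                             max_delay=8.0, sleep=asyncio.sleep):
    """Await `operation()` and retry on ConnectionError with doubling delays.

    Delays follow base_delay * 2**(attempt - 1), capped at max_delay.
    The last failure is re-raised once max_attempts is exhausted.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            await sleep(delay)
```

The injectable sleep argument is only there so the backoff schedule can be checked in tests without real waiting.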
Plugin & Documentation
- /agent-brain-config command for guided backend selection and YAML generation
- /agent-brain-setup with Docker detection and docker-compose.postgres.yml startup
- Setup assistant with PostgreSQL error pattern recognition
- Docker Compose template for pgvector/pgvector:pg16
- Three new docs: PostgreSQL Setup, Configuration Reference, Performance Tradeoffs
CI Integration
- pgvector/pgvector:pg16 service container in GitHub Actions
- @pytest.mark.postgres marker with graceful skip when no DB available
- Contract tests validate identical behavior across backends
By the Numbers
| Metric | Value |
|---|---|
| Production code | ~3,200 lines across 12+ files |
| Tests | 675 passing (153 PostgreSQL-specific) |
| Server coverage | 73% |
| CLI coverage | 54% |
| Requirements | 34/34 implemented |
| Phases | 6 phases (5-10), all verified + UAT-approved |
Installation
# Server (with PostgreSQL support)
pip install "agent-brain-rag[postgres]==6.0.0"
# CLI
pip install agent-brain-cli==6.0.0
Quick Start with PostgreSQL
# 1. Start PostgreSQL with pgvector
docker compose -f docker-compose.postgres.yml up -d
# 2. Configure backend
export AGENT_BRAIN_STORAGE_BACKEND=postgres
export DATABASE_URL=postgresql+asyncpg://agent_brain:agent_brain@localhost:5432/agent_brain
# 3. Start server
agent-brain-serve
See the PostgreSQL Setup Guide for detailed instructions.
Breaking Changes
None. ChromaDB remains the default backend. All existing configurations continue to work unchanged.
Full Changelog: v5.0.0...v6.0.0
v5.0.0
Agent Brain v5.0.0 — Advanced RAG Pipeline
This major release delivers a complete Advanced RAG pipeline with two-stage reranking, pluggable embedding/summarization providers, schema-based GraphRAG, provider integration testing, and a storage abstraction layer. 169 files changed across 3 merged PRs with 559 tests passing at 70% coverage.
Highlights
- Two-Stage Reranking — Optional second-pass scoring with SentenceTransformer CrossEncoder or Ollama LLM reranker, with graceful fallback to stage-1 results on failure
- Pluggable Providers — Configuration-driven model selection for embeddings and summarization with dimension mismatch prevention, strict startup validation, and CLI config management
- Schema-Based GraphRAG — 17 entity types across Code/Doc/Infra categories with 8 relationship predicates and type-filtered graph queries
- Provider Integration Testing — Per-provider E2E test suites (OpenAI, Anthropic, Cohere, Ollama) with dedicated CI workflow
- Storage Backend Abstraction: StorageBackendProtocol (PEP 544) decoupling services from ChromaDB, enabling a future PostgreSQL/pgvector backend
Installation
# Standard installation
pip install agent-brain-rag==5.0.0 agent-brain-cli==5.0.0
# With GraphRAG support
pip install "agent-brain-rag[graphrag]==5.0.0" agent-brain-cli==5.0.0
# With Kuzu backend (production)
pip install "agent-brain-rag[graphrag-kuzu]==5.0.0" agent-brain-cli==5.0.0
What's New
1. Two-Stage Reranking (PR #108)
Adds an optional second-pass reranking stage to improve search precision. Stage 1 retrieves a broad set of candidates; Stage 2 re-scores them with a cross-encoder or LLM for better relevance ordering.
Providers:
- SentenceTransformerRerankerProvider: CrossEncoder-based (cross-encoder/ms-marco-MiniLM-L-6-v2), with lazy model loading and warm-up support
- OllamaRerankerProvider: LLM-based relevance scoring with circuit breaker pattern (3 failures → 60s cooldown)
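The circuit breaker described for the Ollama reranker (3 failures, then a 60s cooldown) might be sketched as follows. The class name, method names, and the decision to reset the failure count after the cooldown are assumptions; the notes only state the threshold and cooldown values.

```python
import time


class CircuitBreaker:
    """Stop calling a failing dependency for a cooldown period.

    Opens after `failure_threshold` consecutive failures and allows
    requests again once `cooldown_seconds` have passed. Hypothetical
    sketch, not the project's implementation.
    """

    def __init__(self, failure_threshold=3, cooldown_seconds=60.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def allow_request(self):
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown_seconds:
            # Cooldown elapsed: close the circuit and start counting afresh.
            self._opened_at = None
            self._failures = 0
            return True
        return False

    def record_success(self):
        self._failures = 0

    def record_failure(self):
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._opened_at = self._clock()
```

A caller would check allow_request() before invoking the reranker and fall back to stage-1 results whenever it returns False.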
Configuration:
export ENABLE_RERANKING=true # Off by default
export RERANKER_PROVIDER=sentence-transformers # or "ollama"
export RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2
export RERANKER_TOP_K_MULTIPLIER=10 # Stage 1 retrieves top_k * this
export RERANKER_MAX_CANDIDATES=100 # Cap on Stage 1 candidates
Design:
- Reranking is strictly optional — disabled by default
- Falls back to stage-1 results on any reranker failure
- Empty results from reranker trigger automatic fallback
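Put together, the over-retrieval settings and fallback rules above suggest a flow along these lines. The function two_stage_search and the retrieve/rerank callables are hypothetical stand-ins; only the multiplier, the candidate cap, and the fallback conditions come from these notes.

```python
def two_stage_search(query, top_k, retrieve, rerank,
                     multiplier=10, max_candidates=100):
    """Stage 1 over-retrieves, stage 2 re-scores, with fallback to stage 1.

    `retrieve(query, n)` returns a ranked list of n candidates and
    `rerank(query, candidates)` returns them re-ordered. Both callables
    are assumed interfaces for this sketch.
    """
    # Stage 1: fetch top_k * multiplier candidates, capped at max_candidates.
    n_candidates = min(top_k * multiplier, max_candidates)
    candidates = retrieve(query, n_candidates)

    # Stage 2: any reranker failure is swallowed so search still succeeds.
    try:
        reranked = rerank(query, candidates)
    except Exception:
        reranked = []

    # An empty rerank result also triggers the stage-1 fallback.
    if not reranked:
        return candidates[:top_k]
    return reranked[:top_k]
```

With the defaults, a request for top_k = 20 would ask stage 1 for 100 candidates rather than 200, since the cap applies after the multiplier.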
2. Pluggable Providers (PR #109 — Phase 2)
Configuration-driven model selection lets you swap embedding and summarization providers without code changes. Includes safety features to prevent dimension mismatches when switching providers on existing indexes.
Features:
- Embedding metadata storage in ChromaDB collection metadata
- Dual-layer validation: startup warning + indexing error (unless --force)
- /health/providers endpoint: inspect active provider configuration
- --strict flag / AGENT_BRAIN_STRICT_MODE env var: fail on critical validation errors
- CLI config commands: agent-brain config show and agent-brain config path
- Provider switching E2E tests with YAML config fixtures
- Ollama offline E2E tests
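The dual-layer validation listed above (warn at startup, fail at indexing unless forced) can be illustrated with a small sketch. The EmbeddingMeta fields, function names, and error class are all hypothetical; the notes only say that embedding metadata is stored with the collection and compared against the active provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingMeta:
    """Assumed shape of the metadata stored alongside an index."""
    provider: str
    model: str
    dimensions: int


class DimensionMismatchError(ValueError):
    """Raised when indexing would mix embeddings of different sizes."""


def describe_mismatch(stored, current):
    """Return a human-readable problem string, or None if compatible."""
    if stored is None or stored.dimensions == current.dimensions:
        return None
    return (
        f"index built with {stored.provider}/{stored.model} "
        f"({stored.dimensions} dims) but current provider is "
        f"{current.provider}/{current.model} ({current.dimensions} dims)"
    )


def validate_at_startup(stored, current):
    """Layer 1: report a warning message (or None) without raising."""
    return describe_mismatch(stored, current)


def validate_before_indexing(stored, current, force=False):
    """Layer 2: refuse to index on a mismatch unless force is set."""
    problem = describe_mismatch(stored, current)
    if problem and not force:
        raise DimensionMismatchError(problem)
```

In this sketch a strict mode would simply treat a non-None startup warning as fatal.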
Usage:
# View current provider configuration
agent-brain config show
# Start in strict mode (fail on validation errors)
agent-brain start --strict
# Force indexing with mismatched provider
agent-brain index /path/to/docs --force
3. Schema-Based GraphRAG (PR #109 - Phase 3)
Adds a domain-specific type system to GraphRAG, enabling structured entity extraction and type-filtered graph queries.
Entity Types (17 types across 3 categories):
| Category | Types |
|---|---|
| Code (7) | Package, Module, Class, Method, Function, Interface, Enum |
| Documentation (6) | DesignDoc, UserDoc, PRD, Runbook, README, APIDoc |
| Infrastructure (4) | Service, Endpoint, Database, ConfigFile |
Relationship Predicates (8):
calls, extends, implements, references, depends_on, imports, contains, defined_in
Type-Filtered Queries:
# Query with entity type filter via API
curl -X POST http://localhost:8000/query/ \
  -H "Content-Type: application/json" \
  -d '{
    "query": "authentication services",
    "mode": "graph",
    "entity_types": ["Class", "Service"],
    "relationship_types": ["calls", "implements"]
  }'
Features:
- normalize_entity_type() with acronym preservation (README, APIDoc, PRD)
- Schema-guided LLM extraction prompts organized by category
- GraphIndexManager.query_by_type() for filtered graph traversal
- QueryRequest.entity_types and QueryRequest.relationship_types API fields
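To show what acronym-preserving normalization could mean in practice, here is a guess at the idea using the 17 entity types from the table above. The function name normalize_entity_type_sketch and its matching rule (strip non-alphanumerics, compare case-insensitively) are assumptions; the real normalize_entity_type() may behave differently.

```python
# The 17 entity types listed in the schema table, in canonical casing.
ENTITY_TYPES = (
    "Package", "Module", "Class", "Method", "Function", "Interface", "Enum",
    "DesignDoc", "UserDoc", "PRD", "Runbook", "README", "APIDoc",
    "Service", "Endpoint", "Database", "ConfigFile",
)

# Lookup from a lowercase key to the canonical spelling, so acronyms such
# as README and PRD keep their casing instead of becoming "Readme"/"Prd".
_CANONICAL = {name.lower(): name for name in ENTITY_TYPES}


def normalize_entity_type_sketch(raw):
    """Map loosely formatted input to a canonical type, or None if unknown."""
    key = "".join(ch for ch in raw if ch.isalnum()).lower()
    return _CANONICAL.get(key)
```

A table lookup avoids the usual pitfall of title-casing, which would turn "readme" into "Readme" and "api_doc" into "Api_Doc".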
4. Provider Integration Testing (PR #109 — Phase 4)
Dedicated E2E test suites for each provider with a GitHub Actions CI workflow that runs against real APIs.
Test Suites:
- test_provider_openai.py: OpenAI embedding and completion tests
- test_provider_cohere.py: Cohere provider tests
- test_provider_ollama.py: Ollama offline/availability tests
- test_provider_switching.py: Provider switching with validation
CI:
- provider-e2e.yml workflow triggered on pushes with API key changes
- ci-testing environment with API key secrets
5. Storage Backend Abstraction (PR #110)
Decouples all storage operations from ChromaDB through a StorageBackendProtocol (PEP 544), enabling alternative backends like PostgreSQL/pgvector.
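For readers unfamiliar with PEP 544, the sketch below shows how a structural Protocol lets a service depend on a method shape rather than a concrete class. The two-method SearchBackend protocol and the InMemoryBackend class are invented for illustration; the real StorageBackendProtocol has 11 methods whose exact signatures are not given in these notes.

```python
import asyncio
from typing import Protocol, runtime_checkable


@runtime_checkable
class SearchBackend(Protocol):
    """Hypothetical two-method slice of a storage protocol."""

    async def vector_search(
        self, embedding: list[float], top_k: int
    ) -> list[tuple[str, float]]: ...

    async def keyword_search(
        self, query: str, top_k: int
    ) -> list[tuple[str, float]]: ...


class InMemoryBackend:
    """Satisfies SearchBackend structurally; no inheritance is needed."""

    def __init__(self, docs):
        self._docs = docs  # mapping of doc_id -> text

    async def vector_search(self, embedding, top_k):
        return []  # vector search is out of scope for this toy backend

    async def keyword_search(self, query, top_k):
        hits = [(doc_id, 1.0) for doc_id, text in self._docs.items()
                if query.lower() in text.lower()]
        return hits[:top_k]


async def search(backend: SearchBackend, query: str):
    # The service only relies on the protocol's methods.
    return await backend.keyword_search(query, top_k=5)


results = asyncio.run(search(InMemoryBackend({"a": "Auth service"}), "auth"))
```

Note that runtime_checkable isinstance checks only confirm the methods exist; signature conformance is left to a static type checker.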
Architecture:
- StorageBackendProtocol: async-first Protocol with 11 methods (vector search, keyword search, upsert, reset, embedding metadata, etc.)
- ChromaBackend: adapter wrapping existing VectorStoreManager + BM25IndexManager behind the protocol
- Backend factory: config-driven selection (AGENT_BRAIN_STORAGE_BACKEND env var > YAML storage.backend > default chroma)
- Service refactor: QueryService and IndexingService now depend on StorageBackendProtocol, not concrete classes
- BM25 score normalization: per-query max normalization to 0-1 range for consistent hybrid search fusion
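Per-query max normalization, as named in the last item, amounts to dividing every score in one result set by that set's top score. The helper below is a minimal sketch under the assumption that raw BM25 scores are non-negative (negative values are clamped to 0); the function name is hypothetical.

```python
def normalize_by_query_max(scores):
    """Scale one query's raw scores into the 0-1 range by its max score.

    Returns an empty list for no results and all zeros when the best
    score is not positive, avoiding a division by zero.
    """
    if not scores:
        return []
    top = max(scores)
    if top <= 0:
        return [0.0 for _ in scores]
    return [max(score, 0.0) / top for score in scores]


normalized = normalize_by_query_max([12.0, 6.0, 3.0])
```

Because the divisor is recomputed per query, the best keyword hit always scores 1.0, which puts it on the same footing as a 0-1 vector similarity before fusion.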
Configuration:
# Select storage backend (default: chroma)
export AGENT_BRAIN_STORAGE_BACKEND=chroma
# Or via YAML config
# storage:
#   backend: chroma
Configuration Reference
New Environment Variables in v5.0.0
| Variable | Default | Description |
|---|---|---|
| ENABLE_RERANKING | false | Enable two-stage reranking |
| RERANKER_PROVIDER | sentence-transformers | Reranker backend (sentence-transformers or ollama) |
| RERANKER_MODEL | cross-encoder/ms-marco-MiniLM-L-6-v2 | Reranker model |
| RERANKER_TOP_K_MULTIPLIER | 10 | Stage 1 over-retrieval factor |
| RERANKER_MAX_CANDIDATES | 100 | Max Stage 1 candidates |
| AGENT_BRAIN_STRICT_MODE | false | Fail on critical validation errors |
| AGENT_BRAIN_STORAGE_BACKEND | chroma | Storage backend selection |
Upgrade Notes
- Non-breaking: All new features are opt-in. Existing installations continue working without changes.
- Reranking: Disabled by default. Enable with ENABLE_RERANKING=true.
- Provider validation: Warns on startup if embedding dimensions mismatch. Use --strict to make warnings fatal.
- Storage backend: Defaults to ChromaDB. The abstraction layer is transparent to existing users.
- Re-indexing: To use schema-based GraphRAG type filtering, re-index with agent-brain index /path --rebuild.
Stats
- 169 files changed, 45,064 insertions, 1,507 deletions
- 559 tests passing, 70% coverage
- 3 PRs merged: #108, #109, #110
- 5 phases executed with 100% must-have verification
Full Changelog
Features:
- feat: two-stage reranking with SentenceTransformer and Ollama providers
- feat: pluggable embedding/summarization providers with config-driven selection
- feat: embedding metadata storage and dimension mismatch prevention
- feat: strict startup validation mode (--strict / AGENT_BRAIN_STRICT_MODE)
- feat: /health/providers endpoint
- feat: agent-brain config show/path CLI commands
- feat: schema-based GraphRAG with 17 entity types and 8 relationship predicates
- feat: query_by_type() graph query with entity/relationship filtering
- feat: QueryRequest.entity_types and QueryRequest.relationship_types API fields
- feat: StorageBackendProtocol (PEP 544) with 11 async methods
- feat: ChromaBackend adapter implementing the storage protocol
- feat: backend factory with env var > YAML > default config precedence
- feat: per-provider E2E test suites (OpenAI, Cohere, Ollama)
- feat: GitHub Actions provider E2E workflow
Fixes:
- fix: BM25 score normalization for consistent hybrid search fusion
- fix: lazy-import LanguageDetector to break circular import on CI
- fix: reranker graceful fallback on empty results
CI/CD:
- ci: provider-e2e.yml workflow with ci-testing environment
- ci: pip and HuggingFace caching for faster builds
- ci: commit poetry.lock files for reproducible builds
PyPI Packages
v4.0.0 - GraphRAG Integration
Agent Brain v4.0.0 - GraphRAG Integration
This major release introduces GraphRAG (Knowledge Graph-based Retrieval Augmented Generation), enabling relationship-aware search for code dependencies, entity connections, and architectural exploration.
Highlights
- GraphRAG Search Mode: Query code relationships, dependencies, and entity connections
- Multi-Mode Fusion: Combine vector + BM25 + graph search with Reciprocal Rank Fusion (RRF)
- Entity & Relationship Extraction: Automatically extract functions, classes, modules and their relationships (calls, imports, inherits)
- Pluggable Graph Stores: Simple (default) or Kuzu (production) backends
Using GraphRAG
Installation
# Standard installation
pip install agent-brain-rag agent-brain-cli
# With GraphRAG support (LLM extraction)
pip install "agent-brain-rag[graphrag]" agent-brain-cli
# With Kuzu backend (production)
pip install "agent-brain-rag[graphrag-kuzu]" agent-brain-cli
Enable GraphRAG
Set environment variables before starting the server:
# Required: master switch
export ENABLE_GRAPH_INDEX=true
# Optional configuration
export GRAPH_STORE_TYPE=simple # or kuzu
export GRAPH_INDEX_PATH=./graph_index
export GRAPH_USE_CODE_METADATA=true # Extract from AST metadata
export GRAPH_USE_LLM_EXTRACTION=true # Use LLM extractor
export GRAPH_TRAVERSAL_DEPTH=2 # Query depth
# Start server and index
agent-brain start
agent-brain index /path/to/code --rebuild
Verify GraphRAG is Enabled
agent-brain status
# Look for: Graph Index: enabled (X entities, Y relationships)
# Or with JSON output
agent-brain status --json | jq '.graph_index'
CLI Usage
Graph Search (Relationship Queries)
# What calls a specific function
agent-brain query "what functions call process_payment" --mode graph
# Class inheritance
agent-brain query "classes that inherit from BaseService" --mode graph
# Module dependencies
agent-brain query "modules that import authentication" --mode graph
# More results with lower threshold
agent-brain query "dependencies of UserController" --mode graph --top-k 10 --threshold 0.2
Multi-Mode Search (Comprehensive)
Combines vector + BM25 + graph search using Reciprocal Rank Fusion:
agent-brain query "complete authentication implementation" --mode multi --top-k 10
Claude Code Plugin Commands
/agent-brain-graph <query> # Graph-only search
/agent-brain-multi <query> # Multi-mode fusion search
/agent-brain-search <query> # Default hybrid search
API Usage
Graph Search Endpoint
curl -X POST http://localhost:8000/query/ \
  -H "Content-Type: application/json" \
  -d '{
    "query": "what functions call authenticate_user",
    "mode": "graph",
    "top_k": 10,
    "similarity_threshold": 0.3
  }'
Multi-Mode Search
curl -X POST http://localhost:8000/query/ \
  -H "Content-Type: application/json" \
  -d '{
    "query": "complete authentication implementation",
    "mode": "multi",
    "top_k": 10,
    "similarity_threshold": 0.3
  }'
Search Mode Comparison
| Mode | Speed | Best For | Example Query |
|---|---|---|---|
| bm25 | Fast (10-50ms) | Technical terms, exact matches | "AuthenticationError" |
| vector | Slower (800-1500ms) | Concepts, natural language | "how authentication works" |
| hybrid | Slower (1000-1800ms) | Balanced results | "OAuth implementation guide" |
| graph | Medium (500-1200ms) | Relationships, dependencies | "what calls AuthService" |
| multi | Slowest (1500-2500ms) | Most comprehensive | "complete auth flow" |
When to Use Each Mode
| Query Type | Recommended Mode |
|---|---|
| "what calls X" | graph |
| "dependencies of X" | graph |
| "classes that inherit from X" | graph |
| "how does X work" | vector or hybrid |
| "find error message X" | bm25 |
| "complete implementation of X" | multi |
Configuration Reference
Environment Variables
| Variable | Default | Description |
|---|---|---|
| ENABLE_GRAPH_INDEX | false | Master switch for GraphRAG |
| GRAPH_STORE_TYPE | simple | Graph backend: simple or kuzu |
| GRAPH_INDEX_PATH | ./graph_index | Graph storage path |
| GRAPH_USE_CODE_METADATA | true | Extract from AST |
| GRAPH_USE_LLM_EXTRACTION | true | Use LLM extractor |
| GRAPH_MAX_TRIPLETS_PER_CHUNK | 10 | Max triplets per chunk |
| GRAPH_TRAVERSAL_DEPTH | 2 | Query traversal depth |
| GRAPH_RRF_K | 60 | RRF constant for multi mode |
| GRAPH_EXTRACTION_MODEL | claude-haiku-4-5 | LLM for extraction |
Upgrade Notes
- Breaking Change: Version 4.0.0 introduces new query modes (graph, multi) that require GraphRAG to be enabled
- Migration: Existing installations work without changes; GraphRAG is opt-in via ENABLE_GRAPH_INDEX=true
- Re-indexing: To use GraphRAG, re-index documents after enabling: agent-brain index /path --rebuild
Full Changelog
Features:
- Add GraphRAG integration with entity and relationship extraction
- Add graph search mode for dependency and relationship queries
- Add multi search mode combining vector + BM25 + graph via RRF
- Add pluggable graph stores (simple, kuzu)
- Add /agent-brain-graph and /agent-brain-multi plugin commands
- Add GraphRAG configuration documentation
Documentation:
- Add comprehensive GraphRAG usage guide
- Add graph search reference documentation
- Add GraphRAG verification checklist to configuring skill
- Update CLAUDE.md with pre-push requirements
Full Documentation: https://github.com/SpillwaveSolutions/agent-brain/wiki
PyPI Packages (available in ~5 minutes):
v3.0.0 - Server-Side Job Queue
What's Changed
Features
- Server-Side Job Queue: Async indexing with JSONL persistence and file locking (#105)
  - JobQueueStore with JSONL persistence
  - JobWorker background processor with timeout handling
  - JobQueueService with deduplication and backpressure
  - New endpoints: GET /index/jobs/, GET /index/jobs/{id}, DELETE /index/jobs/{id}
- CLI Jobs Command: Full job management from CLI
  - agent-brain jobs - List all jobs in queue
  - agent-brain jobs --watch - Watch queue with live Rich table updates
  - agent-brain jobs JOB_ID - Show detailed job information
  - agent-brain jobs JOB_ID --cancel - Cancel a pending or running job
- Runtime Autodiscovery: CLI automatically finds server URL
  - CLI reads .claude/agent-brain/runtime.json for server URL
  - Foreground mode now writes runtime.json before exec
  - Resolution order: AGENT_BRAIN_URL > runtime.json > config.yaml > default :8000
- Local Integration Check: E2E validation script
  - scripts/local_integration_check.sh for pre-release validation
  - Added task local-integration to Taskfile
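The resolution order given for Runtime Autodiscovery (AGENT_BRAIN_URL, then runtime.json, then config.yaml, then the :8000 default) could be implemented along the lines below. The key names base_url and server_url, the localhost default host, and the function itself are assumptions; the notes do not describe the layout of runtime.json or config.yaml.

```python
DEFAULT_URL = "http://localhost:8000"  # host is assumed; the notes give only the port


def resolve_server_url(env, runtime_state=None, config=None):
    """Return the first server URL found, following the documented order.

    `env` is a mapping such as os.environ; `runtime_state` and `config`
    are the parsed contents of runtime.json and config.yaml, or None
    when the file is missing. Key names are hypothetical.
    """
    candidates = (
        env.get("AGENT_BRAIN_URL"),
        (runtime_state or {}).get("base_url"),
        (config or {}).get("server_url"),
    )
    for candidate in candidates:
        if candidate:
            return candidate.rstrip("/")
    return DEFAULT_URL
```

Treating a missing file as None keeps the lookup tolerant of projects that have never started a server and therefore have no runtime.json yet.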
Breaking Changes
- POST /index now returns 202 Accepted with job_id (was blocking)
- POST /index/add now returns 202 Accepted with job_id
- Response includes queue_position, queue_length, dedupe_hit
- Removed: --daemon flag (server backgrounds by default)
Bug Fixes
- fix: remove "OPENAI required" messages, mention Ollama as free option
- fix: match plugin name in marketplace.json with plugin.json
- fix: update configuring-agent-brain skill to show Ollama first
- fix: put Ollama first in all provider selection prompts
- fix: correct marketplace.json schema for Claude Code plugin installation
Migration Notes
For API Clients:
If your code waits for indexing completion synchronously, update to poll the job status:
import time
import requests
url = "http://localhost:8000"  # assumed server URL; adjust for your setup
# Before (v2.x)
response = requests.post(f"{url}/index", json={"folder_path": "/docs"})
# Blocking - returns when done
# After (v3.x)
response = requests.post(f"{url}/index", json={"folder_path": "/docs"})
job_id = response.json()["job_id"]
# Poll for completion
while True:
    status = requests.get(f"{url}/index/jobs/{job_id}").json()
    if status["status"] in ["done", "failed", "cancelled"]:
        break
    time.sleep(2)  # wait between polls to avoid hammering the server
For CLI Users:
No changes required. The agent-brain index command works as before but now returns immediately with a job ID. Use agent-brain jobs --watch to monitor progress.
About Agent Brain
Agent Brain provides intelligent document indexing and semantic search for AI agents:
- Semantic Search: Natural language queries via OpenAI embeddings
- Keyword Search (BM25): Traditional keyword matching with BM25, a TF-IDF-style ranking function
- GraphRAG: Knowledge graph retrieval for relationship-aware queries
- Hybrid Search: Best of vector + keyword approaches
- Pluggable Providers: Choose your embedding and summarization providers
PyPI Packages
- agent-brain-rag: https://pypi.org/project/agent-brain-rag/3.0.0/
- agent-brain-cli: https://pypi.org/project/agent-brain-cli/3.0.0/
Installation
pip install agent-brain-rag==3.0.0 agent-brain-cli==3.0.0
# With GraphRAG support
pip install "agent-brain-rag[graphrag]==3.0.0"
Documentation
Full Changelog: v2.0.0...v3.0.0
v1.2.0 - Agent Brain Naming Unification
Agent Brain Naming Unification (Breaking Change)
This release completes the rebranding from 'doc-serve' to 'agent-brain' for the entire project.
New Package Names
- PyPI Server: agent-brain-rag (was doc-serve-rag)
- PyPI CLI: agent-brain-cli (was doc-svr-ctl)
New Commands
- Server: agent-brain-serve (replaces doc-serve)
- CLI: agent-brain (replaces doc-svr-ctl)
Migration Guide
The old commands (doc-serve, doc-svr-ctl) still work but will show deprecation warnings. See MIGRATION.md for full migration instructions.
Directory Structure Changes
- doc-serve-server/ → agent-brain-server/
- doc-svr-ctl/ → agent-brain-cli/
- doc-serve-skill/ → agent-brain-skill/
Install
# Server
pip install agent-brain-rag
# CLI
pip install agent-brain-cli
Full Changelog
v1.1.0 - First PyPI Release
Doc-Serve v1.1.0 - First PyPI Release
This release marks the first publication of doc-serve packages to PyPI.
Installation
pip install agent-brain-rag agent-brain-cli
PyPI Packages
| Package | PyPI | Description |
|---|---|---|
| agent-brain-rag | PyPI | RAG server |
| agent-brain-cli | PyPI | CLI management tool |
Note: Package names are agent-brain-rag and agent-brain-cli due to PyPI naming constraints. CLI commands remain doc-serve and doc-svr-ctl.
What's New
Package Changes
- First PyPI release as agent-brain-rag and agent-brain-cli
- Version 1.1.0
New Features
- MIT LICENSE added
- PyPI metadata (homepage, repository, keywords, classifiers)
- pip installation support
Documentation
- READMEs updated with pip install instructions
- Skill documentation updated to use pip install
CLI Commands (Unchanged)
| Command | Description |
|---|---|
| doc-serve | Start the RAG server |
| doc-svr-ctl init | Initialize project for doc-serve |
| doc-svr-ctl start | Start server for current project |
| doc-svr-ctl stop | Stop running server |
| doc-svr-ctl status | Check server status |
| doc-svr-ctl query | Search indexed documents |
| doc-svr-ctl index | Index documents |
| doc-svr-ctl list | List running instances |
Quick Start
# Install
pip install agent-brain-rag agent-brain-cli
# Initialize and start
doc-svr-ctl init
doc-svr-ctl start --daemon
# Index and search
doc-svr-ctl index /path/to/docs
doc-svr-ctl query "search term" --mode hybrid
# Stop when done
doc-svr-ctl stop
Requirements
- Python 3.10+
- OpenAI API key (for embeddings)
- Anthropic API key (optional, for summarization)
1.0.0
First version.
Supports vector-index RAG.
Supports command-line tools.
Includes a skill that uses the command-line tools to index docs.
What's Changed
- Feat/phase1 finalization and qa gate by @RichardHightower in #1
New Contributors
- @RichardHightower made their first contribution in #1
Full Changelog: https://github.com/SpillwaveSolutions/doc-serve-skill/commits/v1.0.0