docs: complete v6.0 PostgreSQL Backend milestone

RichardHightower · claude · RichardHightower · commit 83b27be281bd · 2026-02-13T17:19:42.000-06:00
- Move 34 v6.0 requirements from Active to Validated in PROJECT.md
- Update context section with v6.0 stats (675 tests, 73% coverage)
- Add v6.0 key decisions (protocol abstraction, pgvector, async SQLAlchemy, RRF)
- Mark v6.0 milestone as shipped in ROADMAP.md
- All six v6.0 phases (5-10) executed, verified, and UAT-approved

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
@@ -56,20 +56,44 @@ Agent Brain is a local-first RAG (Retrieval-Augmented Generation) service that i
 - ✓ **TEST-05**: Provider health check endpoint — v3.0 (Feature 124)
 - ✓ **TEST-06**: Verified provider configuration documentation — v3.0 (Feature 124)
 
-### Active
-
-**Current Milestone: v6.0 PostgreSQL Backend**
+- ✓ **STOR-01**: Storage abstraction protocol (11 methods) — v6.0
+- ✓ **STOR-02**: Backend factory with env/YAML/default selection — v6.0
+- ✓ **STOR-03**: ChromaDB backend wraps existing vector_store/bm25_manager — v6.0
+- ✓ **STOR-04**: Contract tests for backend protocol compliance — v6.0
+- ✓ **STOR-05**: Legacy parameter backward compatibility — v6.0
+- ✓ **PGVEC-01**: pgvector extension for vector similarity search — v6.0
+- ✓ **PGVEC-02**: Cosine, L2, inner product distance metrics — v6.0
+- ✓ **PGVEC-03**: HNSW index with configurable m/ef_construction — v6.0
+- ✓ **PGVEC-04**: Embedding dimension validation — v6.0
+- ✓ **PGFTS-01**: tsvector full-text search with GIN index — v6.0
+- ✓ **PGFTS-02**: Weighted relevance (A/B/C) for title/summary/content — v6.0
+- ✓ **PGFTS-03**: Configurable language for stemming — v6.0
+- ✓ **PGFTS-04**: RRF hybrid fusion for vector + keyword results — v6.0
+- ✓ **INFRA-01**: Docker Compose for pgvector:pg16 development setup — v6.0
+- ✓ **INFRA-02**: Async connection pooling with SQLAlchemy — v6.0
+- ✓ **INFRA-03**: `/health/postgres` endpoint with pool metrics — v6.0
+- ✓ **INFRA-04**: Auto schema initialization on backend startup — v6.0
+- ✓ **INFRA-05**: Poetry extras for optional PostgreSQL dependencies — v6.0
+- ✓ **CONF-01**: YAML storage.backend + storage.postgres configuration — v6.0
+- ✓ **CONF-02**: Connection params (host, port, pool size, HNSW) — v6.0
+- ✓ **CONF-03**: DATABASE_URL env var override — v6.0
+- ✓ **V6TEST-01**: Contract tests with pytest markers + skip-without-DB — v6.0
+- ✓ **V6TEST-02**: CI PostgreSQL service container in GitHub Actions — v6.0
+- ✓ **V6TEST-03**: Backend wiring smoke tests (mock-based) — v6.0
+- ✓ **V6TEST-04**: Service-level PostgreSQL E2E tests — v6.0
+- ✓ **PLUG-01**: `/agent-brain-config` command for backend selection — v6.0
+- ✓ **PLUG-02**: YAML generation for PostgreSQL config — v6.0
+- ✓ **PLUG-03**: `/agent-brain-setup` with Docker detection — v6.0
+- ✓ **PLUG-04**: PostgreSQL error pattern recognition in setup agent — v6.0
+- ✓ **PLUG-05**: docker-compose.postgres.yml template — v6.0
+- ✓ **PLUG-06**: Plugin version bump to v5.0.0 — v6.0
+- ✓ **DOCS-01**: PostgreSQL setup guide — v6.0
+- ✓ **DOCS-02**: Full configuration reference — v6.0
+- ✓ **DOCS-03**: ChromaDB vs PostgreSQL performance tradeoffs guide — v6.0
 
-**Goal:** Add PostgreSQL as a configurable storage backend with pgvector for vector search and tsvector for full-text search, running alongside ChromaDB as a dual-backend architecture.
+### Active
 
-**Target features:**
-- Storage backend abstraction layer (ChromaDB default, PostgreSQL optional)
-- pgvector for vector similarity search
-- tsvector for full-text search (replaces BM25 when using PostgreSQL)
-- Hybrid retrieval (RRF fusion) works with PostgreSQL backend
-- Backend selection via YAML provider config
-- Docker Compose for local PostgreSQL + pgvector development setup
-- E2E tests for PostgreSQL backend
+No active milestone. Next: v7.0+ (AWS Bedrock / Vertex AI providers).
 
 ### Out of Scope
 
@@ -85,17 +109,19 @@ Agent Brain is a local-first RAG (Retrieval-Augmented Generation) service that i
 
 ## Context
 
-**Current State (v3.0 shipped 2026-02-10):**
-- 12,858 LOC server Python + 13,171 LOC tests
-- 505 tests passing, 70% coverage
+**Current State (v6.0 completed 2026-02-13):**
+- ~3,200 lines PostgreSQL backend code across 12+ files
+- 675 tests passing (153 PostgreSQL-specific), 73% server coverage
+- Dual-backend architecture: ChromaDB (default) + PostgreSQL (optional)
+- pgvector for vector search, tsvector for full-text search
 - 7 embedding/summarization/reranking providers supported
-- Full GraphRAG with schema-based entity types
-- CI with provider matrix testing
+- Full GraphRAG with schema-based entity types (ChromaDB only)
+- CI with provider matrix testing + PostgreSQL service container
 
 **Technology Stack:**
 - Python 3.10+ with Poetry packaging
 - FastAPI + Uvicorn server
-- ChromaDB vector store
+- ChromaDB vector store (default) + PostgreSQL/pgvector (optional)
 - LlamaIndex for document processing
 - Pluggable providers: OpenAI, Anthropic, Ollama, Cohere, Gemini, Grok, SentenceTransformers
 
@@ -129,6 +155,12 @@ Agent Brain is a local-first RAG (Retrieval-Augmented Generation) service that i
 | JSONL job queue over Redis | Local-first, no external dependencies | ✓ Good |
 | Minimal FastAPI app for health endpoint tests | Avoids ChromaDB initialization in test environment | ✓ Good |
 | CI matrix with conditional API key checks | Tests skip gracefully, config tests always run | ✓ Good |
+| StorageBackendProtocol abstraction | Clean separation, contract-testable, dual-backend support | ✓ Good |
+| pgvector + tsvector over BM25 for PostgreSQL | Native DB features, no separate index files, better scaling | ✓ Good |
+| Async SQLAlchemy for PostgreSQL connections | Non-blocking I/O, connection pooling built-in | ✓ Good |
+| RRF fusion for PostgreSQL hybrid search | Same algorithm as ChromaDB, consistent cross-backend behavior | ✓ Good |
+| GraphRAG stays ChromaDB-only | Avoids complexity, deferred to future milestone | ✓ Good |
+| Conditional ChromaDB init in main.py lifespan | PostgreSQL backend skips ChromaDB setup entirely | ✓ Good |
 
 ---
-*Last updated: 2026-02-12 after v6.0 milestone started*
+*Last updated: 2026-02-13 after v6.0 milestone completed*
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
@@ -6,7 +6,7 @@
 ## Milestones
 
 - ✅ **v3.0 Advanced RAG** — Phases 1-4 (shipped 2026-02-10)
-- 🚧 **v6.0 PostgreSQL Backend** — Phases 5-10 (gap closure in progress)
+- ✅ **v6.0 PostgreSQL Backend** — Phases 5-10 (shipped 2026-02-13)
 
 ## Phases
 
@@ -22,7 +22,7 @@
 
 </details>
 
-### 🚧 v6.0 PostgreSQL Backend (Gap Closure)
+### ✅ v6.0 PostgreSQL Backend — SHIPPED 2026-02-13
 
 **Milestone Goal:** Add PostgreSQL as a configurable storage backend with pgvector for vector search and tsvector for full-text search, running alongside ChromaDB as a dual-backend architecture.
 
@@ -190,4 +190,4 @@ Feature 101: AST-aware code ingestion, code summaries
 
 ---
 *Roadmap created: 2026-02-07*
-*Last updated: 2026-02-13 — Phase 10 planned (1 plan)*
+*Last updated: 2026-02-13 — v6.0 milestone completed (all 10 phases shipped)*
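
Note on PGFTS-04 and the "RRF fusion for PostgreSQL hybrid search" decision: the diff names Reciprocal Rank Fusion but does not show it. The sketch below is a minimal, generic illustration of RRF, not the project's implementation. The function name `rrf_fuse`, the input shape (ranked lists of document IDs), and the constant `k=60` (the value commonly used in the RRF literature) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def rrf_fuse(rankings: Sequence[Sequence[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Hypothetical helper: each document scores sum(1 / (k + rank)) over the
    lists it appears in, with ranks starting at 1. `k` dampens the influence
    of top ranks; 60 is a conventional default, assumed here.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Example: one list from vector search, one from keyword (full-text) search.
vector_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = rrf_fuse([vector_hits, keyword_hits])
fused_ids = [doc_id for doc_id, _ in fused]
```

Because RRF uses only rank positions and never raw scores, it can combine a cosine-distance ranking with a full-text relevance ranking without normalizing either. That fits the stated rationale of consistent cross-backend behavior.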