
Commit de14282

Merge pull request #16 from alex-feel/alex-feel-dev
Add PostgreSQL backend with pgvector semantic search
2 parents 4ba331d + 1f1600e commit de14282

49 files changed

Lines changed: 5598 additions & 2371 deletions

CLAUDE.md

Lines changed: 234 additions & 35 deletions
Large diffs are not rendered by default.

CONTRIBUTING.md

Lines changed: 4 additions & 1 deletion
@@ -52,6 +52,9 @@ uv run pytest tests/test_server.py
 uv run pytest tests/test_metadata_filtering.py -v
 uv run pytest tests/test_metadata_error_handling.py -v

+# Run semantic search tests
+uv run pytest tests/test_semantic_search_filters.py -v
+
 # Run integration tests only
 uv run pytest -m integration

@@ -176,7 +179,7 @@ The server uses a clean repository pattern to separate concerns:
 - **TagRepository**: Handles tag normalization, relationships, and tag replacement for updates
 - **ImageRepository**: Manages multimodal attachments and image replacement for updates
 - **StatisticsRepository**: Provides metrics and thread statistics
-- **DatabaseConnectionManager**: Thread-safe connection pooling with retry logic
+- **StorageBackend**: Thread-safe connection pooling with retry logic
 - **MetadataQueryBuilder**: Secure SQL generation for metadata filtering with 15 operators
 - **MetadataFilter**: Type-safe filter specifications with validation

README.md

Lines changed: 101 additions & 6 deletions
@@ -11,8 +11,9 @@ A high-performance Model Context Protocol (MCP) server providing persistent mult
 - **Thread-Based Scoping**: Agents working on the same task share context through thread IDs
 - **Flexible Metadata Filtering**: Store custom structured data with any JSON-serializable fields and filter using 15 powerful operators
 - **Tag-Based Organization**: Efficient context retrieval with normalized, indexed tags
-- **Semantic Search**: Optional vector similarity search using EmbeddingGemma for meaning-based retrieval
-- **High Performance**: SQLite with WAL mode, strategic indexing, and async operations
+- **Semantic Search**: Optional vector similarity search for meaning-based retrieval
+- **Multiple Database Backends**: Choose between SQLite (default, zero-config) or PostgreSQL (high-concurrency, production-grade)
+- **High Performance**: WAL mode (SQLite) / MVCC (PostgreSQL), strategic indexing, and async operations
 - **MCP Standard Compliance**: Works with Claude Code, LangGraph, and any MCP-compatible client
 - **Production Ready**: Comprehensive test coverage, type safety, and robust error handling

@@ -33,6 +34,12 @@ claude mcp add context-server -- uvx mcp-context-server

 # Or from GitHub (latest development version)
 claude mcp add context-server -- uvx --from git+https://github.com/alex-feel/mcp-context-server mcp-context-server
+
+# Or with semantic search (for setup instructions, see the docs/semantic-search.md)
+claude mcp add context-server -- uvx --with mcp-context-server[semantic-search] mcp-context-server
+
+# Or from GitHub (latest development version) with semantic search (for setup instructions, see docs/semantic-search.md)
+claude mcp add context-server -- uvx --from git+https://github.com/alex-feel/mcp-context-server --with mcp-context-server[semantic-search] mcp-context-server
 ```

 For more details, see: https://docs.claude.com/en/docs/claude-code/mcp#option-1%3A-add-a-local-stdio-server
@@ -88,7 +95,7 @@ Example configuration with environment variables:
       "args": ["mcp-context-server"],
       "env": {
         "LOG_LEVEL": "${LOG_LEVEL:-INFO}",
-        "MCP_CONTEXT_DB": "${MCP_CONTEXT_DB:-~/.mcp/context_storage.db}",
+        "DB_PATH": "${DB_PATH:-~/.mcp/context_storage.db}",
         "MAX_IMAGE_SIZE_MB": "${MAX_IMAGE_SIZE_MB:-10}",
         "MAX_TOTAL_SIZE_MB": "${MAX_TOTAL_SIZE_MB:-100}"
       }
@@ -101,15 +108,26 @@ For more details on environment variable expansion, see: https://docs.claude.com

 ### Supported Environment Variables

+**Core Settings:**
+- **STORAGE_BACKEND**: Database backend - `sqlite` (default) or `postgresql`
 - **LOG_LEVEL**: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) - defaults to INFO
-- **MCP_CONTEXT_DB**: Database file location - defaults to ~/.mcp/context_storage.db
+- **DB_PATH**: Database file location (SQLite only) - defaults to ~/.mcp/context_storage.db
 - **MAX_IMAGE_SIZE_MB**: Maximum size per image in MB - defaults to 10
 - **MAX_TOTAL_SIZE_MB**: Maximum total request size in MB - defaults to 100
+
+**Semantic Search Settings:**
 - **ENABLE_SEMANTIC_SEARCH**: Enable semantic search functionality (true/false) - defaults to false
 - **OLLAMA_HOST**: Ollama API host URL for embedding generation - defaults to http://localhost:11434
 - **EMBEDDING_MODEL**: Embedding model name for semantic search - defaults to embeddinggemma:latest
 - **EMBEDDING_DIM**: Embedding vector dimensions - defaults to 768. **Note**: Changing this after initial setup requires database migration (see [Semantic Search Guide](docs/semantic-search.md#changing-embedding-dimensions))

+**PostgreSQL Settings** (only when STORAGE_BACKEND=postgresql):
+- **POSTGRESQL_HOST**: PostgreSQL server host - defaults to localhost
+- **POSTGRESQL_PORT**: PostgreSQL server port - defaults to 5432
+- **POSTGRESQL_USER**: PostgreSQL username - defaults to postgres
+- **POSTGRESQL_PASSWORD**: PostgreSQL password - defaults to postgres
+- **POSTGRESQL_DATABASE**: PostgreSQL database name - defaults to mcp_context
+
 ### Advanced Configuration

 Additional environment variables are available for advanced server tuning, including:
@@ -125,6 +143,81 @@ For a complete list of all configuration options, see [app/settings.py](app/sett

 For detailed instructions on enabling optional semantic search with Ollama and EmbeddingGemma, see the [Semantic Search Guide](docs/semantic-search.md).

+## Database Backends
+
+The server supports multiple database backends, selectable via the `STORAGE_BACKEND` environment variable.
+
+### SQLite (Default)
+
+Zero-configuration local storage, perfect for single-user deployments.
+
+**Features:**
+- No installation required - works out of the box
+- Production-grade connection pooling and write queue
+- WAL mode for better concurrency
+- Suitable for single-user and moderate workloads
+
+**Configuration:** No configuration needed - just start the server!
+
+### PostgreSQL
+
+High-performance backend for multi-user and high-traffic deployments.
+
+**Features:**
+- 10x+ write throughput vs SQLite via MVCC
+- Native concurrent write support
+- JSONB indexing for fast metadata queries
+- Production-grade connection pooling with asyncpg
+- pgvector extension for semantic search
+
+**Quick Start with Docker:**
+
+Running PostgreSQL with pgvector is incredibly simple - just 2 commands:
+
+```bash
+# 1. Pull and run PostgreSQL with pgvector (all-in-one)
+docker run --name pgvector18 \
+  -e POSTGRES_USER=postgres \
+  -e POSTGRES_PASSWORD=postgres \
+  -e POSTGRES_DB=mcp_context \
+  -p 5432:5432 \
+  -d pgvector/pgvector:pg18-trixie
+
+# 2. Configure the server (minimal setup - just 2 variables)
+export STORAGE_BACKEND=postgresql
+export ENABLE_SEMANTIC_SEARCH=true # Optional: only if you need semantic search
+```
+
+**That's it!** The server will automatically:
+- Connect to PostgreSQL on startup
+- Initialize the schema (creates tables and indexes)
+- Enable pgvector extension (comes pre-installed in the Docker image)
+- Apply semantic search migration if enabled
+
+**Configuration in .mcp.json:**
+
+```json
+{
+  "mcpServers": {
+    "context-server": {
+      "type": "stdio",
+      "command": "uvx",
+      "args": ["mcp-context-server"],
+      "env": {
+        "STORAGE_BACKEND": "postgresql",
+        "POSTGRESQL_HOST": "localhost",
+        "POSTGRESQL_USER": "postgres",
+        "POSTGRESQL_PASSWORD": "postgres",
+        "POSTGRESQL_DATABASE": "mcp_context",
+        "ENABLE_SEMANTIC_SEARCH": "true"
+      }
+    }
+  }
+}
+```
+
+**Note:** PostgreSQL settings are only needed when using PostgreSQL. The server uses SQLite by default if `STORAGE_BACKEND` is not set.
+
 ## API Reference

 ### Tools
@@ -275,11 +368,13 @@ Update specific fields of an existing context entry.
 - List of updated fields
 - Success/error message

-#### semantic_search_tool
+#### semantic_search_context

 Perform semantic similarity search using vector embeddings.

-Note: This tool is only available when semantic search is enabled via `ENABLE_SEMANTIC_SEARCH=true` and all dependencies are installed (ollama, numpy, sqlite-vec packages, and EmbeddingGemma model).
+Note: This tool is only available when semantic search is enabled via `ENABLE_SEMANTIC_SEARCH=true` and all dependencies are installed. The implementation varies by backend:
+- **SQLite**: Uses sqlite-vec extension with embedding model via Ollama
+- **PostgreSQL**: Uses pgvector extension (pre-installed in pgvector Docker image) with embedding model via Ollama

 **Parameters:**
 - `query` (str, required): Natural language search query
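
Reviewer sketch (not part of the commit): the README changes above document `STORAGE_BACKEND`, `DB_PATH`, and the `POSTGRESQL_*` variables with their defaults. The snippet below illustrates one way those variables could be resolved into backend settings. The names `BackendSettings` and `resolve_backend_settings` are hypothetical and do not come from the repository; the real logic lives in `app/settings.py`, which this page does not show, and may work differently.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional
from urllib.parse import quote


@dataclass(frozen=True)
class BackendSettings:
    """Hypothetical container for resolved settings (illustration only)."""
    backend: str
    db_path: Optional[Path] = None   # set for SQLite only
    dsn: Optional[str] = None        # set for PostgreSQL only


def resolve_backend_settings(env: Optional[Mapping[str, str]] = None) -> BackendSettings:
    """Resolve settings from environment variables, using the README defaults."""
    env = os.environ if env is None else env
    backend = env.get('STORAGE_BACKEND', 'sqlite').strip().lower()

    if backend == 'sqlite':
        raw_path = env.get('DB_PATH', '~/.mcp/context_storage.db')
        return BackendSettings(backend, db_path=Path(raw_path).expanduser())

    if backend == 'postgresql':
        host = env.get('POSTGRESQL_HOST', 'localhost')
        port = int(env.get('POSTGRESQL_PORT', '5432'))
        # Percent-encode credentials so special characters survive in the URL.
        user = quote(env.get('POSTGRESQL_USER', 'postgres'), safe='')
        password = quote(env.get('POSTGRESQL_PASSWORD', 'postgres'), safe='')
        database = env.get('POSTGRESQL_DATABASE', 'mcp_context')
        dsn = f'postgresql://{user}:{password}@{host}:{port}/{database}'
        return BackendSettings(backend, dsn=dsn)

    raise ValueError(f'Unsupported STORAGE_BACKEND: {backend!r}')


if __name__ == '__main__':
    # An explicit mapping keeps the demo independent of the caller's environment.
    print(resolve_backend_settings({'STORAGE_BACKEND': 'postgresql'}).dsn)
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the function easy to test; the default credentials shown are the documented defaults and are unsuitable for anything beyond local development.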

app/backends/__init__.py

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+"""
+Storage backend implementations for mcp-context-server.
+
+This package provides a protocol-based abstraction for different database backends,
+enabling support for SQLite, PostgreSQL, Supabase, and other storage systems.
+
+Key Components:
+- StorageBackend: Protocol defining the interface for all storage backends
+- SQLiteBackend: Implementation for SQLite database
+- create_backend: Factory function for creating backend instances
+
+Example Usage:
+    from app.backends import create_backend
+
+    # Create SQLite backend
+    backend = create_backend(backend_type='sqlite', db_path='/path/to/db.sqlite')
+    await backend.initialize()
+
+    # Use with repositories
+    repositories = RepositoryContainer(backend)
+"""
+
+from app.backends.base import StorageBackend
+from app.backends.factory import create_backend
+from app.backends.sqlite_backend import SQLiteBackend
+
+__all__ = ['StorageBackend', 'SQLiteBackend', 'create_backend']
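
Reviewer sketch (not part of the commit): the docstring above describes a protocol-based abstraction with a factory function. The contents of `app/backends/base.py` and `app/backends/factory.py` are not shown on this page, so the following is only a minimal illustration of that pattern. The method names `initialize` and `shutdown`, the `_BACKENDS` registry, and the body of `SQLiteBackend` are assumptions; only the `create_backend(backend_type=..., db_path=...)` call shape is taken from the docstring.

```python
import asyncio
import sqlite3
from typing import Any, Optional, Protocol, runtime_checkable


@runtime_checkable
class StorageBackend(Protocol):
    """Assumed minimal interface; the real protocol may define more methods."""

    async def initialize(self) -> None: ...

    async def shutdown(self) -> None: ...


class SQLiteBackend:
    """Toy backend that satisfies the protocol structurally (no inheritance)."""

    def __init__(self, db_path: str) -> None:
        self.db_path = db_path
        self._conn: Optional[sqlite3.Connection] = None

    async def initialize(self) -> None:
        # A real backend would avoid blocking the event loop here.
        self._conn = sqlite3.connect(self.db_path)

    async def shutdown(self) -> None:
        if self._conn is not None:
            self._conn.close()
            self._conn = None


# Hypothetical registry mapping backend names to implementations.
_BACKENDS = {'sqlite': SQLiteBackend}


def create_backend(backend_type: str = 'sqlite', **kwargs: Any) -> StorageBackend:
    """Instantiate a backend by name, forwarding backend-specific options."""
    try:
        backend_cls = _BACKENDS[backend_type]
    except KeyError:
        raise ValueError(f'Unsupported backend type: {backend_type!r}') from None
    return backend_cls(**kwargs)


async def main() -> None:
    backend = create_backend(backend_type='sqlite', db_path=':memory:')
    await backend.initialize()
    await backend.shutdown()


if __name__ == '__main__':
    asyncio.run(main())
```

Because `typing.Protocol` relies on structural typing, a PostgreSQL backend could be added to the registry without inheriting from `StorageBackend`, as long as it provides the same methods.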
