🔖 Feature description
Add Valkey as a pluggable vector store backend for DocsGPT, using the valkey-glide client and the valkey-search module for HNSW-based vector similarity search.
🎤 Why is this feature needed ?
DocsGPT currently supports FAISS, PostgreSQL, Elasticsearch, Qdrant, and Milvus as vector store backends. Valkey fills a gap for users who:
- Already run Valkey (or Redis-compatible infrastructure) for caching, sessions, or Celery brokering and want to consolidate their stack — one service for both cache and vector search
- Need sub-millisecond vector search without deploying a dedicated vector database
- Want an open-source, BSL-free alternative to Redis with full vector search capabilities
- Want native integration without an additional service
Example use case: A team running DocsGPT with Celery already has Valkey as the broker. Adding vector search to the same instance eliminates an entire service from their deployment (no separate Qdrant/Milvus container needed).
✌️ How do you aim to achieve this?
- Add a ValkeyStore class in application/vectorstore/ implementing the existing VectorStoreBase interface
- Use the synchronous valkey-glide (GLIDE) client for connections; GLIDE is the officially recommended Valkey client and also provides an async API, cluster mode, and IAM auth
- Use valkey-search module commands (FT.CREATE, FT.SEARCH) for HNSW vector indexing and KNN retrieval
- Source isolation via TAG field filtering on source_id — multiple document sources share one index without mixing results
- Register in the vector store factory so VECTOR_STORE=valkey activates it
- Configuration via environment variables: VALKEY_HOST, VALKEY_PORT, VALKEY_PASSWORD, VALKEY_USE_TLS, VALKEY_INDEX_NAME, VALKEY_PREFIX
- Unit tests (mocked) and integration tests (live Valkey container)
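A minimal sketch of how the configuration and the two search commands listed above might fit together. It only builds the argument lists for FT.CREATE and FT.SEARCH and does not call any client API. The `ValkeySettings` class, the helper names, the default values, the `embedding` field name, the COSINE metric, and the backslash escaping of TAG values are all assumptions made for illustration, not part of the proposal or of an existing DocsGPT interface.

```python
import os
import re
import struct
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ValkeySettings:
    """Hypothetical settings holder; the defaults are illustrative assumptions."""
    host: str = "localhost"
    port: int = 6379
    password: Optional[str] = None
    use_tls: bool = False
    index_name: str = "docsgpt"
    prefix: str = "doc:"

    @classmethod
    def from_env(cls, env=os.environ) -> "ValkeySettings":
        # Reads the VALKEY_* variables named in the proposal.
        return cls(
            host=env.get("VALKEY_HOST", cls.host),
            port=int(env.get("VALKEY_PORT", cls.port)),
            password=env.get("VALKEY_PASSWORD") or None,
            use_tls=env.get("VALKEY_USE_TLS", "false").lower() in {"1", "true", "yes"},
            index_name=env.get("VALKEY_INDEX_NAME", cls.index_name),
            prefix=env.get("VALKEY_PREFIX", cls.prefix),
        )


def escape_tag(value: str) -> str:
    # Assumption: punctuation in TAG values is backslash-escaped in queries.
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", value)


def build_create_index_args(settings: ValkeySettings, dim: int) -> list:
    # One HNSW vector field plus a TAG field used for source isolation.
    return [
        "FT.CREATE", settings.index_name,
        "ON", "HASH",
        "PREFIX", "1", settings.prefix,
        "SCHEMA",
        "source_id", "TAG",
        "embedding", "VECTOR", "HNSW", "6",
        "TYPE", "FLOAT32",
        "DIM", str(dim),
        "DISTANCE_METRIC", "COSINE",  # metric choice is an assumption
    ]


def build_knn_search_args(settings: ValkeySettings, source_id: str,
                          query_vector: Sequence[float], k: int) -> list:
    # The query vector travels as a little-endian float32 blob in PARAMS.
    blob = struct.pack(f"<{len(query_vector)}f", *query_vector)
    query = f"@source_id:{{{escape_tag(source_id)}}}=>[KNN {k} @embedding $vec]"
    return ["FT.SEARCH", settings.index_name, query,
            "PARAMS", "2", "vec", blob, "DIALECT", "2"]
```

Keeping command construction in pure functions like these would let the mocked unit tests assert on the exact arguments without a running Valkey instance, while the integration tests exercise the same lists against a live container.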
🔄️ Additional Information
Alternatives considered:
- redis-py with Valkey: Valkey is wire-compatible with Redis, so redis-py works. However, valkey-glide is the purpose-built client maintained by the Valkey project — it supports cluster mode, async, connection pooling, and IAM auth out of the box without relying on Redis client compatibility that may diverge over time.
- LangChain's ValkeyVectorStore: DocsGPT has its own vector store abstraction (VectorStoreBase) rather than using LangChain's. Implementing directly against the DocsGPT interface keeps the dependency tree minimal and follows the existing pattern (FAISS, Qdrant, etc. all implement VectorStoreBase directly).
- Valkey JSON + vector: Rejected in favor of HASH-based storage (like the existing Elasticsearch backend) rather than JSON documents. HASHes are simpler, use less memory, and valkey-search indexes them natively.
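To illustrate the HASH-based layout, here is a rough sketch of how one document chunk could be turned into a key and a field mapping, with the embedding packed as raw float32 bytes. The helper names, the `text` and `embedding` field names, and the key scheme (prefix plus a random hex id) are hypothetical choices for this example; only `source_id` and the HASH storage come from the proposal.

```python
import struct
import uuid
from typing import Dict, List, Optional, Sequence, Tuple, Union


def encode_vector(vec: Sequence[float]) -> bytes:
    # Little-endian float32: 4 bytes per dimension.
    return struct.pack(f"<{len(vec)}f", *vec)


def decode_vector(blob: bytes) -> List[float]:
    # Inverse of encode_vector; the blob length must be a multiple of 4.
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))


def build_hash_entry(
    prefix: str,
    source_id: str,
    text: str,
    embedding: Sequence[float],
    doc_id: Optional[str] = None,
) -> Tuple[str, Dict[str, Union[str, bytes]]]:
    # The key shares the index prefix so the search module picks the HASH up.
    doc_id = doc_id or uuid.uuid4().hex
    key = f"{prefix}{doc_id}"
    fields: Dict[str, Union[str, bytes]] = {
        "source_id": source_id,                 # TAG field used for filtering
        "text": text,                           # chunk content returned to the caller
        "embedding": encode_vector(embedding),  # vector field indexed by HNSW
    }
    return key, fields
```

The returned mapping is in the shape a HSET call would take, so a 768-dimensional embedding costs 3072 bytes per chunk in the `embedding` field regardless of the text length.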
Requirements:
- Valkey 8.0+ with the search module loaded (available in valkey/valkey-bundle Docker image)
- valkey-glide Python package (added to requirements.txt)
👀 Have you spent some time to check if this feature request has been raised before?
Are you willing to submit PR?
Yes I am willing to submit a PR!