feat(memory): pluggable storage backends#998
Conversation
jaberjaber23
left a comment
There was a problem hiding this comment.
Security Audit — APPROVED (with rebase needed)
Verdict: APPROVE — clean, well-architected refactor. Cannot merge due to conflicts.
Security findings:
-
No data exfiltration — The HTTP semantic backend only routes to a user-configured
memory_api_urlfrom config, with automatic fallback to local storage on failure. No hardcoded external endpoints. -
No SQL injection — PostgreSQL backend uses parameterized queries (
$1,$2etc.) throughout. Noformat!()with SQL + user data found. -
Unsafe blocks are legitimate — All 5
unsafeblocks are forsqlite_vec::sqlite3_vec_initFFI registration (SQLite vector extension). This is an existing pattern that was moved, not introduced. The transmute is required for the SQLite auto-extension API. -
No secrets in diff — Docker Compose test credentials (
POSTGRES_PASSWORD: openfang) are for local integration testing only, under thetest/dbprofile. Acceptable. -
No new unsafe code paths — The trait-based architecture is actually more secure than before:
usage_conn()which leaked rawrusqlite::Connectionto external crates has been removed. All external callers now go throughArc<dyn UsageBackend>trait objects. -
Line count justified — 7171 additions across 22 Rust files + Cargo.lock changes. The breakdown: ~1300 lines Cargo.lock dependency updates, ~2000 lines PostgreSQL backend (11 files mirroring SQLite), ~700 lines Qdrant backend, ~300 lines HTTP backend rename/extension, ~800 lines backend traits + helpers, ~500 lines migration code, ~500 lines tests, ~1000 lines SQLite file reorganization (moved, not new).
-
Qdrant backend — Uses official
qdrant-clientcrate via gRPC. Auto-creates collections with cosine similarity. No authentication bypass.
The PR has merge conflicts (DIRTY state). Author needs to rebase before merge.
|
This PR has merge conflicts. Please rebase onto the latest main branch and resolve conflicts so we can merge. |
…t support Redesign the openfang-memory crate for pluggable storage backends. The main backend (sqlite or postgres) and semantic backend (sqlite, postgres, qdrant, http) are independently configurable. Architecture: - substrate.rs is 100% backend-agnostic (zero rusqlite imports) - 9 backend traits: Structured, Semantic, Knowledge, Session, Usage, PairedDevices, TaskQueue, Consolidation, Audit - SessionBackend has 5 default trait impls (create_session, canonical_context, append_canonical, store_llm_summary, etc.) - Shared helpers.rs for serialization/parsing across backends - JSONL session mirror extracted to standalone filesystem utility Backends: - sqlite/ — 11 files, full implementation with sqlite-vec vectors - postgres/ — 11 files, full implementation with pgvector - qdrant/ — semantic-only, gRPC vector similarity search - http/ — semantic-only, remote memory-api gateway with fallback External callers migrated: - kernel uses memory.usage_arc() and memory.audit() (was usage_conn()) - api routes use memory.usage() trait method - runtime AuditLog uses AuditBackend trait (was raw rusqlite Connection) - MeteringEngine accepts Arc<dyn UsageBackend> (was Arc<UsageStore>) Config: [memory] backend = "sqlite" # or "postgres" semantic_backend = "qdrant" # independently: sqlite/postgres/qdrant/http Docker: pgvector/pg18 + qdrant services for integration testing. 65 tests (40 unit + 25 integration) across all backends.
f2ec259 to
d42e5f7
Compare
|
resolved |
Summary
Redesign the
openfang-memorycrate with pluggable storage backends. The main storage backend (SQLite or PostgreSQL) and the semantic/vector backend (SQLite, PostgreSQL, Qdrant, HTTP) are now independently configurable, allowing mix-and-match deployments like PostgreSQL for structured data with Qdrant for vector search.Architecture
The orchestration layer (
substrate.rs) is now 100% backend-agnostic — zero database-specific imports. All storage is abstracted through 9 backend traits with implementations per database:Each backend lives in its own folder (
sqlite/,postgres/,qdrant/,http/) with identical file structure (11 files each for SQLite and PostgreSQL). Shared serialization and parsing logic is extracted intohelpers.rsto eliminate cross-backend duplication.SessionBackenduses Rust default trait implementations for 5 methods (create_session,create_session_with_label,append_canonical,canonical_context,store_llm_summary) — new backends only need to implement the storage primitives.Changes
PairedDevicesBackend,TaskQueueBackend,ConsolidationBackend,AuditBackend(added to existingStructuredBackend,SemanticBackend,KnowledgeBackend,SessionBackend,UsageBackend)sqlite/, all database-specific code isolated in its backend folderArc<dyn UsageBackend>,AuditBackend) — no more leakedrusqlite::Connectiontypessemantic_backendallows independent vector backend selectionpgvector/pgvector:pg18andqdrant/qdrant:latestservices for integration testingTesting
cargo clippy --workspace --all-targets -- -D warnings passes
cargo test --workspace passes (65 tests: 40 unit + 25 integration)
Integration tested against live PostgreSQL (pgvector/pg18) and Qdrant
All SQLite, PostgreSQL, and Qdrant backends verified with identical test suite
Security
No new unsafe code (existing sqlite-vec FFI registration unchanged)
No secrets or API keys in diff
User input validated at boundaries
usage_conn() removed — no more raw database connection leaks to external crates
Configuration