
feat(memory): pluggable storage backends #998

Open
erma07 wants to merge 1 commit into RightNow-AI:main from erma07:feat/pluggable-memory-backends

@erma07 erma07 commented Apr 6, 2026

Summary

Redesign the openfang-memory crate with pluggable storage backends. The main storage backend (SQLite or PostgreSQL) and the semantic/vector backend (SQLite, PostgreSQL, Qdrant, HTTP) are now independently configurable, allowing mix-and-match deployments like PostgreSQL for structured data with Qdrant for vector search.

Architecture

The orchestration layer (substrate.rs) is now 100% backend-agnostic — zero database-specific imports. All storage is abstracted through 9 backend traits with implementations per database:

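| Trait         | SQLite | PostgreSQL | Qdrant | HTTP |
|---------------|--------|------------|--------|------|
| Structured    | x      | x          |        |      |
| Semantic      | x      | x          | x      | x    |
| Knowledge     | x      | x          |        |      |
| Session       | x      | x          |        |      |
| Usage         | x      | x          |        |      |
| PairedDevices | x      | x          |        |      |
| TaskQueue     | x      | x          |        |      |
| Consolidation | x      | x          |        |      |
| Audit         | x      | x          |        |      |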

Each backend lives in its own folder (sqlite/, postgres/, qdrant/, http/) with identical file structure (11 files each for SQLite and PostgreSQL). Shared serialization and parsing logic is extracted into helpers.rs to eliminate cross-backend duplication.
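As a rough illustration of the mix-and-match idea, the sketch below composes an orchestration struct from two trait objects backed by different stores. The trait methods, the `FakePostgres` and `FakeQdrant` in-memory stand-ins, and `build_mixed` are all hypothetical and heavily simplified; only the trait names `StructuredBackend` and `SemanticBackend` come from this PR.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Hypothetical, simplified versions of two of the nine backend traits.
trait StructuredBackend: Send + Sync {
    fn set(&self, key: &str, value: &str);
    fn get(&self, key: &str) -> Option<String>;
    fn name(&self) -> &'static str;
}

trait SemanticBackend: Send + Sync {
    fn remember(&self, text: &str, embedding: Vec<f32>);
    fn count(&self) -> usize;
    fn name(&self) -> &'static str;
}

// In-memory stand-in for a main storage backend (assumption, not the real postgres/ code).
#[derive(Default)]
struct FakePostgres {
    rows: Mutex<HashMap<String, String>>,
}

impl StructuredBackend for FakePostgres {
    fn set(&self, key: &str, value: &str) {
        self.rows.lock().unwrap().insert(key.to_string(), value.to_string());
    }
    fn get(&self, key: &str) -> Option<String> {
        self.rows.lock().unwrap().get(key).cloned()
    }
    fn name(&self) -> &'static str {
        "postgres"
    }
}

// In-memory stand-in for a vector backend (assumption, not the real qdrant/ code).
#[derive(Default)]
struct FakeQdrant {
    points: Mutex<Vec<(String, Vec<f32>)>>,
}

impl SemanticBackend for FakeQdrant {
    fn remember(&self, text: &str, embedding: Vec<f32>) {
        self.points.lock().unwrap().push((text.to_string(), embedding));
    }
    fn count(&self) -> usize {
        self.points.lock().unwrap().len()
    }
    fn name(&self) -> &'static str {
        "qdrant"
    }
}

// The orchestration layer holds only trait objects, never a concrete database type.
struct Substrate {
    structured: Arc<dyn StructuredBackend>,
    semantic: Arc<dyn SemanticBackend>,
}

impl Substrate {
    fn describe(&self) -> String {
        format!("main={} semantic={}", self.structured.name(), self.semantic.name())
    }
}

// Each slot is filled independently, so any main/semantic pairing is possible.
fn build_mixed() -> Substrate {
    Substrate {
        structured: Arc::new(FakePostgres::default()),
        semantic: Arc::new(FakeQdrant::default()),
    }
}
```

Because `Substrate` names no concrete backend type, swapping either store only changes `build_mixed`.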

SessionBackend uses Rust default trait implementations for 5 methods (create_session, create_session_with_label, append_canonical, canonical_context, store_llm_summary) — new backends only need to implement the storage primitives.
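The default-method pattern can be sketched as follows. This is a minimal illustration, not the crate's actual trait: the primitives `next_id`, `save_session`, and `load_session`, the `Session` struct, and all signatures are assumptions (the real methods are likely richer and may be async). Only the names `create_session` and `create_session_with_label` are taken from the description above.

```rust
use std::collections::HashMap;

// Hypothetical simplified session record.
#[derive(Clone, Debug, PartialEq)]
struct Session {
    id: u64,
    label: Option<String>,
}

trait SessionBackend {
    // Storage primitives: the only methods a new backend has to provide.
    fn next_id(&mut self) -> u64;
    fn save_session(&mut self, session: Session);
    fn load_session(&self, id: u64) -> Option<Session>;

    // Default implementation built purely on the primitives above.
    fn create_session_with_label(&mut self, label: Option<&str>) -> Session {
        let session = Session {
            id: self.next_id(),
            label: label.map(str::to_string),
        };
        self.save_session(session.clone());
        session
    }

    // Default implementation that delegates to another default method.
    fn create_session(&mut self) -> Session {
        self.create_session_with_label(None)
    }
}

// A new backend implements three primitives and inherits the rest.
#[derive(Default)]
struct InMemorySessions {
    last_id: u64,
    sessions: HashMap<u64, Session>,
}

impl SessionBackend for InMemorySessions {
    fn next_id(&mut self) -> u64 {
        self.last_id += 1;
        self.last_id
    }
    fn save_session(&mut self, session: Session) {
        self.sessions.insert(session.id, session);
    }
    fn load_session(&self, id: u64) -> Option<Session> {
        self.sessions.get(&id).cloned()
    }
}
```

A backend can still override a default method when its database offers a faster path.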

Changes

  • New backend traits: PairedDevicesBackend, TaskQueueBackend, ConsolidationBackend, AuditBackend (added to existing StructuredBackend, SemanticBackend, KnowledgeBackend, SessionBackend, UsageBackend)
  • PostgreSQL: full parity with SQLite. All 9 backend traits are implemented, with pgvector for vectors and versioned migrations matching SQLite v1–v9
  • Qdrant: semantic-only backend via gRPC with auto-collection creation and cosine similarity search
  • HTTP: semantic-only backend routing to a remote memory-api gateway with automatic fallback to local storage
  • Folder restructure: SQLite code moved from top-level into sqlite/, all database-specific code isolated in its backend folder
  • External callers migrated: kernel, API routes, and runtime now use trait-based APIs (Arc<dyn UsageBackend>, AuditBackend) — no more leaked rusqlite::Connection types
  • New config field: semantic_backend allows independent vector backend selection
  • Docker Compose: added pgvector/pgvector:pg18 and qdrant/qdrant:latest services for integration testing
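To illustrate the caller migration listed above, here is a hedged sketch of a consumer that depends on `Arc<dyn UsageBackend>` rather than a raw database connection. The trait methods `record` and `total_tokens`, the `InMemoryUsage` type, and the budget logic are invented for illustration; the PR only names `UsageBackend` and `MeteringEngine`.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical minimal trait; the real UsageBackend methods are not shown in this PR text.
trait UsageBackend: Send + Sync {
    fn record(&self, agent: &str, tokens: u64);
    fn total_tokens(&self, agent: &str) -> u64;
}

// In-memory implementation standing in for a SQLite or PostgreSQL backend.
#[derive(Default)]
struct InMemoryUsage {
    events: Mutex<Vec<(String, u64)>>,
}

impl UsageBackend for InMemoryUsage {
    fn record(&self, agent: &str, tokens: u64) {
        self.events.lock().unwrap().push((agent.to_string(), tokens));
    }
    fn total_tokens(&self, agent: &str) -> u64 {
        self.events
            .lock()
            .unwrap()
            .iter()
            .filter(|(a, _)| a.as_str() == agent)
            .map(|(_, t)| *t)
            .sum()
    }
}

// The engine sees only the trait object, so no database type leaks into this crate.
struct MeteringEngine {
    usage: Arc<dyn UsageBackend>,
    budget: u64,
}

impl MeteringEngine {
    fn new(usage: Arc<dyn UsageBackend>, budget: u64) -> Self {
        Self { usage, budget }
    }

    // Records the charge only if it keeps the agent within budget.
    fn charge(&self, agent: &str, tokens: u64) -> Result<(), String> {
        let used = self.usage.total_tokens(agent);
        if used + tokens > self.budget {
            return Err(format!("{agent} would exceed the budget of {}", self.budget));
        }
        self.usage.record(agent, tokens);
        Ok(())
    }
}
```

The same pattern makes the engine testable without a database, since any `UsageBackend` implementation can be injected.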

Testing

  • cargo clippy --workspace --all-targets -- -D warnings passes
  • cargo test --workspace passes (65 tests: 40 unit + 25 integration)
  • Integration tested against live PostgreSQL (pgvector/pg18) and Qdrant
  • All SQLite, PostgreSQL, and Qdrant backends verified with an identical test suite

Security

  • No new unsafe code (existing sqlite-vec FFI registration unchanged)
  • No secrets or API keys in diff
  • User input validated at boundaries
  • usage_conn() removed: no more raw database connection leaks to external crates

Configuration

[memory]
backend = "sqlite"              # main storage: "sqlite" or "postgres"
semantic_backend = "qdrant"     # vector search: "sqlite", "postgres", "qdrant", or "http"

postgres_url = "postgresql://user:pass@localhost/openfang"
qdrant_url = "http://localhost:6334"
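One way the two independent settings could be modeled in code is sketched below. The enum names, the `MemoryConfig` struct, and the `validate` rule (a URL is required only when the matching backend is selected) are assumptions for illustration and are not taken from the crate.

```rust
use std::str::FromStr;

// Hypothetical enums mirroring the accepted string values in the config above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MainBackend {
    Sqlite,
    Postgres,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum SemanticBackend {
    Sqlite,
    Postgres,
    Qdrant,
    Http,
}

impl FromStr for MainBackend {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "sqlite" => Ok(Self::Sqlite),
            "postgres" => Ok(Self::Postgres),
            other => Err(format!("unknown backend: {other}")),
        }
    }
}

impl FromStr for SemanticBackend {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "sqlite" => Ok(Self::Sqlite),
            "postgres" => Ok(Self::Postgres),
            "qdrant" => Ok(Self::Qdrant),
            "http" => Ok(Self::Http),
            other => Err(format!("unknown semantic_backend: {other}")),
        }
    }
}

// Hypothetical config struct; field names follow the TOML keys shown above.
struct MemoryConfig {
    backend: MainBackend,
    semantic_backend: SemanticBackend,
    postgres_url: Option<String>,
    qdrant_url: Option<String>,
}

impl MemoryConfig {
    // Assumed rule: a connection URL must be set for every backend that is selected.
    fn validate(&self) -> Result<(), String> {
        let needs_postgres = self.backend == MainBackend::Postgres
            || self.semantic_backend == SemanticBackend::Postgres;
        if needs_postgres && self.postgres_url.is_none() {
            return Err("postgres_url is required".to_string());
        }
        if self.semantic_backend == SemanticBackend::Qdrant && self.qdrant_url.is_none() {
            return Err("qdrant_url is required".to_string());
        }
        Ok(())
    }
}
```

Under these assumptions, the configuration shown above (SQLite main storage with Qdrant vectors) validates without a `postgres_url`.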


Member

@jaberjaber23 jaberjaber23 left a comment


Security Audit — APPROVED (with rebase needed)

Verdict: APPROVE — clean, well-architected refactor. Cannot merge due to conflicts.

Security findings:

  1. No data exfiltration — The HTTP semantic backend only routes to a user-configured memory_api_url from config, with automatic fallback to local storage on failure. No hardcoded external endpoints.

  2. No SQL injection — PostgreSQL backend uses parameterized queries ($1, $2 etc.) throughout. No format!() with SQL + user data found.

  3. Unsafe blocks are legitimate — All 5 unsafe blocks are for sqlite_vec::sqlite3_vec_init FFI registration (SQLite vector extension). This is an existing pattern that was moved, not introduced. The transmute is required for the SQLite auto-extension API.

  4. No secrets in diff — Docker Compose test credentials (POSTGRES_PASSWORD: openfang) are for local integration testing only, under the test/db profile. Acceptable.

  5. No new unsafe code paths — The trait-based architecture is actually more secure than before: usage_conn() which leaked raw rusqlite::Connection to external crates has been removed. All external callers now go through Arc<dyn UsageBackend> trait objects.

  6. Line count justified — 7171 additions across 22 Rust files + Cargo.lock changes. The breakdown: ~1300 lines Cargo.lock dependency updates, ~2000 lines PostgreSQL backend (11 files mirroring SQLite), ~700 lines Qdrant backend, ~300 lines HTTP backend rename/extension, ~800 lines backend traits + helpers, ~500 lines migration code, ~500 lines tests, ~1000 lines SQLite file reorganization (moved, not new).

  7. Qdrant backend — Uses official qdrant-client crate via gRPC. Auto-creates collections with cosine similarity. No authentication bypass.
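For readers unfamiliar with the fallback behaviour mentioned in finding 1, a minimal sketch of the pattern follows. It is not the PR's code: the `search` signature, `RemoteGateway`, `LocalStore`, and `WithFallback` are hypothetical, and the "remote" here is simulated with a flag instead of a real HTTP call.

```rust
// Hypothetical, simplified semantic search trait.
trait SemanticBackend {
    fn search(&self, query: &str, limit: usize) -> Result<Vec<String>, String>;
}

// Stand-in for the remote memory-api gateway; `reachable` simulates network state.
struct RemoteGateway {
    reachable: bool,
}

impl SemanticBackend for RemoteGateway {
    fn search(&self, query: &str, limit: usize) -> Result<Vec<String>, String> {
        if self.reachable {
            Ok(vec![format!("remote:{query}")].into_iter().take(limit).collect())
        } else {
            Err("connection refused".to_string())
        }
    }
}

// Stand-in for local storage, using substring match instead of vector search.
struct LocalStore {
    docs: Vec<String>,
}

impl SemanticBackend for LocalStore {
    fn search(&self, query: &str, limit: usize) -> Result<Vec<String>, String> {
        Ok(self
            .docs
            .iter()
            .filter(|d| d.contains(query))
            .take(limit)
            .cloned()
            .collect())
    }
}

// Tries the remote backend first and falls back to local storage on any error.
struct WithFallback<R, L> {
    remote: R,
    local: L,
}

impl<R: SemanticBackend, L: SemanticBackend> SemanticBackend for WithFallback<R, L> {
    fn search(&self, query: &str, limit: usize) -> Result<Vec<String>, String> {
        self.remote
            .search(query, limit)
            .or_else(|_| self.local.search(query, limit))
    }
}
```

In this sketch the only remote endpoint is whatever the wrapper was constructed with, and a remote failure degrades to local results instead of surfacing an error.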

The PR has merge conflicts (DIRTY state). Author needs to rebase before merge.

@jaberjaber23
Member

This PR has merge conflicts. Please rebase onto the latest main branch and resolve conflicts so we can merge.

…t support

Redesign the openfang-memory crate for pluggable storage backends.
The main backend (sqlite or postgres) and semantic backend (sqlite,
postgres, qdrant, http) are independently configurable.

Architecture:
- substrate.rs is 100% backend-agnostic (zero rusqlite imports)
- 9 backend traits: Structured, Semantic, Knowledge, Session, Usage,
  PairedDevices, TaskQueue, Consolidation, Audit
- SessionBackend has 5 default trait impls (create_session,
  canonical_context, append_canonical, store_llm_summary, etc.)
- Shared helpers.rs for serialization/parsing across backends
- JSONL session mirror extracted to standalone filesystem utility

Backends:
- sqlite/  — 11 files, full implementation with sqlite-vec vectors
- postgres/ — 11 files, full implementation with pgvector
- qdrant/  — semantic-only, gRPC vector similarity search
- http/    — semantic-only, remote memory-api gateway with fallback

External callers migrated:
- kernel uses memory.usage_arc() and memory.audit() (was usage_conn())
- api routes use memory.usage() trait method
- runtime AuditLog uses AuditBackend trait (was raw rusqlite Connection)
- MeteringEngine accepts Arc<dyn UsageBackend> (was Arc<UsageStore>)

Config:
  [memory]
  backend = "sqlite"           # or "postgres"
  semantic_backend = "qdrant"  # independently: sqlite/postgres/qdrant/http

Docker: pgvector/pg18 + qdrant services for integration testing.
65 tests (40 unit + 25 integration) across all backends.
@erma07 erma07 force-pushed the feat/pluggable-memory-backends branch from f2ec259 to d42e5f7 Compare April 11, 2026 05:29
@erma07
Author

erma07 commented Apr 12, 2026

resolved
