[Feat] RAG Ingestion 3/11: Vector Store Backends (Milvus + In-Memory) #1266

@yossiovadia

Summary

Implement the VectorStoreBackend interface for Milvus (production) and in-memory HNSW (dev/testing).

Part of #1262

Scope

Files to create:

  • pkg/vectorstore/milvus_backend.go - Milvus implementation
  • pkg/vectorstore/memory_backend.go - In-memory HNSW implementation
  • pkg/vectorstore/factory.go - Backend factory
  • pkg/vectorstore/milvus_backend_test.go
  • pkg/vectorstore/memory_backend_test.go

Key Design Decisions

Milvus backend:

  • Collection naming: vsr_vs_{vectorStoreID}
  • Schema: id, file_id, filename, content, embedding (FLOAT_VECTOR), chunk_index, created_at
  • HNSW index on embedding field (IP metric)
  • Reuses Milvus client patterns from pkg/cache/milvus_cache.go
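The naming rule, field list, and index choice above can be written down without touching the Milvus SDK. The following is a minimal sketch, not the actual implementation: `FieldSpec`, `IndexSpec`, `ChunkSchema`, and `EmbeddingIndex` are hypothetical names, and every data type other than FLOAT_VECTOR is an assumption, since the issue only names the type of the embedding field.

```go
package main

// collectionPrefix follows the naming rule vsr_vs_{vectorStoreID} from the issue.
const collectionPrefix = "vsr_vs_"

// CollectionName returns the Milvus collection name for a vector store ID.
func CollectionName(vectorStoreID string) string {
	return collectionPrefix + vectorStoreID
}

// FieldSpec is a hypothetical, SDK-independent description of one schema field.
type FieldSpec struct {
	Name     string
	DataType string // assumed labels; only FLOAT_VECTOR is stated in the issue
	Primary  bool
	Dim      int // only meaningful for the vector field
}

// ChunkSchema lists the seven fields named in the issue, in the same order.
// dim is the embedding dimension, which the issue does not specify.
func ChunkSchema(dim int) []FieldSpec {
	return []FieldSpec{
		{Name: "id", DataType: "VARCHAR", Primary: true},
		{Name: "file_id", DataType: "VARCHAR"},
		{Name: "filename", DataType: "VARCHAR"},
		{Name: "content", DataType: "VARCHAR"},
		{Name: "embedding", DataType: "FLOAT_VECTOR", Dim: dim},
		{Name: "chunk_index", DataType: "INT64"},
		{Name: "created_at", DataType: "INT64"},
	}
}

// IndexSpec is a hypothetical description of the index on the vector field.
type IndexSpec struct {
	Field     string
	IndexType string
	Metric    string
}

// EmbeddingIndex returns the HNSW index with the IP (inner product) metric
// described in the issue.
func EmbeddingIndex() IndexSpec {
	return IndexSpec{Field: "embedding", IndexType: "HNSW", Metric: "IP"}
}
```

A real backend would translate these specs into the Milvus client's schema and index types. If vector store IDs can contain characters such as hyphens, they may need sanitizing first, because Milvus restricts collection names to letters, digits, and underscores.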

In-memory backend:

  • Uses existing pkg/hnsw/ HNSW index
  • Map per vector store: { hnswIndex, docs map[string]EmbeddedChunk }
  • No persistence (dev/testing only)
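The per-store map layout above might look roughly like the sketch below. To stay self-contained it swaps the `pkg/hnsw/` index for an exhaustive inner-product scan, so it shows the data layout and threshold filtering rather than approximate HNSW search. The fields of `EmbeddedChunk`, the `SearchResult` type, and the method signatures are assumptions, not the interface from PR 1.

```go
package main

import (
	"sort"
	"sync"
)

// EmbeddedChunk is named in the issue; its fields here are assumed.
type EmbeddedChunk struct {
	ID        string
	FileID    string
	Content   string
	Embedding []float32
}

// SearchResult is a hypothetical result type pairing a chunk with its score.
type SearchResult struct {
	Chunk EmbeddedChunk
	Score float32
}

// memoryStore holds one vector store. The real backend would also keep an
// HNSW index here; this sketch scans docs directly instead.
type memoryStore struct {
	docs map[string]EmbeddedChunk
}

// MemoryBackend keeps one memoryStore per vector store ID, with no persistence.
type MemoryBackend struct {
	mu     sync.RWMutex
	stores map[string]*memoryStore
}

func NewMemoryBackend() *MemoryBackend {
	return &MemoryBackend{stores: make(map[string]*memoryStore)}
}

// Insert adds chunks to a store, creating the store on first use.
func (b *MemoryBackend) Insert(storeID string, chunks []EmbeddedChunk) {
	b.mu.Lock()
	defer b.mu.Unlock()
	store, ok := b.stores[storeID]
	if !ok {
		store = &memoryStore{docs: make(map[string]EmbeddedChunk)}
		b.stores[storeID] = store
	}
	for _, c := range chunks {
		store.docs[c.ID] = c
	}
}

// Search returns up to topK chunks whose inner product with query is at
// least threshold, ordered by descending score (ties broken by ID).
func (b *MemoryBackend) Search(storeID string, query []float32, topK int, threshold float32) []SearchResult {
	b.mu.RLock()
	defer b.mu.RUnlock()
	store, ok := b.stores[storeID]
	if !ok || topK <= 0 {
		return nil
	}
	var results []SearchResult
	for _, chunk := range store.docs {
		if len(chunk.Embedding) != len(query) {
			continue // skip chunks with a mismatched dimension
		}
		if score := dot(query, chunk.Embedding); score >= threshold {
			results = append(results, SearchResult{Chunk: chunk, Score: score})
		}
	}
	sort.Slice(results, func(i, j int) bool {
		if results[i].Score != results[j].Score {
			return results[i].Score > results[j].Score
		}
		return results[i].Chunk.ID < results[j].Chunk.ID
	})
	if len(results) > topK {
		results = results[:topK]
	}
	return results
}

// dot computes the inner product of two equal-length vectors.
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}
```

Using inner product here mirrors the IP metric chosen for the Milvus index, so a score threshold would mean the same thing in both backends, provided the embeddings are normalized the same way.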

Factory:

  • NewBackend(backendType, config) returns appropriate implementation
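One possible shape for the factory is a switch over the backend type, sketched below. The `VectorStoreBackend` interface is reduced to a single placeholder method because the real interface is defined in PR 1, and the `Config` fields, the type strings "milvus" and "memory", and the stub backend types are all assumptions made for illustration.

```go
package main

import "fmt"

// VectorStoreBackend is a placeholder; the real interface comes from PR 1.
type VectorStoreBackend interface {
	Name() string
}

// Config is a hypothetical configuration struct.
type Config struct {
	MilvusAddress string // assumed field, e.g. "localhost:19530"
}

// milvusBackend and memoryBackend are stubs standing in for the real types.
type milvusBackend struct{ address string }

func (m *milvusBackend) Name() string { return "milvus" }

type memoryBackend struct{}

func (m *memoryBackend) Name() string { return "memory" }

// NewBackend returns the implementation matching backendType, or an error
// for unknown types and incomplete configuration.
func NewBackend(backendType string, cfg Config) (VectorStoreBackend, error) {
	switch backendType {
	case "milvus":
		if cfg.MilvusAddress == "" {
			return nil, fmt.Errorf("milvus backend requires an address")
		}
		return &milvusBackend{address: cfg.MilvusAddress}, nil
	case "memory":
		return &memoryBackend{}, nil
	default:
		return nil, fmt.Errorf("unsupported vector store backend %q", backendType)
	}
}
```

Returning an error for unknown types, rather than falling back to the in-memory backend, keeps a misconfigured production deployment from silently running without persistence.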

Acceptance Criteria

  • Milvus backend creates/deletes collections with correct schema
  • Milvus backend inserts chunks and searches with similarity threshold
  • In-memory backend stores and searches using HNSW index
  • Factory creates correct backend based on config
  • Both backends implement full VectorStoreBackend interface
  • Tests for both backends (Milvus tests are skipped unless SKIP_MILVUS_TESTS=false is set)
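The skip rule for the Milvus tests could be gated with a small helper like the one sketched here. The helper names are hypothetical. The rule itself, skip unless SKIP_MILVUS_TESTS is exactly "false", is one reading of the criterion above. The environment lookup is passed in as a function so the rule can be checked without changing the process environment.

```go
package main

import "os"

// skipMilvusEnv is the environment variable named in the acceptance criteria.
const skipMilvusEnv = "SKIP_MILVUS_TESTS"

// shouldSkipMilvusTests reports whether Milvus tests should be skipped.
// They run only when the variable is set to exactly "false"; an unset or
// different value means skip.
func shouldSkipMilvusTests(getenv func(string) string) bool {
	return getenv(skipMilvusEnv) != "false"
}

// skipFromEnvironment applies the rule to the real process environment.
func skipFromEnvironment() bool {
	return shouldSkipMilvusTests(os.Getenv)
}
```

In a test file this would typically be used as `if skipFromEnvironment() { t.Skip("set SKIP_MILVUS_TESTS=false to run Milvus tests") }` at the top of each Milvus test.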

Dependencies

Depends on PR 1 (interface definition).

Deferred

  • Redis backend

Branch: feat/rag-03-backends → feat/rag-ingestion

Labels: athena (v0.2 Athena milestone tasks), priority/P1 (Important / Should-Have)

Status: Backlog
