Architecture

System overview

Context diagram: operators and external systems interact with the Auris platform, which orchestrates retrieval and generation while persisting vectors and sparse indexes.

flowchart LR
  developer(["Person\nDeveloper"])
  auris(["Software System\nAuris RAG engine"])
  documents(["Software System\nDocuments and files"])
  vectorStore[("Data store\nVector index")]
  sparseStore[("Data store\nSparse index")]
  llmVendor(["Software System\nLLM vendor"])
  observability(["Software System\nObservability backend"])

  developer -->|HTTPS JSON| auris
  documents -->|Ingestion| auris
  auris -->|Embeddings and queries| vectorStore
  auris -->|Lexical mirrors| sparseStore
  auris -->|Completions| llmVendor
  auris -->|Traces and metrics| observability

Pipeline architecture

Query path

flowchart LR
  http[HTTP request]
  http --> inG[Input guardrails]

  subgraph retrieval ["Retrieval"]
    qe[Query expansion]
    dense[Dense retrieval]
    sparse[Sparse retrieval]
    rrf[RRF fusion]
    rr[Reranker]
    pack[Context packing]
  end

  subgraph generation ["Generation"]
    pr[Prompt render]
    llmN[LLM completion]
  end

  subgraph postgen ["Post-generation"]
    outG[Output guardrails]
  end

  inG --> qe
  qe --> dense
  qe --> sparse
  dense --> rrf
  sparse --> rrf
  rrf --> rr
  rr --> pack
  pack --> pr
  pr --> llmN
  llmN --> outG
  outG --> resp[Response]

Indexing path

flowchart LR
  dsFetch[Data source fetch]
  chunk[Chunking]
  emb[Embedding]
  vec[Vector store write]
  meta[Index state write]

  dsFetch --> chunk
  chunk --> emb
  emb --> vec
  vec --> meta

Adapter boundaries

| Interface | Lives in | Default implementation | Swap cost |
| --- | --- | --- | --- |
| DataSourceProvider | packages/core/src/data-sources/DataSourceProvider.ts | LocalFilesDataSourceProvider | one module |
| DocumentLoader | packages/core/src/ports/document-loader.ts | CompositeDocumentLoader | one module |
| ChunkingStrategy | packages/core/src/ports/chunking-strategy.ts | Recursive and semantic sentence chunkers | one module |
| EmbeddingProvider | packages/core/src/ports/embedding-provider.ts | OpenAiEmbeddingProvider or StubEmbeddingProvider | one module |
| VectorStore | packages/core/src/ports/vector-store.ts | ChromaVectorStore | one module |
| SparseSearchProvider | packages/core/src/ports/sparse-search-provider.ts | Elasticsearch, Typesense, or noop | configuration |
| FusionStrategy | packages/core/src/ports/fusion-strategy.ts | Reciprocal rank fusion helper | one module |
| RerankerProvider | packages/core/src/ports/reranker-provider.ts | CohereRerankerProvider or NoopRerankerProvider | configuration |
| QueryExpander | packages/core/src/ports/query-expander.ts | HyDeMultiQueryExpander | one module |
| ContextPacker | packages/core/src/ports/context-packer.ts | TokenBudgetContextPacker | one module |
| LLMProvider | packages/core/src/ports/llm-provider.ts | OpenAiLlmProvider or StubLlmProvider | one module |
| GuardrailProvider | packages/core/src/ports/guardrail-provider.ts | CompositeGuardrailProvider | one module |
| PromptTemplateStore | packages/core/src/ports/prompt-template-store.ts | FilesystemPromptTemplateStore | one module |
| IndexStateStore | packages/core/src/ports/index-state-store.ts | FileIndexStateStore | one module |
| TracingProvider | packages/core/src/ports/tracing-provider.ts | ConsoleTracingProvider with optional vendor tags | configuration |
| EscalationProvider | packages/core/src/ports/escalation-provider.ts | WebhookEscalationProvider or LoggingEscalationProvider | configuration |
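
Several ports above list "configuration" as their swap cost. The sketch below illustrates what that could look like for a reranker: a small factory picks the adapter from a config value. The interface shape, the class names, the `RerankerKind` values, and `createReranker` are all hypothetical; the real RerankerProvider port and its wiring are not shown in this document.

```typescript
// Hypothetical port shape; the real RerankerProvider interface may differ
// (a vendor-backed reranker would almost certainly be asynchronous).
interface RerankerProvider {
  rerank(query: string, candidates: string[]): string[];
}

// Leaves the fused order untouched.
class NoopRerankerSketch implements RerankerProvider {
  rerank(_query: string, candidates: string[]): string[] {
    return candidates;
  }
}

// Toy stand-in for a vendor reranker: candidates containing the query text go first.
class KeywordRerankerSketch implements RerankerProvider {
  rerank(query: string, candidates: string[]): string[] {
    const needle = query.toLowerCase();
    const hits = candidates.filter((c) => c.toLowerCase().includes(needle));
    const misses = candidates.filter((c) => !c.toLowerCase().includes(needle));
    return [...hits, ...misses];
  }
}

type RerankerKind = "noop" | "keyword";

// Swapping the adapter is a one-value config change; callers only see the port.
function createReranker(kind: RerankerKind): RerankerProvider {
  switch (kind) {
    case "keyword":
      return new KeywordRerankerSketch();
    case "noop":
      return new NoopRerankerSketch();
  }
}

const candidates = ["bm25 notes", "RRF fusion", "chunking"];
const keywordOrder = createReranker("keyword").rerank("rrf", candidates);
const noopOrder = createReranker("noop").rerank("rrf", candidates);
```

With the sample input, the keyword sketch moves "RRF fusion" to the front while the noop adapter returns the list unchanged.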

ADR-1: Hexagonal architecture for every external service

Status: Accepted

Context: Retrieval stacks integrate several vendors that change on different release cadences.

Decision: Model every integration as a core port with infra-level adapter implementations.

Consequences: Teams can mock dependencies without rewriting orchestration, and each new vendor ships as a focused adapter instead of a pipeline fork.
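
To make the port and adapter split concrete, here is a minimal sketch assuming an embedding port with a single `embed` method. The method signature, `FixedVectorEmbeddingSketch`, and `embedQuery` are illustrative assumptions, not the project's actual EmbeddingProvider or StubEmbeddingProvider.

```typescript
// Hypothetical port shape; the real EmbeddingProvider interface may differ.
interface EmbeddingProvider {
  embed(texts: string[]): Promise<number[][]>;
}

// Deterministic test double (assumes dimensions >= 1).
// It stores the text length in the first component so different inputs differ.
class FixedVectorEmbeddingSketch implements EmbeddingProvider {
  private readonly dimensions: number;

  constructor(dimensions: number) {
    this.dimensions = dimensions;
  }

  async embed(texts: string[]): Promise<number[][]> {
    return texts.map((text) => {
      const vector: number[] = new Array<number>(this.dimensions).fill(0);
      vector[0] = text.length;
      return vector;
    });
  }
}

// Orchestration code depends only on the port, never on a concrete vendor.
async function embedQuery(provider: EmbeddingProvider, query: string): Promise<number[]> {
  const vectors = await provider.embed([query]);
  const first = vectors[0];
  if (first === undefined) {
    throw new Error("embedding provider returned no vector");
  }
  return first;
}

const queryVector = embedQuery(new FixedVectorEmbeddingSketch(3), "abcd");
```

Because `embedQuery` accepts any object that satisfies the port, a test can pass the double above while production wiring passes a vendor-backed adapter.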

ADR-2: Monorepo boundaries with TypeScript project references

Status: Accepted

Context: Shared DTOs and configuration must stay aligned between the API, workers, and evaluation tooling.

Decision: Split the codebase into shared, core, infra, api, workers, and evaluation packages with explicit references.

Consequences: Contract drift becomes a compile error, and build graphs stay predictable for CI.

ADR-3: Hybrid retrieval (dense + sparse + RRF) as the default strategy

Status: Accepted

Context: Pure vector search misses lexical cues while pure BM25 misses semantic paraphrases.

Decision: Run dense and sparse searches in parallel and fuse with reciprocal rank fusion by default.

Consequences: Answers improve on keyword-heavy and synonym-heavy queries, while operating two indexes increases infrastructure cost.
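
As an illustration of the fusion step, the sketch below implements the standard reciprocal rank fusion formula, score(d) = sum of 1 / (k + rank) over every list that contains d, with ranks starting at 1. The function name and the default k = 60 (the value commonly used in the literature) are assumptions; the project's own helper and its constant are not shown here.

```typescript
// Reciprocal rank fusion over lists of chunk ids, each list ordered best-first.
function reciprocalRankFusion(rankedLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

const denseRanking = ["c1", "c2", "c3"];
const sparseRanking = ["c3", "c1", "c4"];
const fused = reciprocalRankFusion([denseRanking, sparseRanking]);
// fused is ["c1", "c3", "c2", "c4"]
```

Chunks that appear in both lists (c1 and c3) outrank chunks found by only one retriever, which is the behavior the decision relies on.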

ADR-4: Domain-agnostic DataSourceProvider interface

Status: Accepted

Context: Ingestion requirements change per dataset while chunking and embedding stay stable.

Decision: Move source-specific knowledge into infra adapters configured through DATA_SOURCE (comma-separated file extensions for the local files driver).

Consequences: Core pipelines stay reusable across verticals, and every dataset gets an explicit adapter instead of hidden conditionals.
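
For the local files driver, a sketch of how a comma-separated extension list might be parsed and applied follows. It assumes the raw value is a plain string such as "md, .txt"; `parseExtensions` and `matchesExtension` are hypothetical helpers, and the real DATA_SOURCE format may carry more than this.

```typescript
// Normalizes a comma-separated list: trims, lowercases, adds a leading dot,
// drops empty entries and duplicates (first occurrence wins).
function parseExtensions(raw: string): string[] {
  const normalized = raw
    .split(",")
    .map((part) => part.trim().toLowerCase())
    .filter((part) => part.length > 0)
    .map((part) => (part.startsWith(".") ? part : `.${part}`));
  return Array.from(new Set(normalized));
}

// Case-insensitive suffix match against the allowed extensions.
function matchesExtension(filePath: string, allowed: string[]): boolean {
  const lower = filePath.toLowerCase();
  return allowed.some((ext) => lower.endsWith(ext));
}

const allowedExtensions = parseExtensions("md, .txt,MD,,");
const selectedFiles = ["docs/a.md", "notes/B.TXT", "img/logo.png"].filter((p) =>
  matchesExtension(p, allowedExtensions),
);
// allowedExtensions is [".md", ".txt"]; selectedFiles is ["docs/a.md", "notes/B.TXT"]
```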

ADR-5: OpenTelemetry-compatible tracing at pipeline boundaries

Status: Accepted

Context: Debugging RAG failures requires knowing which stage hurt recall or triggered guardrails.

Decision: Wrap each major stage with tracedStep and keep identifiers compatible with OpenTelemetry naming.

Consequences: Operators can correlate API requests with vector and sparse latencies, while verbose tracing can increase log volume.
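
The real `tracedStep` signature is not shown in this document. The sketch below, named `tracedStepSketch` to avoid confusion, shows one plausible shape under stated assumptions: the wrapper times an async stage, records a span through a caller-supplied sink, and rethrows failures. `SpanRecord`, `SpanSink`, and the dotted span name are illustrative.

```typescript
interface SpanRecord {
  traceId: string;
  name: string; // dotted stage name, e.g. "retrieval.dense"
  durationMs: number;
  status: "ok" | "error";
}

type SpanSink = (span: SpanRecord) => void;

// Runs one pipeline stage and always emits a span, whether it succeeds or throws.
async function tracedStepSketch<T>(
  traceId: string,
  name: string,
  sink: SpanSink,
  step: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  let status: SpanRecord["status"] = "ok";
  try {
    return await step();
  } catch (error) {
    status = "error";
    throw error; // rethrow so the pipeline still sees the failure
  } finally {
    sink({ traceId, name, durationMs: Date.now() - startedAt, status });
  }
}

const recordedSpans: SpanRecord[] = [];
const tracedValue = tracedStepSketch(
  "trace-1",
  "retrieval.dense",
  (span) => {
    recordedSpans.push(span);
  },
  async () => 42,
);
```

Passing the sink as a parameter keeps the wrapper independent of any tracing vendor, in line with the TracingProvider port.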

Data flow: worked example

A developer posts JSON to /v1/query, Express validates the payload, and executeRagQuery allocates a PipelineContext with a fresh traceId. GuardrailProvider scores the raw question, after which QueryExpander asks LLMProvider for a hypothetical answer plus several paraphrases. Each expanded string is embedded through EmbeddingProvider and sent to VectorStore while parallel calls hit SparseSearchProvider; FusionStrategy merges the ranked lists before RerankerProvider optionally reorders candidates.
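
The fan-out described above can be sketched as follows: every expanded query goes to both retrievers concurrently, and the resulting ranked lists are collected for fusion. The `Retriever` type and `retrieveRankedLists` are hypothetical, and embedding is assumed to happen inside the dense retriever for brevity.

```typescript
// A retriever returns chunk ids ordered best-first (illustrative type).
type Retriever = (query: string) => Promise<string[]>;

// Starts all dense and sparse searches before awaiting any of them,
// then returns one ranked list per (query, retriever) pair.
async function retrieveRankedLists(
  expandedQueries: string[],
  denseSearch: Retriever,
  sparseSearch: Retriever,
): Promise<string[][]> {
  const pending: Promise<string[]>[] = [];
  for (const query of expandedQueries) {
    pending.push(denseSearch(query), sparseSearch(query));
  }
  return Promise.all(pending);
}

// Stub retrievers that tag their output so the ordering is visible.
const denseStub: Retriever = async (q) => [`dense:${q}`];
const sparseStub: Retriever = async (q) => [`sparse:${q}`];
const rankedLists = retrieveRankedLists(["q1", "q2"], denseStub, sparseStub);
```

`Promise.all` preserves input order, so the output lists line up with the order in which the searches were started.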

ContextPacker trims the fused chunks to the configured token budget, FilesystemPromptTemplateStore loads the template referenced by promptTemplateId, and renderPrompt materializes the system message that LLMProvider completes. Another GuardrailProvider pass inspects the answer, EscalationProvider may enqueue a webhook when confidence is low, and the handler responds with text, citations, and pipeline metadata while the traceId ties spans together.
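
The context packing step can be illustrated with a greedy sketch. It assumes roughly four characters per token, which is only a heuristic; a real packer would likely use the model's tokenizer. `RankedChunk`, `estimateTokens`, and `packContext` are hypothetical names, and the actual TokenBudgetContextPacker may use a different policy.

```typescript
interface RankedChunk {
  id: string;
  text: string;
}

// Rough heuristic: about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Walks chunks in rank order and keeps each one that still fits the budget.
// Chunks that do not fit are skipped so smaller, lower-ranked ones can be used.
function packContext(chunks: RankedChunk[], tokenBudget: number): RankedChunk[] {
  const packed: RankedChunk[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const cost = estimateTokens(chunk.text);
    if (used + cost > tokenBudget) {
      continue;
    }
    packed.push(chunk);
    used += cost;
  }
  return packed;
}

const rankedChunks: RankedChunk[] = [
  { id: "a", text: "x".repeat(40) }, // about 10 tokens
  { id: "b", text: "x".repeat(100) }, // about 25 tokens
  { id: "c", text: "x".repeat(20) }, // about 5 tokens
];
const packedIds = packContext(rankedChunks, 16).map((chunk) => chunk.id);
// packedIds is ["a", "c"]
```

Skipping an oversized chunk instead of stopping trades strict rank order for better use of the budget; stopping at the first chunk that does not fit is an equally valid policy.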

The indexing worker resolves a DataSourceProvider, streams DataSourceDocument records, converts them into RawDocument values, and hands them to indexRawDocuments, which consults IndexStateStore, optionally refreshes sparse mirrors, and records new fingerprints so the next sync skips unchanged files.
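
The skip-unchanged behavior can be sketched with a fingerprint map. The hash below is FNV-1a (32-bit, over UTF-16 code units), chosen only because it needs no imports; the project's actual fingerprint function and the IndexStateStore API are not specified here, so `fingerprint`, `selectChanged`, and the in-memory `Map` are assumptions.

```typescript
interface SourceFile {
  path: string;
  content: string;
}

// FNV-1a 32-bit hash as a compact stand-in fingerprint.
function fingerprint(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Returns files whose fingerprint differs from the stored one and records
// the new fingerprint. A real worker would likely record it only after the
// vector store write succeeds.
function selectChanged(files: SourceFile[], state: Map<string, string>): SourceFile[] {
  const changed: SourceFile[] = [];
  for (const file of files) {
    const current = fingerprint(file.content);
    if (state.get(file.path) !== current) {
      changed.push(file);
      state.set(file.path, current);
    }
  }
  return changed;
}

const indexState = new Map<string, string>();
const firstSync = selectChanged(
  [
    { path: "a.md", content: "hello" },
    { path: "b.md", content: "world" },
  ],
  indexState,
);
const secondSync = selectChanged(
  [
    { path: "a.md", content: "hello" },
    { path: "b.md", content: "world!" },
  ],
  indexState,
);
```

On the first sync both files are new and get indexed; on the second, only `b.md` is returned because its content changed.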