Context diagram: operators and external systems interact with the Auris platform, which orchestrates retrieval and generation while persisting vectors and sparse indexes.
flowchart LR
developer(["Person\nDeveloper"])
auris(["Software System\nAuris RAG engine"])
documents(["Software System\nDocuments and files"])
vectorStore[("Data store\nVector index")]
sparseStore[("Data store\nSparse index")]
llmVendor(["Software System\nLLM vendor"])
observability(["Software System\nObservability backend"])
developer -->|HTTPS JSON| auris
documents -->|Ingestion| auris
auris -->|Embeddings and queries| vectorStore
auris -->|Lexical mirrors| sparseStore
auris -->|Completions| llmVendor
auris -->|Traces and metrics| observability
Query path
flowchart LR
http[HTTP request]
http --> inG[Input guardrails]
subgraph retrieval ["Retrieval"]
qe[Query expansion]
dense[Dense retrieval]
sparse[Sparse retrieval]
rrf[RRF fusion]
rr[Reranker]
pack[Context packing]
end
subgraph generation ["Generation"]
pr[Prompt render]
llmN[LLM completion]
end
subgraph postgen ["Post-generation"]
outG[Output guardrails]
end
inG --> qe
qe --> dense
qe --> sparse
dense --> rrf
sparse --> rrf
rrf --> rr
rr --> pack
pack --> pr
pr --> llmN
llmN --> outG
outG --> resp[Response]
Indexing path
flowchart LR
dsFetch[Data source fetch]
chunk[Chunking]
emb[Embedding]
vec[Vector store write]
meta[Index state write]
dsFetch --> chunk
chunk --> emb
emb --> vec
vec --> meta
| Interface | Lives in | Default implementation | Swap cost |
|---|---|---|---|
| DataSourceProvider | packages/core/src/data-sources/DataSourceProvider.ts | LocalFilesDataSourceProvider | one module |
| DocumentLoader | packages/core/src/ports/document-loader.ts | CompositeDocumentLoader | one module |
| ChunkingStrategy | packages/core/src/ports/chunking-strategy.ts | Recursive and semantic sentence chunkers | one module |
| EmbeddingProvider | packages/core/src/ports/embedding-provider.ts | OpenAiEmbeddingProvider or StubEmbeddingProvider | one module |
| VectorStore | packages/core/src/ports/vector-store.ts | ChromaVectorStore | one module |
| SparseSearchProvider | packages/core/src/ports/sparse-search-provider.ts | Elasticsearch, Typesense, or noop | configuration |
| FusionStrategy | packages/core/src/ports/fusion-strategy.ts | Reciprocal rank fusion helper | one module |
| RerankerProvider | packages/core/src/ports/reranker-provider.ts | CohereRerankerProvider or NoopRerankerProvider | configuration |
| QueryExpander | packages/core/src/ports/query-expander.ts | HyDeMultiQueryExpander | one module |
| ContextPacker | packages/core/src/ports/context-packer.ts | TokenBudgetContextPacker | one module |
| LLMProvider | packages/core/src/ports/llm-provider.ts | OpenAiLlmProvider or StubLlmProvider | one module |
| GuardrailProvider | packages/core/src/ports/guardrail-provider.ts | CompositeGuardrailProvider | one module |
| PromptTemplateStore | packages/core/src/ports/prompt-template-store.ts | FilesystemPromptTemplateStore | one module |
| IndexStateStore | packages/core/src/ports/index-state-store.ts | FileIndexStateStore | one module |
| TracingProvider | packages/core/src/ports/tracing-provider.ts | ConsoleTracingProvider with optional vendor tags | configuration |
| EscalationProvider | packages/core/src/ports/escalation-provider.ts | WebhookEscalationProvider or LoggingEscalationProvider | configuration |
Status: Accepted
Context: Retrieval stacks integrate several vendors that change on different release cadences.
Decision: Model every integration as a core port with infra-level adapter implementations.
Consequences: Teams can mock dependencies without rewriting orchestration, and each new vendor ships as a focused adapter instead of a pipeline fork.
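
The port-and-adapter split described in this decision can be sketched as follows. This is a minimal illustration, not the actual contents of packages/core/src/ports/chunking-strategy.ts: the `chunk` method signature, the `ParagraphChunker` and `FixedWindowChunker` classes, and the `countChunks` helper are all assumptions made for the example.

```typescript
// Hypothetical port shape; the real ChunkingStrategy interface may differ.
interface ChunkingStrategy {
  chunk(text: string): string[];
}

// Illustrative adapter: split on blank lines and drop empty parts.
class ParagraphChunker implements ChunkingStrategy {
  chunk(text: string): string[] {
    return text
      .split(/\n\s*\n/)
      .map((part) => part.trim())
      .filter((part) => part.length > 0);
  }
}

// Illustrative test double: fixed-size character windows.
class FixedWindowChunker implements ChunkingStrategy {
  readonly size: number;

  constructor(size: number) {
    if (size <= 0) throw new Error("size must be positive");
    this.size = size;
  }

  chunk(text: string): string[] {
    const chunks: string[] = [];
    for (let start = 0; start < text.length; start += this.size) {
      chunks.push(text.slice(start, start + this.size));
    }
    return chunks;
  }
}

// Orchestration depends only on the port, so either adapter can be injected.
function countChunks(strategy: ChunkingStrategy, documents: string[]): number {
  return documents.reduce((total, doc) => total + strategy.chunk(doc).length, 0);
}

const docs = ["first paragraph\n\nsecond paragraph", "abcdefghij"];
const byParagraph = countChunks(new ParagraphChunker(), docs);
const byWindow = countChunks(new FixedWindowChunker(5), docs);
```

Because `countChunks` never names a concrete class, swapping one chunker for the other touches only the call site that constructs the adapter.
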
Status: Accepted
Context: Shared DTOs and configuration must stay aligned between the API, workers, and evaluation tooling.
Decision: Split the codebase into shared, core, infra, api, workers, and evaluation packages with explicit references.
Consequences: Contract drift becomes a compile error, and build graphs stay predictable for CI.
Status: Accepted
Context: Pure vector search misses lexical cues while pure BM25 misses semantic paraphrases.
Decision: Run dense and sparse searches in parallel and fuse with reciprocal rank fusion by default.
Consequences: Answers improve on keyword-heavy and synonym-heavy queries, while operating two indexes increases infrastructure cost.
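
For readers unfamiliar with reciprocal rank fusion, a minimal sketch follows. It is not the Auris helper itself: the function name, the use of bare document ids, and the constant `k = 60` (a commonly used value) are assumptions made for illustration.

```typescript
// Fuse ranked lists of document ids with reciprocal rank fusion (RRF).
// Each list contributes 1 / (k + rank) per document, with rank starting at 1.
// k = 60 is a commonly used constant, assumed here rather than taken from Auris.
// Each input list is assumed to contain a given id at most once.
function reciprocalRankFusion(rankedLists: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Sort by descending fused score; ties fall back to id order for determinism.
  return Array.from(scores.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([id]) => id);
}

const denseHits = ["doc-a", "doc-b", "doc-c"];
const sparseHits = ["doc-b", "doc-d", "doc-a"];
const fused = reciprocalRankFusion([denseHits, sparseHits]);
// fused is ["doc-b", "doc-a", "doc-d", "doc-c"]
```

RRF uses only rank positions, so dense similarity scores and sparse BM25 scores never need to be normalized onto a common scale.
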
Status: Accepted
Context: Ingestion requirements change per dataset while chunking and embedding stay stable.
Decision: Move source-specific knowledge into infra adapters configured through DATA_SOURCE (comma-separated file extensions for the local files driver).
Consequences: Core pipelines stay reusable across verticals, and every dataset gets an explicit adapter instead of hidden conditionals.
Status: Accepted
Context: Debugging RAG failures requires knowing which stage hurt recall or triggered guardrails.
Decision: Wrap each major stage with tracedStep and keep identifiers compatible with OpenTelemetry naming.
Consequences: Operators can correlate API requests with vector and sparse latencies, while verbose tracing can increase log volume.
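
A wrapper in the spirit of tracedStep might look like the sketch below. The signature, the `SpanRecord` and `TraceContext` shapes, and the dotted span name are assumptions for illustration; the real helper and PipelineContext may differ.

```typescript
// Hypothetical span record; field names loosely follow OpenTelemetry vocabulary.
interface SpanRecord {
  traceId: string;
  name: string;
  durationMs: number;
  status: "ok" | "error";
}

// Hypothetical context shape; the real PipelineContext likely carries more.
interface TraceContext {
  traceId: string;
  spans: SpanRecord[];
}

// Run one pipeline stage and record its duration and outcome on the context.
async function tracedStep<T>(
  ctx: TraceContext,
  name: string,
  fn: () => Promise<T>,
): Promise<T> {
  const startedAt = Date.now();
  let status: SpanRecord["status"] = "ok";
  try {
    return await fn();
  } catch (error) {
    status = "error";
    throw error; // rethrow so callers still see the failure
  } finally {
    // Runs on both success and failure, so every stage leaves exactly one span.
    ctx.spans.push({
      traceId: ctx.traceId,
      name,
      durationMs: Date.now() - startedAt,
      status,
    });
  }
}

const ctx: TraceContext = { traceId: "trace-123", spans: [] };
const pending = tracedStep(ctx, "rag.retrieval.dense", async () => ["chunk-1", "chunk-2"]);
```

Once `pending` settles, `ctx.spans` holds one record for the dense retrieval stage, which is what lets a slow or failing stage be picked out of a single request.
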
A developer posts JSON to /v1/query, Express validates the payload, and executeRagQuery allocates a PipelineContext with a fresh traceId. GuardrailProvider scores the raw question, after which QueryExpander asks LLMProvider for a hypothetical answer plus several paraphrases. Each expanded string is embedded through EmbeddingProvider and sent to VectorStore while parallel calls hit SparseSearchProvider; FusionStrategy merges the ranked lists before RerankerProvider optionally reorders candidates.
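
The fan-out in this step can be sketched roughly as below. The port method names (`embed`, `search`), the `Hit` shape, and the `retrieveCandidates` function are hypothetical stand-ins, not the actual interfaces in packages/core.

```typescript
// Hypothetical minimal port shapes; the real interfaces may differ.
interface Hit { id: string; score: number }
interface EmbeddingProvider { embed(text: string): Promise<number[]> }
interface VectorStore { search(vector: number[], topK: number): Promise<Hit[]> }
interface SparseSearchProvider { search(query: string, topK: number): Promise<Hit[]> }

interface RetrievalPorts {
  embeddings: EmbeddingProvider;
  vectors: VectorStore;
  sparse: SparseSearchProvider;
}

// For every expanded query, run dense and sparse retrieval concurrently and
// return one ranked list per search, ready to hand to a fusion step.
async function retrieveCandidates(
  ports: RetrievalPorts,
  expandedQueries: string[],
  topK: number,
): Promise<Hit[][]> {
  const perQuery = await Promise.all(
    expandedQueries.map(async (query) => {
      // The dense path needs an embedding first; the sparse path does not.
      const dense = ports.embeddings
        .embed(query)
        .then((vector) => ports.vectors.search(vector, topK));
      const sparse = ports.sparse.search(query, topK);
      return Promise.all([dense, sparse]);
    }),
  );
  // Flatten [[dense, sparse], ...] into a flat list of ranked lists.
  const lists: Hit[][] = [];
  for (const [dense, sparse] of perQuery) {
    lists.push(dense, sparse);
  }
  return lists;
}
```

Starting both searches before awaiting either keeps the slower index from serializing the faster one; the resulting lists keep a stable order (dense, then sparse, per expanded query).
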
ContextPacker trims the fused chunks to the configured token budget, FilesystemPromptTemplateStore loads the template referenced by promptTemplateId, and renderPrompt materializes the system message that LLMProvider completes. Another GuardrailProvider pass inspects the answer, EscalationProvider may enqueue a webhook when confidence is low, and the handler responds with text, citations, and pipeline metadata while the traceId ties spans together.
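
Token-budget packing can be illustrated with a short greedy sketch. This is not TokenBudgetContextPacker itself: the `RankedChunk` shape, the `packContext` function, and the four-characters-per-token estimate are assumptions, and a production packer would use the model's own tokenizer.

```typescript
interface RankedChunk { id: string; text: string }

// Rough estimate of about four characters per token for English text.
// Heuristic stand-in only; a real packer would call the model's tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep chunks in rank order until the next one would exceed the budget.
function packContext(chunks: RankedChunk[], tokenBudget: number): RankedChunk[] {
  const packed: RankedChunk[] = [];
  let used = 0;
  for (const chunk of chunks) {
    const cost = estimateTokens(chunk.text);
    if (used + cost > tokenBudget) break; // stop at the first chunk that overflows
    packed.push(chunk);
    used += cost;
  }
  return packed;
}

const ranked: RankedChunk[] = [
  { id: "c1", text: "12345678" },     // about 2 tokens
  { id: "c2", text: "123456789012" }, // about 3 tokens
  { id: "c3", text: "1234" },         // about 1 token
];
const packedIds = packContext(ranked, 4).map((chunk) => chunk.id);
// packedIds is ["c1"]: adding c2 would bring the total to 5 tokens
```

Stopping at the first overflow, rather than skipping ahead to smaller chunks, keeps the packed context a strict prefix of the ranking. Whether that trade-off suits a given deployment is a design choice.
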
The indexing worker resolves a DataSourceProvider, streams DataSourceDocument records, and converts them into RawDocument values. It hands these to indexRawDocuments, which consults IndexStateStore, optionally refreshes sparse mirrors, and records new fingerprints so the next sync skips unchanged files.
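
The fingerprint check that lets a sync skip unchanged files can be sketched as follows. The `RawDocument` and `IndexStateStore` shapes, the in-memory store, the `syncDocuments` function, and the FNV-1a-style hash are all assumptions for illustration; the real indexRawDocuments and FileIndexStateStore may work differently, and a production fingerprint would more likely be a cryptographic digest.

```typescript
// Hypothetical shapes; the real RawDocument and IndexStateStore may differ.
interface RawDocument { id: string; content: string }
interface IndexStateStore {
  getFingerprint(id: string): string | undefined;
  setFingerprint(id: string, fingerprint: string): void;
}

// In-memory stand-in for a file-backed state store.
class InMemoryIndexStateStore implements IndexStateStore {
  private readonly fingerprints = new Map<string, string>();
  getFingerprint(id: string): string | undefined {
    return this.fingerprints.get(id);
  }
  setFingerprint(id: string, fingerprint: string): void {
    this.fingerprints.set(id, fingerprint);
  }
}

// FNV-1a-style 32-bit hash over UTF-16 code units: a dependency-free
// stand-in for a real content digest.
function fingerprintOf(content: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    hash ^= content.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Index only documents whose fingerprint changed; return the ids indexed.
function syncDocuments(
  docs: RawDocument[],
  state: IndexStateStore,
  indexOne: (doc: RawDocument) => void,
): string[] {
  const indexed: string[] = [];
  for (const doc of docs) {
    const next = fingerprintOf(doc.content);
    if (state.getFingerprint(doc.id) === next) continue; // unchanged, skip
    indexOne(doc);
    state.setFingerprint(doc.id, next); // record only after indexing succeeds
    indexed.push(doc.id);
  }
  return indexed;
}

const state = new InMemoryIndexStateStore();
const files: RawDocument[] = [
  { id: "a.md", content: "alpha" },
  { id: "b.md", content: "beta" },
];
const firstRun = syncDocuments(files, state, () => {});  // both documents are new
const secondRun = syncDocuments(files, state, () => {}); // nothing changed
```

Writing the fingerprint only after `indexOne` returns means a failed index attempt is retried on the next sync instead of being silently skipped.
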