| 1 | +# Overview |
| 2 | + |
| 3 | +Lumina is a C++ library for high-performance vector search and persisted indexes. It provides production-oriented |
| 4 | +backends (DiskANN / IVF / Bruteforce), a narrow API surface, and extension points for advanced workflows. |
| 5 | + |
| 6 | +Beyond the core C++ API, Lumina provides an experimental Python interface covering index building, search, and other basic workflows. |
| 7 | + |
| 8 | +## Why Lumina? |
| 9 | + |
| 10 | +Lumina is designed as a production-grade search infrastructure component. Every design decision — |
| 11 | +from the API surface to the index format — is made with long-term maintainability and operational reliability in mind. |
| 12 | + |
| 13 | +1. **Mature, deliberate API design** |
| 14 | + A minimal interface with type-safe, exception-safe error handling. All backends share a unified configuration |
| 15 | + system — switching backends is a configuration change, not a code rewrite. |
| 16 | + |
| 17 | +2. **Index format you can trust** |
| 18 | + Persisted indexes follow a versioned format with built-in integrity checks. Upgrades within a compatible |
| 19 | + version range do not require an index rebuild. The same format works across local storage, memory-mapped |
| 20 | + files, and distributed file systems. |
| 21 | + |
| 22 | +3. **Keeps pace with research** |
| 23 | + Core algorithms incorporate results from recent literature — RabitQ quantization, graph pruning |
| 24 | + heuristics, locality-aware disk reordering — and ship as production features, not perpetual experiments. |
| 25 | +4. **Deep C++ engineering foundation** |
| 26 | + Resource ownership is explicit and predictable. Memory allocation is tiered and controllable — critical |
| 27 | + for multi-tenant deployments. The codebase is built on modern C++ standards with strict engineering |
| 28 | + governance: mandatory code review, pre-commit validation, versioned release trains, and a compatibility |
| 29 | + policy that distinguishes stable from experimental surfaces. Every release is a deliberate, tested artifact. |
| 30 | + |
| 31 | +5. **Pluggable IO for any storage topology** |
| 32 | + The IO layer accepts user-supplied readers and writers, decoupling index logic from storage. The same |
| 33 | + index binary can be served from local SSD, object storage, or a distributed file system without changes |
| 34 | + to the core library — enabling storage-compute separation and cloud-native deployments out of the box. |
| 35 | +6. **Typed extension framework** |
| 36 | + Vector search in production demands capabilities beyond pure ANN — filtering, checkpointing, distributed |
| 37 | + builds — yet bundling them all into the core API would bloat the interface and couple unrelated concerns. |
| 38 | + Lumina addresses this with a typed extension layer: each capability attaches to a Builder or Searcher instance |
| 39 | + through a contract that specifies lifecycle ownership, thread-safety semantics, and supported backends. |
| 40 | + Incompatible combinations are rejected at attach time with a clear error, not discovered at query time. |
| 41 | + |
| 42 | + | Extension | Status | |
| 43 | + |-----------|--------| |
| 44 | + | Attribute-based filtered search | stable | |
| 45 | + | Build checkpointing | experimental | |
| 46 | + | Range & discrete-label filtering | planned | |
| 47 | + | Distributed build coordination | planned | |
| 48 | + |
| 49 | +## Backends at a glance |
| 50 | + |
| 51 | +### DiskANN |
| 52 | + |
| 53 | +**Scale**: billions of vectors. **Memory**: well below the raw dataset size, because only graph metadata, quantized codes, and a configurable hot-node cache reside in RAM; full-precision or higher-precision quantized vectors stay on disk. |
| 54 | + |
| 55 | +DiskANN builds a Vamana proximity graph offline, then serves queries through a coroutine-based parallel beam search that issues batched, sector-aligned disk reads without blocking threads on I/O. Key engineering choices: |
| 56 | + |
| 57 | +- **Layout optimization** — After graph construction, a locality-aware reordering pass (BNP/BNF) places neighboring nodes into the same disk sector, reducing random I/O during search. |
| 58 | +- **Two-tier caching** — A static cache (BFS-loaded entry-region nodes) absorbs the first hops; a dynamic LRU cache adapts to workload skew at runtime. |
| 59 | +- **Build-time checkpointing** — Long builds can resume from a saved checkpoint after interruption, avoiding full restarts on billion-scale datasets. |
| 60 | +- **Quantization** — Both in-memory and on-disk vectors support SQ8, PQ, and RabitQ encoding. The disk encoding can differ from the in-memory one, trading a small recall margin for significantly smaller index files. |
| 61 | +- **Tag-aware graph construction** (in progress) — Filtered search with label dimensions is under active development. |
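The two-tier caching idea from the list above can be illustrated with a minimal Python sketch. It is not Lumina's implementation: the class name `TwoTierCache`, the `load_from_disk` callback, and the `disk_reads` counter are hypothetical, and choosing the static nodes (for example by BFS from the entry point) is assumed to happen elsewhere.

```python
from collections import OrderedDict


class TwoTierCache:
    """Hypothetical sketch: a pinned static tier in front of an LRU tier."""

    def __init__(self, static_nodes, dynamic_capacity, load_from_disk):
        self._load = load_from_disk
        # Static tier: filled once at startup and never evicted.
        self._static = {n: load_from_disk(n) for n in static_nodes}
        # Dynamic tier: least recently used entry sits at the front.
        self._dynamic = OrderedDict()
        self._capacity = dynamic_capacity
        # Counts only query-time misses, not the startup preload.
        self.disk_reads = 0

    def get(self, node_id):
        if node_id in self._static:
            return self._static[node_id]
        if node_id in self._dynamic:
            self._dynamic.move_to_end(node_id)  # mark as most recently used
            return self._dynamic[node_id]
        # Miss in both tiers: read from disk and admit into the LRU tier.
        self.disk_reads += 1
        value = self._load(node_id)
        self._dynamic[node_id] = value
        if len(self._dynamic) > self._capacity:
            self._dynamic.popitem(last=False)  # evict least recently used
        return value


cache = TwoTierCache(
    static_nodes=[0, 1],
    dynamic_capacity=2,
    load_from_disk=lambda n: f"node-{n}",  # stand-in for a real disk read
)
for node in [0, 5, 6, 5, 7, 6]:
    cache.get(node)
```

In this toy trace node 0 is always served from the static tier, while node 6 is evicted when node 7 arrives and has to be read again, which is the kind of behaviour a workload-adaptive tier is meant to keep rare for hot nodes.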
| 62 | + |
| 63 | + |
| 64 | +### IVF |
| 65 | + |
| 66 | +**Scale**: millions to tens of millions of vectors. **Memory**: moderate — centroids and quantized codes reside in RAM. |
| 67 | + |
| 68 | +IVF partitions the vector space into inverted lists via k-means clustering, then answers a query by probing only the lists whose centroids are nearest to it. It supports SQ8, PQ, and RabitQ quantization to control the memory-accuracy tradeoff. IVF currently supports L2 distance only; Cosine and InnerProduct are under development. The on-disk snapshot layout is experimental and may change across versions. |
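As a rough illustration of the partition-then-probe idea, here is a small pure-Python sketch. It does not use Lumina's API: `build_ivf`, `search_ivf`, and the `nprobe` parameter are hypothetical names, the k-means step is plain Lloyd iteration seeded with the first `k` vectors, and no quantization is applied.

```python
def l2sq(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, v):
    """Index of the centroid closest to v."""
    return min(range(len(centroids)), key=lambda c: l2sq(centroids[c], v))


def kmeans(vectors, k, iters=10):
    """Lloyd iterations, seeded with the first k vectors for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(centroids, v)].append(v)
        for c, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def build_ivf(vectors, k):
    """Train centroids, then file every vector id under its nearest centroid."""
    centroids = kmeans(vectors, k)
    lists = [[] for _ in range(k)]
    for i, v in enumerate(vectors):
        lists[nearest(centroids, v)].append(i)
    return centroids, lists


def search_ivf(vectors, centroids, lists, query, topk, nprobe):
    """Scan only the nprobe inverted lists whose centroids are closest."""
    order = sorted(range(len(centroids)), key=lambda c: l2sq(centroids[c], query))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: l2sq(vectors[i], query))
    return candidates[:topk]


vectors = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centroids, lists = build_ivf(vectors, k=2)
hits = search_ivf(vectors, centroids, lists, query=(9.5, 10.0), topk=2, nprobe=1)
```

Raising `nprobe` scans more lists, which generally improves recall at the cost of more distance computations.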
| 69 | + |
| 70 | +### Bruteforce |
| 71 | + |
| 72 | +**Scale**: thousands to low millions of vectors. **Memory**: full dataset in RAM. |
| 73 | + |
| 74 | +Bruteforce computes exact distances against every vector, with no approximation and no index structure. Use it to produce the ground truth against which the recall of other backends is measured, or in production when the dataset is small enough that a linear scan meets latency requirements. |
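The baseline role can be sketched in a few lines of Python. This is a generic illustration rather than Lumina code: `exact_topk` and `recall_at_k` are hypothetical helpers, and the "approximate" result is hard-coded to stand in for the output of another backend.

```python
import heapq
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_topk(vectors, query, k):
    """Score every vector and return the ids of the k nearest, closest first."""
    scored = ((l2(v, query), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, scored)]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k that the approximate result recovered."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


vectors = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
truth = exact_topk(vectors, query=(0.9, 0.1), k=2)

# Pretend an approximate backend returned ids 1 and 2 for the same query.
recall = recall_at_k([1, 2], truth)
```

Here the exact scan returns ids 1 and 0, so an approximate answer of 1 and 2 recovers one of the two true neighbours.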
| 75 | + |
| 76 | +## Use cases |
| 77 | + |
| 78 | +- **Vector database backend** — power billion-scale similarity search behind a database or retrieval service. |
| 79 | +- **Recommendation systems** — real-time recall of similar items or users from high-dimensional embeddings. |
| 80 | +- **Image and video search** — fast matching over visual feature vectors. |
| 81 | +- **RAG (retrieval-augmented generation)** — give an LLM a high-performance knowledge-base retrieval layer. |
| 82 | + |
| 83 | +## Core components |
| 84 | + |
| 85 | +| Component | What it does | |
| 86 | +|-----------|-------------| |
| 87 | +| **API layer** | `LuminaBuilder`, `LuminaSearcher`, `Options`, `Query` — your main integration surface | |
| 88 | +| **Python facade** | Experimental `lumina` package wrapping Builder/Searcher, plus a filtered-search wrapper | |
| 89 | +| **Backends** | DiskANN, IVF, Bruteforce — the concrete index algorithms | |
| 90 | +| **Quantizer** | Vector compression and distance estimation: SQ8, PQ, RabitQ | |
| 91 | +| **IO system** | Binary container format with section management and CRC verification | |
| 92 | +| **Telemetry** | Production logging and metrics hooks | |
| 93 | +| **Extensions** | Typed build-time and search-time extension points: filtered search, checkpointing. Explicit lifecycle and thread-safety contracts | |
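To make the idea of CRC-verified sections concrete, here is a minimal Python sketch. The layout is an assumption made for illustration only (a 4-byte tag, a payload length, and a CRC32 of the payload) and does not describe Lumina's actual container format; `write_section` and `read_section` are hypothetical helpers.

```python
import struct
import zlib

# Hypothetical header: 4-byte tag, payload length, CRC32 (little-endian).
HEADER = struct.Struct("<4sII")


def write_section(tag: bytes, payload: bytes) -> bytes:
    """Prefix the payload with its tag, length, and checksum."""
    return HEADER.pack(tag, len(payload), zlib.crc32(payload)) + payload


def read_section(blob: bytes):
    """Return (tag, payload), raising if the payload is truncated or corrupted."""
    tag, length, expected = HEADER.unpack_from(blob, 0)
    payload = blob[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != expected:
        raise ValueError(f"section {tag!r} failed integrity check")
    return tag, payload


blob = write_section(b"GRPH", b"abc")
tag, payload = read_section(blob)
```

Storing the checksum next to each section lets a reader reject a damaged section on load rather than returning wrong results later.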
| 94 | + |
| 95 | +## Our publications |
| 96 | + |
| 97 | +Research behind Lumina has been published at top-tier database and systems venues: |
| 98 | + |
| 99 | +- **[SIGMOD'26]** Zhiyuan Hua, Qiji Mo, Zebin Yao, Lixiao Cui, Xiaoguang Liu, Gang Wang, Zijing Wei, Xinyu Liu, Tianxiao Tang, Shaozhi Liu, Lin Qu. *Dynamically Detect and Fix Hardness for Efficient Approximate Nearest Neighbor Search.* ACM Conference on Management of Data, 2026. ([arXiv](https://arxiv.org/abs/2510.22316)) |
| 100 | +- **[ICDE'26]** Qiji Mo, Zhiyuan Hua, Zebin Yao, Lixiao Cui, Xiaoguang Liu, Gang Wang, Zijing Wei, Xinyu Liu, Tianxiao Tang, Shaozhi Liu, Lin Qu. *Overcoming the Sync-Compute Dilemma in Parallel Graph-Based Vector Retrieval.* IEEE International Conference on Data Engineering, 2026. |
| 101 | + |
| 102 | +## Next steps |
| 103 | + |
| 104 | +- [Python quick start](../PythonQuickStart.md) — run the full build → dump → open → search flow in Python. |
| 105 | +- [DiskANN tuning guide](./DiskANNParameters.md) — graph build and search parameter tuning for DiskANN. |
| 106 | +- [Options reference](./OptionsReference.md) — complete list of configuration keys. |