|
| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +## Project Overview |
| 4 | + |
| 5 | +Vectorlite is a SQLite extension for fast vector search using the HNSW algorithm. Written in C++17 with SIMD acceleration via Google Highway. Distributed as Python wheels and npm packages. |
| 6 | + |
| 7 | +## Build Commands |
| 8 | + |
| 9 | +```bash |
| 10 | +# Debug build + tests |
| 11 | +sh build.sh |
| 12 | +# Equivalent to: cmake --preset dev && cmake --build build/dev -j8 && ctest --test-dir build/dev/vectorlite --output-on-failure && pytest bindings/python/vectorlite_py/test |
| 13 | + |
| 14 | +# Release build + tests |
| 15 | +sh build_release.sh |
| 16 | + |
| 17 | +# Configure only |
| 18 | +cmake --preset dev # debug |
| 19 | +cmake --preset release # release |
| 20 | + |
| 21 | +# Build only |
| 22 | +cmake --build build/dev -j8 |
| 23 | +cmake --build build/release -j8 |
| 24 | + |
| 25 | +# C++ unit tests only |
| 26 | +ctest --test-dir build/dev/vectorlite --output-on-failure |
| 27 | + |
| 28 | +# Python integration tests only |
| 29 | +pytest bindings/python/vectorlite_py/test |
| 30 | +``` |
| 31 | + |
| 32 | +## Project Structure |
| 33 | + |
| 34 | +- `vectorlite/` — Core C++ source (extension entry point, virtual table, vector types, distance functions) |
| 35 | +- `vectorlite/ops/` — SIMD operations using Google Highway (distance calculations, quantization) |
| 36 | +- `bindings/python/` — Python package wrapping the compiled extension |
| 37 | +- `bindings/nodejs/` — Node.js bindings |
| 38 | +- `benchmark/` — Performance benchmarks (Python + C++) |
| 39 | +- `examples/` — Python usage examples |
| 40 | +- `cmake/`, `vcpkg/` — Build infrastructure and dependency management |
| 41 | + |
| 42 | +## Key Dependencies |
| 43 | + |
| 44 | +- **abseil** — Status/StatusOr error handling, string utilities |
| 45 | +- **hnswlib** — HNSW index implementation |
| 46 | +- **highway** — SIMD abstraction (dynamic dispatch across CPU targets) |
| 47 | +- **rapidjson** — JSON parsing for vector serialization |
| 48 | +- **sqlite3** — SQLite API |
| 49 | +- **gtest** / **benchmark** — Testing and benchmarking frameworks |
| 50 | +- **re2** — Regex for input validation |
| 51 | + |
| 52 | +## Coding Conventions |
| 53 | + |
| 54 | +- **Style**: Google C++ Style Guide (enforced by `.clang-format`) |
| 55 | +- **C++ standard**: C++17 |
| 56 | +- **Header guards**: `#pragma once` (no `#ifndef` guards) |
| 57 | +- **Naming**: |
| 58 | + - Classes/structs: `PascalCase` (`VirtualTable`, `GenericVector`) |
| 59 | + - Public functions/methods: `PascalCase` (`Distance()`, `FromJSON()`) |
| 60 | + - Member variables: `snake_case_` with trailing underscore (`data_`, `index_`) |
| 61 | + - Local variables: `snake_case` |
| 62 | + - Macros/constants: `SCREAMING_SNAKE_CASE` with `VECTORLITE_` prefix |
| 63 | + - Files: `snake_case.h` / `snake_case.cpp` |
| 64 | + - Type aliases: short names (`Vector`, `BF16Vector`, `VectorView`) |
| 65 | +- **Error handling**: `absl::Status` / `absl::StatusOr<T>` (not exceptions) |
| 66 | +- **Assertions**: `VECTORLITE_ASSERT()` macro for preconditions |
| 67 | +- **Memory**: `std::unique_ptr` for ownership; avoid raw owning pointers |
| 68 | +- **Optionals**: `std::optional<T>` with `std::nullopt` |
| 69 | +- **Namespace**: `vectorlite` (nested `vectorlite::ops` for SIMD ops) |
| 70 | +- **Namespace closing**: `} // namespace vectorlite` |
| 71 | + |
| 72 | +## SIMD / Highway Patterns |
| 73 | + |
| 74 | +- SIMD code lives in `vectorlite/ops/` using Highway's dynamic dispatch |
| 75 | +- All Highway ops prefixed with `hn::` (e.g., `hn::Load`, `hn::Mul`) |
| 76 | +- Use `HWY_NAMESPACE` for target-specific code blocks |
| 77 | +- Alias: `namespace hn = hwy::HWY_NAMESPACE` |
| 78 | + |
| 79 | +## Testing |
| 80 | + |
| 81 | +- **C++ unit tests**: Google Test, files named `*_test.cpp` in `vectorlite/` |
| 82 | +- **C++ benchmarks**: Google Benchmark in `vectorlite/ops/ops_benchmark.cpp` |
| 83 | +- **Python integration tests**: pytest in `bindings/python/vectorlite_py/test/` |
| 84 | +- Always run both C++ and Python tests after changes: `sh build.sh` |
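
The naming, ownership, and optional conventions listed under "Coding Conventions" can be illustrated with a minimal sketch. Everything named here (`example`, `FixedVector`, `SquaredL2Distance`, `MakeZeroVector`) is hypothetical and not taken from the Vectorlite source. To keep the block free of third-party dependencies, `std::optional` stands in for the `absl::StatusOr<T>` return type the project prescribes, and plain `assert` stands in for `VECTORLITE_ASSERT()`.

```cpp
#pragma once  // header guard style named in the conventions

#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

namespace example {  // hypothetical; the project itself uses `vectorlite`

// Classes and structs use PascalCase.
class FixedVector {
 public:
  explicit FixedVector(std::vector<float> data) : data_(std::move(data)) {}

  // Public methods use PascalCase.
  std::size_t Dimension() const { return data_.size(); }

  // Returns the squared L2 distance, or std::nullopt on a dimension mismatch.
  // Assumption: real project code would report this error through
  // absl::Status / absl::StatusOr<T>; std::optional only avoids the dependency.
  std::optional<float> SquaredL2Distance(const FixedVector& other) const {
    if (other.Dimension() != Dimension()) {
      return std::nullopt;
    }
    float sum = 0.0f;  // local variables use snake_case
    for (std::size_t i = 0; i < data_.size(); ++i) {
      const float diff = data_[i] - other.data_[i];
      sum += diff * diff;
    }
    return sum;
  }

 private:
  std::vector<float> data_;  // members use snake_case_ with a trailing underscore
};

// Ownership is expressed with std::unique_ptr instead of a raw owning pointer.
inline std::unique_ptr<FixedVector> MakeZeroVector(std::size_t dimension) {
  assert(dimension > 0);  // stand-in for a VECTORLITE_ASSERT() precondition
  return std::make_unique<FixedVector>(std::vector<float>(dimension, 0.0f));
}

}  // namespace example
```

A caller would check the returned optional before dereferencing it, in the same way that project code would check `ok()` on an `absl::StatusOr<T>` before reading its value.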