Commit 16a01af

1yefuwang1, claude, and Copilot authored
Support float16 quantization and other quantization performance improvements (#41)

* Add quantization benchmarks and optimize HalfFloatToF32 with 2x loop unroll

Add benchmark suites for QuantizeF32ToF16, QuantizeF32ToBF16, F16ToF32, and BF16ToF32. Unroll the HalfFloatToF32 main loop to process 2*NF elements per iteration, improving instruction-level parallelism and yielding a 1.6-2x speedup on F16/BF16 to F32 dequantization.

Co-Authored-By: Claude <noreply@anthropic.com>

* Optimize F32 Normalize with 4x loop unroll and fix benchmarks

Replace hn::Transform with a manually 4x-unrolled multiply loop in F32 NormalizeImpl for ~1.3x speedup at dim >= 512. Fix BF16 InnerProduct benchmark that was incorrectly measuring F32 overload. Add ClobberMemory to normalize benchmarks.

Co-Authored-By: Claude <noreply@anthropic.com>

* Add CLAUDE.md

* Guard F16 SIMD specializations with #if !HWY_HAVE_FLOAT16 (#40)

* Initial plan

* Add float16 quantization support: SIMD ops, distance spaces, vector types, quantization, virtual table support, tests, and benchmarks

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Fix test bug: use correct loop variable j instead of i in Normalize_F32ToF16 test

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Final: float16 quantization support complete

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Remove CodeQL artifact from tracking and add to gitignore

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Remove CodeQL artifact symlink from repository

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Guard F16-specific SIMD ops with #if !HWY_HAVE_FLOAT16 to use native ops when available

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Guard F16-specific SIMD ops with #if !HWY_HAVE_FLOAT16 to use native ops when available

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Remove CodeQL artifact from tracking

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

* Fix Float16 vector query execution (#42)

* Initial plan

* Add Float16 vector type handling in query executor

Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: 1yefuwang1 <18023393+1yefuwang1@users.noreply.github.com>
1 parent 4c35449 commit 16a01af
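
The 2x unroll described in the commit message can be illustrated with a scalar sketch. This is not the Highway code from the commit: `kLanes` is a hypothetical stand-in for the vector width (NF in the message), `ConvertBlock` and `Bf16ToF32Unrolled` are names invented for illustration, and BF16 is used only because widening it to F32 is a plain 16-bit shift.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the number of lanes in one SIMD vector.
constexpr std::size_t kLanes = 4;

// Widen one BF16 bit pattern to F32: BF16 is the upper 16 bits of an F32.
inline float Bf16BitsToF32(std::uint16_t bits) {
  const std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Converts exactly kLanes elements; stands in for one vector load/convert/store.
inline void ConvertBlock(const std::uint16_t* in, float* out) {
  for (std::size_t i = 0; i < kLanes; ++i) out[i] = Bf16BitsToF32(in[i]);
}

void Bf16ToF32Unrolled(const std::uint16_t* in, float* out, std::size_t n) {
  assert((in != nullptr && out != nullptr) || n == 0);
  std::size_t i = 0;
  // Main loop: two independent blocks per iteration (2 * kLanes elements).
  for (; i + 2 * kLanes <= n; i += 2 * kLanes) {
    ConvertBlock(in + i, out + i);
    ConvertBlock(in + i + kLanes, out + i + kLanes);
  }
  // At most one full block can remain after the unrolled loop.
  for (; i + kLanes <= n; i += kLanes) ConvertBlock(in + i, out + i);
  // Scalar tail for the last n % kLanes elements.
  for (; i < n; ++i) out[i] = Bf16BitsToF32(in[i]);
}
```

The two `ConvertBlock` calls in the main loop do not depend on each other, which is what gives the processor room to overlap them. Whether that yields a speedup in practice depends on the target and has to be measured, as the benchmarks in this commit do.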

18 files changed (+652 -35 lines changed)

.gitignore

Lines changed: 1 addition & 1 deletion
@@ -44,4 +44,4 @@ dist/*
 *egg-info
 
 node_modules
-bindings/nodejs/vectorlite/package-lock.json
+bindings/nodejs/vectorlite/package-lock.json_codeql_detected_source_root

CLAUDE.md

Lines changed: 84 additions & 0 deletions
@@ -0,0 +1,84 @@
+# CLAUDE.md
+
+## Project Overview
+
+Vectorlite is a SQLite extension for fast vector search using the HNSW algorithm. Written in C++17 with SIMD acceleration via Google Highway. Distributed as Python wheels and npm packages.
+
+## Build Commands
+
+```bash
+# Debug build + tests
+sh build.sh
+# Equivalent to: cmake --preset dev && cmake --build build/dev -j8 && ctest --test-dir build/dev/vectorlite --output-on-failure && pytest bindings/python/vectorlite_py/test
+
+# Release build + tests
+sh build_release.sh
+
+# Configure only
+cmake --preset dev # debug
+cmake --preset release # release
+
+# Build only
+cmake --build build/dev -j8
+cmake --build build/release -j8
+
+# C++ unit tests only
+ctest --test-dir build/dev/vectorlite --output-on-failure
+
+# Python integration tests only
+pytest bindings/python/vectorlite_py/test
+```
+
+## Project Structure
+
+- `vectorlite/` — Core C++ source (extension entry point, virtual table, vector types, distance functions)
+- `vectorlite/ops/` — SIMD operations using Google Highway (distance calculations, quantization)
+- `bindings/python/` — Python package wrapping the compiled extension
+- `bindings/nodejs/` — Node.js bindings
+- `benchmark/` — Performance benchmarks (Python + C++)
+- `examples/` — Python usage examples
+- `cmake/`, `vcpkg/` — Build infrastructure and dependency management
+
+## Key Dependencies
+
+- **abseil** — Status/StatusOr error handling, string utilities
+- **hnswlib** — HNSW index implementation
+- **highway** — SIMD abstraction (dynamic dispatch across CPU targets)
+- **rapidjson** — JSON parsing for vector serialization
+- **sqlite3** — SQLite API
+- **gtest** / **benchmark** — Testing and benchmarking frameworks
+- **re2** — Regex for input validation
+
+## Coding Conventions
+
+- **Style**: Google C++ Style Guide (enforced by `.clang-format`)
+- **C++ standard**: C++17
+- **Header guards**: `#pragma once` (no `#ifndef` guards)
+- **Naming**:
+  - Classes/structs: `PascalCase` (`VirtualTable`, `GenericVector`)
+  - Public functions/methods: `PascalCase` (`Distance()`, `FromJSON()`)
+  - Member variables: `snake_case_` with trailing underscore (`data_`, `index_`)
+  - Local variables: `snake_case`
+  - Macros/constants: `SCREAMING_SNAKE_CASE` with `VECTORLITE_` prefix
+  - Files: `snake_case.h` / `snake_case.cpp`
+  - Type aliases: short names (`Vector`, `BF16Vector`, `VectorView`)
+- **Error handling**: `absl::Status` / `absl::StatusOr<T>` (not exceptions)
+- **Assertions**: `VECTORLITE_ASSERT()` macro for preconditions
+- **Memory**: `std::unique_ptr` for ownership; avoid raw owning pointers
+- **Optionals**: `std::optional<T>` with `std::nullopt`
+- **Namespace**: `vectorlite` (nested `vectorlite::ops` for SIMD ops)
+- **Namespace closing**: `} // namespace vectorlite`
+
+## SIMD / Highway Patterns
+
+- SIMD code lives in `vectorlite/ops/` using Highway's dynamic dispatch
+- All Highway ops prefixed with `hn::` (e.g., `hn::Load`, `hn::Mul`)
+- Use `HWY_NAMESPACE` for target-specific code blocks
+- Alias: `namespace hn = hwy::HWY_NAMESPACE`
+
+## Testing
+
+- **C++ unit tests**: Google Test, files named `*_test.cpp` in `vectorlite/`
+- **C++ benchmarks**: Google Benchmark in `vectorlite/ops/ops_benchmark.cpp`
+- **Python integration tests**: pytest in `bindings/python/vectorlite_py/test/`
+- Always run both C++ and Python tests after changes: `sh build.sh`
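
Several of the conventions listed in CLAUDE.md can be seen together in a minimal sketch. `ExampleVector` and the `vectorlite_sketch` namespace are hypothetical and do not exist in the repository. Plain `assert` stands in for `VECTORLITE_ASSERT()`, and `absl::Status` is left out so the block depends only on the standard library.

```cpp
// Hypothetical file name: example_vector.h (snake_case). A real header in
// the project would start with #pragma once rather than #ifndef guards.
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

namespace vectorlite_sketch {

// Classes use PascalCase.
class ExampleVector {
 public:
  // Public functions use PascalCase; ownership is returned via unique_ptr.
  static std::unique_ptr<ExampleVector> Create(std::vector<float> data) {
    assert(!data.empty());  // stand-in for VECTORLITE_ASSERT() on a precondition
    return std::unique_ptr<ExampleVector>(new ExampleVector(std::move(data)));
  }

  std::size_t Dim() const { return data_.size(); }

  // std::optional with std::nullopt signals "no value" without exceptions.
  std::optional<float> At(std::size_t index) const {
    if (index >= data_.size()) return std::nullopt;
    return data_[index];
  }

 private:
  explicit ExampleVector(std::vector<float> data) : data_(std::move(data)) {}

  // Member variables use snake_case_ with a trailing underscore.
  std::vector<float> data_;
};

}  // namespace vectorlite_sketch
```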

bindings/python/vectorlite_py/test/vectorlite_test.py

Lines changed: 1 addition & 1 deletion
@@ -121,7 +121,7 @@ def remove_quote(s: str):
         file_path = os.path.join(tempdir, 'index.bin')
         file_paths = [f'\"{file_path}\"', f'\'{file_path}\'']
 
-        for vector_type in ['float32', 'bfloat16']:
+        for vector_type in ['float32', 'bfloat16', 'float16']:
             for index_file_path in file_paths:
                 assert not os.path.exists(remove_quote(index_file_path))
 

vectorlite/constraint.cpp

Lines changed: 14 additions & 0 deletions
@@ -221,6 +221,20 @@ absl::StatusOr<QueryExecutor::QueryResult> QueryExecutor::Execute() const {
       VECTORLITE_ASSERT(space_.normalize);
       BF16Vector normalized_vector = quantized_vector.Normalize();
 
+      auto result = index_.searchKnnCloserFirst(
+          normalized_vector.data().data(), knn_param->k, rowid_filter.get());
+      return result;
+    } else if (space_.vector_type == VectorType::Float16) {
+      F16Vector quantized_vector = QuantizeToF16(knn_param->query_vector);
+
+      if (!space_.normalize) {
+        return index_.searchKnnCloserFirst(quantized_vector.data().data(),
+                                           knn_param->k, rowid_filter.get());
+      }
+
+      VECTORLITE_ASSERT(space_.normalize);
+      F16Vector normalized_vector = quantized_vector.Normalize();
+
       auto result = index_.searchKnnCloserFirst(
           normalized_vector.data().data(), knn_param->k, rowid_filter.get());
       return result;
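
The Float16 branch above quantizes the query vector and, when `space_.normalize` is set, normalizes it before searching. The scalar sketch below assumes that `Normalize()` scales a vector to unit L2 length (the diff does not show its body) and mirrors the 4x-unrolled multiply loop mentioned in the commit message. `NormalizeF32` is a hypothetical name, leaving an all-zero vector unchanged is a choice made for this sketch, and the real code works on Highway vectors rather than scalar floats.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Scales v in place to unit L2 length. An all-zero vector is left unchanged
// (an assumption of this sketch, not taken from the commit).
void NormalizeF32(float* v, std::size_t n) {
  assert(v != nullptr || n == 0);

  // Pass 1: squared L2 norm.
  float sum_sq = 0.0f;
  for (std::size_t i = 0; i < n; ++i) sum_sq += v[i] * v[i];
  if (sum_sq == 0.0f) return;

  // Pass 2: multiply by the reciprocal norm, four elements per iteration.
  const float inv_norm = 1.0f / std::sqrt(sum_sq);
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    v[i] *= inv_norm;
    v[i + 1] *= inv_norm;
    v[i + 2] *= inv_norm;
    v[i + 3] *= inv_norm;
  }
  // Tail for the last n % 4 elements.
  for (; i < n; ++i) v[i] *= inv_norm;
}
```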

vectorlite/distance.h

Lines changed: 2 additions & 0 deletions
@@ -37,6 +37,7 @@ class GenericInnerProductSpace : public hnswlib::SpaceInterface<float> {
 
 using InnerProductSpace = GenericInnerProductSpace<float>;
 using InnerProductSpaceBF16 = GenericInnerProductSpace<hwy::bfloat16_t>;
+using InnerProductSpaceF16 = GenericInnerProductSpace<hwy::float16_t>;
 
 template <class T, VECTORLITE_IF_FLOAT_SUPPORTED(T)>
 class GenericL2Space : public hnswlib::SpaceInterface<float> {
@@ -64,5 +65,6 @@ class GenericL2Space : public hnswlib::SpaceInterface<float> {
 
 using L2Space = GenericL2Space<float>;
 using L2SpaceBF16 = GenericL2Space<hwy::bfloat16_t>;
+using L2SpaceF16 = GenericL2Space<hwy::float16_t>;
 
 } // namespace vectorlite

vectorlite/macros.h

Lines changed: 4 additions & 2 deletions
@@ -18,8 +18,10 @@
 
 #define VECTORLITE_IF_FLOAT_SUPPORTED(T) \
   std::enable_if_t<std::is_same_v<T, float> || \
-                   std::is_same_v<T, hwy::bfloat16_t>>* = nullptr
+                   std::is_same_v<T, hwy::bfloat16_t> || \
+                   std::is_same_v<T, hwy::float16_t>>* = nullptr
 
 #define VECTORLITE_IF_FLOAT_SUPPORTED_FWD_DECL(T) \
   std::enable_if_t<std::is_same_v<T, float> || \
-                   std::is_same_v<T, hwy::bfloat16_t>>*
+                   std::is_same_v<T, hwy::bfloat16_t> || \
+                   std::is_same_v<T, hwy::float16_t>>*
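
The macro change above widens an `std::enable_if_t` constraint so that templates using it also accept `hwy::float16_t`. A self-contained sketch of the same pattern follows. The `sketch` namespace, the two struct types (stand-ins for `hwy::bfloat16_t` and `hwy::float16_t`), `SKETCH_IF_FLOAT_SUPPORTED`, `ElementSize`, and `IsSupported` are all hypothetical names, used so that the block compiles without Highway.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

namespace sketch {

// Hypothetical stand-ins for hwy::bfloat16_t and hwy::float16_t.
struct bfloat16_t { std::uint16_t bits; };
struct float16_t { std::uint16_t bits; };

// Same shape as VECTORLITE_IF_FLOAT_SUPPORTED: an unnamed pointer template
// parameter that only exists when T is one of the listed types.
#define SKETCH_IF_FLOAT_SUPPORTED(T)                \
  std::enable_if_t<std::is_same_v<T, float> ||      \
                   std::is_same_v<T, bfloat16_t> || \
                   std::is_same_v<T, float16_t>>* = nullptr

// Only instantiable for the supported element types.
template <class T, SKETCH_IF_FLOAT_SUPPORTED(T)>
std::size_t ElementSize() {
  assert(sizeof(T) == 2 || sizeof(T) == 4);  // 16-bit or 32-bit formats only
  return sizeof(T);
}

// Detection helper: true when ElementSize<T>() is a valid expression.
template <class T, class = void>
struct IsSupported : std::false_type {};

template <class T>
struct IsSupported<T, std::void_t<decltype(ElementSize<T>())>>
    : std::true_type {};

static_assert(IsSupported<float>::value, "float is accepted");
static_assert(IsSupported<float16_t>::value, "float16_t is accepted");
static_assert(!IsSupported<double>::value, "double is rejected");

}  // namespace sketch
```

Because the constraint lives in the template parameter list, an unsupported type is rejected during overload resolution instead of failing inside the function body. That is why adding one `std::is_same_v` term is enough to enable the existing generic spaces for the new type.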
