Commit 412cdf1
feat: distributed vector search via index segment selection (#24)
Exposes Lance's segment-model APIs through the C ABI so a distributed
query engine (Velox, Presto worker, etc.) can fan a single k-NN query
out across workers, each scanning a slice of the logical index's
physical segments. Tracks
[lance#6309](lance-format/lance#6309).
## Distributed query pattern
```
Coordinator                                  Worker(s)
─────────────────                            ───────────────
open dataset
list segments ──────── slice ──────────►     open same dataset
                                             scanner.nearest(q, k)
                                             scanner.index_segments(my_slice)
                                             return partial top-k stream
heap-merge partial top-k ◄─────────────────  (Velox top-k operator handles this)
```
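The coordinator-side merge step in the diagram can be sketched as follows. This is only an illustration of a bounded heap merge, not the Velox operator itself: the `Hit` struct and `merge_top_k` are hypothetical names, and the sketch assumes each worker returns (row id, distance) pairs where a smaller distance is better.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical result row: a row id plus its distance to the query vector.
struct Hit {
    uint64_t row_id;
    float distance;
};

// Merge per-worker partial top-k lists into one global top-k,
// ordered from smallest to largest distance.
std::vector<Hit> merge_top_k(const std::vector<std::vector<Hit>>& partials,
                             std::size_t k) {
    // Max-heap on distance: the top element is the worst of the current best k.
    auto closer = [](const Hit& a, const Hit& b) { return a.distance < b.distance; };
    std::priority_queue<Hit, std::vector<Hit>, decltype(closer)> heap(closer);

    for (const auto& part : partials) {
        for (const Hit& h : part) {
            if (heap.size() < k) {
                heap.push(h);
            } else if (k > 0 && h.distance < heap.top().distance) {
                heap.pop();      // evict the current worst candidate
                heap.push(h);
            }
        }
    }

    // Drain the heap (worst first), then reverse to get best first.
    std::vector<Hit> out;
    out.reserve(heap.size());
    while (!heap.empty()) {
        out.push_back(heap.top());
        heap.pop();
    }
    std::reverse(out.begin(), out.end());
    assert(out.size() <= k);
    return out;
}
```

Because every worker returns its own top-k over a disjoint set of segments, the global top-k is always contained in the union of the partial lists, so the coordinator never needs more than `k` rows from any worker.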
## Summary
**`lance_dataset_index_segment_count(ds, name)`** — number of physical
segments in a logical vector index. Returns 0 + `LANCE_ERR_NOT_FOUND`
for an unknown name.
**`lance_dataset_index_segments(ds, name, out_uuids)`** — fills a
caller-allocated buffer (`count * 16` bytes) with each segment's 16-byte
UUID (RFC 4122).
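To make the buffer layout concrete, here is a minimal sketch of how a caller might turn the filled `count * 16` byte buffer into typed UUIDs and print them. `SegmentUuid`, `unpack_uuids`, and `uuid_to_string` are hypothetical helpers, not part of this PR, and the sketch assumes the bytes are stored in the canonical RFC 4122 order.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

using SegmentUuid = std::array<uint8_t, 16>;

// Split a flat buffer of `count * 16` bytes into `count` 16-byte UUIDs.
std::vector<SegmentUuid> unpack_uuids(const uint8_t* buf, std::size_t count) {
    assert(buf != nullptr || count == 0);
    std::vector<SegmentUuid> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(out[i].data(), buf + i * 16, 16);
    }
    return out;
}

// Render a UUID in the canonical 8-4-4-4-12 lowercase hex form.
std::string uuid_to_string(const SegmentUuid& u) {
    static const char* hex = "0123456789abcdef";
    std::string s;
    s.reserve(36);
    for (std::size_t i = 0; i < 16; ++i) {
        if (i == 4 || i == 6 || i == 8 || i == 10) s.push_back('-');
        s.push_back(hex[u[i] >> 4]);
        s.push_back(hex[u[i] & 0x0f]);
    }
    return s;
}
```

In real use the buffer would be sized from `lance_dataset_index_segment_count()` before calling `lance_dataset_index_segments()`; the string form is handy mainly for logging which segments a worker was given.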
**`lance_scanner_set_index_segments(scanner, segment_uuids, len)`** —
restricts the next `lance_scanner_nearest()` query to a subset of
segments. `len=0` (any pointer) clears the restriction.
**C++ wrappers**:
- `Dataset::index_segment_count(name)` → `uint64_t`
- `Dataset::index_segments(name)` → `std::vector<std::array<uint8_t, 16>>`
- `Scanner::index_segments(uuids)`: fluent setter, with a typed-vector
overload and a raw `uint8_t*` + len overload
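One way a coordinator might divide the listed segments among workers is sketched below. The slicing policy (contiguous ranges whose sizes differ by at most one) is an assumption for illustration; this PR does not prescribe one. `slice_for_worker` and `pack_uuids` are hypothetical helpers, and `pack_uuids` assumes the raw overload expects UUIDs laid out back to back, 16 bytes each.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using SegmentUuid = std::array<uint8_t, 16>;

// Contiguous slice of `segments` for worker `worker` out of `num_workers`.
// Slice sizes differ by at most one, and together they cover every segment once.
std::vector<SegmentUuid> slice_for_worker(const std::vector<SegmentUuid>& segments,
                                          std::size_t worker,
                                          std::size_t num_workers) {
    assert(num_workers > 0 && worker < num_workers);
    const std::size_t n = segments.size();
    const auto begin = static_cast<std::ptrdiff_t>(n * worker / num_workers);
    const auto end = static_cast<std::ptrdiff_t>(n * (worker + 1) / num_workers);
    return std::vector<SegmentUuid>(segments.begin() + begin, segments.begin() + end);
}

// Flatten typed UUIDs into raw bytes, 16 per UUID (assumed layout of the raw overload).
std::vector<uint8_t> pack_uuids(const std::vector<SegmentUuid>& uuids) {
    std::vector<uint8_t> flat;
    flat.reserve(uuids.size() * 16);
    for (const auto& u : uuids) {
        flat.insert(flat.end(), u.begin(), u.end());
    }
    return flat;
}
```

Since `len=0` clears the restriction rather than selecting nothing, a worker that receives an empty slice should probably skip the scan entirely instead of calling the setter; otherwise it would search every segment and produce duplicate results at the merge step.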
## Lance dep bump
To get `Scanner::with_index_segments()` (merged in lance #6376) we bump
from crates.io `lance = "3.0.1"` to a `git+rev` pin at lance commit
`d630106d` (release tag `v5.0.0-beta.5`). beta.5 keeps arrow on 57.0.0, so
there is no transitive arrow churn. The `DatasetIndexExt` trait moved from
`lance_index` to `lance::index`; one import path was adjusted in
`src/index.rs`.
When lance publishes 5.0.0 stable, the git+rev can be replaced with the
version pin.
## Test plan
- [x] `cargo fmt` clean
- [x] `cargo clippy --all-targets -- -D warnings` clean
- [x] `cargo test` — **75 passed** (70 from main + 5 new)
- [x] `cargo test --test compile_and_run_test -- --ignored` — 2 passed
(C + C++ smoke)
New tests:
- `test_index_segment_count_and_list` — build IVF index, count = 1, list
returns a non-zero UUID.
- `test_index_segment_count_unknown_index` — unknown name → `NotFound`.
- `test_scanner_set_index_segments_with_listed_uuids` — end-to-end k=5
nearest restricted to listed segment UUID, returns 5 results.
- `test_scanner_set_index_segments_unknown_uuid` — bogus UUID is
accepted at setter time, but surfaces as an error at scan materialization time
with a message containing "segment".
- `test_scanner_set_index_segments_null_safety` — NULL scanner / NULL
pointer with len>0 / NULL with len=0 (clears).
## Follow-ups (not in this PR)
- Per-segment metadata: today we only expose UUID. A future pass could
add fragment_bitmap / dataset_version / num_indexed_rows so coordinators
can balance work by segment size.
- Distributed build: `commit_existing_index_segments()` and
`merge_existing_index_segments()` exist upstream — they'd let workers
each train one segment and the coordinator commit them atomically.
- Once lance publishes 5.0.0 stable, replace the git+rev pin with a
version pin.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent ad8079a commit 412cdf1
8 files changed
Lines changed: 2215 additions & 622 deletions
File tree
- include/lance
- src
- tests
- cpp