Description
Meta issue for all things related to optimizing memory usage for vector search (in particular quantization and HNSW), with the goal of choosing defaults that give the best out-of-the-box experience.
At a high level, HNSW works best when "everything" is in memory. Quantization helps because it significantly reduces the size of the vectors that must be kept there. When using HNSW with quantization we want to offer the best out-of-the-box experience, so the implementation should be aligned to make the best use of memory.
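To make the trade-off concrete, here is a rough back-of-envelope (the numbers are illustrative assumptions, not from this issue): 1M 768-dimensional float32 vectors, BBQ at roughly 1 bit per dimension, and an HNSW graph with m = 16 (about 32 neighbours per node at the bottom layer):

$$
\begin{aligned}
\text{raw float32 vectors} &: 10^6 \times 768 \times 4\,\text{B} \approx 3.1\,\text{GB} \\
\text{BBQ vectors } (\sim 1\text{ bit/dim}) &: 10^6 \times 768 / 8\,\text{B} \approx 96\,\text{MB (plus small per-vector corrections)} \\
\text{HNSW graph } (m = 16) &: 10^6 \times 32 \times 4\,\text{B} \approx 128\,\text{MB (bottom layer alone)}
\end{aligned}
$$

Under these assumptions the graph is comparable in size to the quantized vectors themselves, which is why graph residency matters as much as vector residency.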
The general idea is to use the page cache as efficiently as possible, and avoid critical data structures being unnecessarily paged out. Specifically: 1) preload the HNSW graph (since not having it in memory results in very poor performance) and offer insight into its residency, and 2) avoid perturbing the page cache when rescoring.
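As a minimal sketch of the first point (illustrative only; Lucene's MMapDirectory already supports preloading, and the file name here is hypothetical), mapping a graph file and forcing it into the page cache could look like:

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Illustrative sketch: warm a file (e.g. an HNSW graph segment file) into the
 * page cache and report a residency hint. Not the actual implementation.
 */
public class PreloadSketch {
    public static void main(String[] args) throws IOException {
        Path graphFile = Path.of(args[0]); // hypothetical path to a graph file
        try (FileChannel ch = FileChannel.open(graphFile, StandardOpenOption.READ)) {
            // A single MappedByteBuffer is limited to 2 GB; larger files would
            // need to be mapped in chunks.
            MappedByteBuffer mmap = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            mmap.load();     // touch every page to fault the file into memory
            // isLoaded() is only a hint, but it is the kind of residency
            // signal the statistics tasks below could build on.
            System.out.println("resident (hint): " + mmap.isLoaded());
        }
    }
}
```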
Tasks
- Use Direct I/O for rescoring in BBQ (see the sketch after this list)
- Preload the HNSW graph and quantized vectors for BBQ
- Add low-level statistics about the memory requirements and usage of vectors in the system #125681
- Enhance the API to raise the level at which preloading is configured, from file extensions to, say, dense vector index type
- Expose low-level statistics to inform scale-up / scale-down events.
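For the Direct I/O task, a minimal sketch of the mechanism (not the actual implementation), using the JDK's `ExtendedOpenOption.DIRECT`; the path and offset are hypothetical:

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Illustrative sketch: read a block of full-fidelity vectors with O_DIRECT so
 * the read bypasses the page cache instead of evicting hotter data such as
 * the HNSW graph or the quantized vectors.
 */
public class DirectIoSketch {
    public static void main(String[] args) throws IOException {
        Path vectorsFile = Path.of(args[0]); // hypothetical path to a vectors file
        int blockSize = Math.toIntExact(Files.getFileStore(vectorsFile).getBlockSize());
        try (FileChannel ch = FileChannel.open(vectorsFile,
                StandardOpenOption.READ, ExtendedOpenOption.DIRECT)) {
            // Direct I/O requires the buffer address, the file position, and
            // the transfer length to be aligned to the file store's block size.
            ByteBuffer buf = ByteBuffer.allocateDirect(blockSize * 2).alignedSlice(blockSize);
            buf.limit(blockSize);
            ch.read(buf, 0); // read one aligned block at an aligned offset
            buf.flip();
            System.out.println("read " + buf.remaining() + " bytes, bypassing the page cache");
        }
    }
}
```

The rationale: rescoring reads each candidate's full-fidelity vector once and does not benefit from caching it, so those reads should not perturb the pages holding the graph and quantized vectors.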
The focus of this issue is on solving the general case of HNSW and quantization, but we should keep the BBQ use case top of mind and not make it any more complex than it needs to be.