Optimise memory usage for vector search #125655

Open
@ChrisHegarty

Description

Meta issue tracking all work related to optimising memory usage for vector search (in particular quantization and HNSW), with the goal of choosing defaults that give the best out-of-the-box experience.

At a high level, HNSW works best when "everything" is in memory. Quantization helps here because it significantly reduces the size of the vectors. When combining HNSW with quantization we want to offer the best out-of-the-box experience, so the implementation should be aligned to make the best use of available memory.

The general idea is to use the page cache as efficiently as possible and to avoid unnecessarily causing critical data structures to be paged out. Specifically: 1) preload the HNSW graph (since not having it in memory results in very poor performance) and offer insight into its residency, and 2) avoid perturbing the page cache when rescoring.

Tasks

The focus of this issue is solving the general case of HNSW plus quantization, but we should keep the BBQ use case top of mind and not make it any more complex than it needs to be.
