Background/Motivation
The OpenSearch k-NN plugin supports building vector indexes using a GPU-accelerated remote index build service. (Reference blog post: https://opensearch.org/blog/GPU-Accelerated-Vector-Search-OpenSearch-New-Frontier/)
The remote index build service performs the following steps:
1. Download the vectors from remote object storage
2. Build the CAGRA graph using GPUs
3. Convert the CAGRA graph to an HNSW graph
4. Serialize the HNSW graph via faiss.write_index
5. Upload the serialized HNSW graph to remote object storage
Then, once the HNSW graph has been uploaded, the k-NN plugin can download this graph and search it, as if it had been built on a CPU.
One of the main limitations is the amount of CPU memory consumed during step 3. The converted HNSW object contains both the flat vector storage and the graph structure. The flat vector storage is loaded here (faiss/faiss/gpu/GpuIndexCagra.cu, line 480 at commit 9ea026c):
index->storage->add(n_train, train_dataset);
However, the k-NN plugin already has a copy of the vectors before it sends them to the remote index build service. In theory, the flat vector storage does not need to be loaded into the HNSW object at all; instead, the k-NN plugin can stitch the graph structure together with its own copy of the flat vectors at search time. With this approach, memory consumption on the remote build service is reduced by roughly 50%. Transfer performance also improves, since the HNSW index uploaded to remote object storage is much smaller.
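For illustration, here is a minimal sketch of what the search-side stitching could look like. It assumes the graph was serialized without storage, that the consumer still has the original vectors in the same order, and that an IndexFlat is an acceptable storage backend; the function and variable names are placeholders, and the exact read-side behavior for a storage-less file depends on the faiss version.

```cpp
#include <faiss/IndexFlat.h>
#include <faiss/IndexHNSW.h>
#include <faiss/index_io.h>

// Sketch only: load an HNSW graph that was serialized without its flat storage
// and re-attach a locally built flat index so the graph is searchable again.
// Assumes local_vectors are the same vectors (same order/ids) the graph was built from.
faiss::Index* load_graph_and_attach_storage(
        const char* graph_path,
        const float* local_vectors, // the consumer's own copy of the vectors
        faiss::idx_t n,
        int d) {
    // With storage skipped at write time, the reader leaves storage == nullptr.
    auto* hnsw = dynamic_cast<faiss::IndexHNSW*>(faiss::read_index(graph_path));
    // (error handling for a failed read/cast omitted for brevity)

    // Rebuild the flat storage from the local copy and hand it to the graph.
    auto* storage = new faiss::IndexFlatL2(d);
    storage->add(n, local_vectors);
    hnsw->storage = storage;
    hnsw->own_fields = true; // the HNSW index now owns and will delete the storage
    return hnsw;
}
```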
Proposal
There may be other use cases besides OpenSearch k-NN GPU acceleration that follow this architecture, where the storage is managed separately from the converted IndexHNSWCagra graph.
Thus, I propose changing the GpuIndexCagra::copyTo signature like so:
void copyTo(faiss::IndexHNSWCagra* index, bool skip_storage = false) const;
where, if skip_storage = true, the code skips adding the vector storage to the index. This is analogous to the existing faiss.write_index IO_FLAG_SKIP_STORAGE flag (faiss/faiss/impl/index_write.cpp, line 872 at commit 0d147a7):
if (io_flags & IO_FLAG_SKIP_STORAGE) {
When copyTo is called with skip_storage = true, and write_index is then called with IO_FLAG_SKIP_STORAGE set in io_flags, the serialized HNSW graph will not contain the flat vector storage. skip_storage defaults to false for backwards compatibility.
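To make the intended flow concrete, here is a rough build-service-side usage sketch. The skip_storage argument to copyTo is the proposed addition; the constructor arguments, metric, M value, and function/variable names are placeholders rather than part of the proposal.

```cpp
#include <faiss/IndexHNSW.h>
#include <faiss/MetricType.h>
#include <faiss/gpu/GpuIndexCagra.h>
#include <faiss/gpu/StandardGpuResources.h>
#include <faiss/index_io.h>

// Sketch only: build a CAGRA graph on the GPU, convert it to HNSW without
// materializing the flat vector storage, and serialize just the graph structure.
void build_and_serialize_graph_only(
        const float* vectors,
        faiss::idx_t n,
        int d,
        const char* out_path) {
    faiss::gpu::StandardGpuResources res;
    faiss::gpu::GpuIndexCagraConfig config; // defaults; tuning omitted

    faiss::gpu::GpuIndexCagra gpu_index(&res, d, faiss::METRIC_L2, config);
    gpu_index.train(n, vectors); // builds the CAGRA graph on the GPU

    faiss::IndexHNSWCagra cpu_index(d, /*M=*/16); // M is a placeholder value
    gpu_index.copyTo(&cpu_index, /*skip_storage=*/true); // proposed: copy the graph only

    // With the proposed flow, the file contains the HNSW graph but no flat vectors.
    faiss::write_index(&cpu_index, out_path, faiss::IO_FLAG_SKIP_STORAGE);
}
```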
I also propose changing the corresponding GpuIndexBinaryCagra::copyTo signature like so:
void copyTo(faiss::IndexBinaryHNSWCagra* index, bool skip_storage = false) const;
and adding support for io_flags in write_index_binary:
void write_index_binary(const IndexBinary* idx, const char* fname, int io_flags = 0);
void write_index_binary(const IndexBinary* idx, FILE* f, int io_flags = 0);
void write_index_binary(const IndexBinary* idx, IOWriter* writer, int io_flags = 0);
Within copyTo, I am thinking we could add a check for skip_storage = true before this line (faiss/faiss/gpu/GpuIndexCagra.cu, line 456 at commit 0d147a7):
if (numeric_type_ == NumericType::Float32) {
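For illustration, the shape of that guard might be something like the following. This is a fragment-level sketch of the change, not the actual implementation, and the surrounding copyTo code is elided.

```cpp
// Inside GpuIndexCagra::copyTo(faiss::IndexHNSWCagra* index, bool skip_storage) -- sketch only
if (!skip_storage) {
    if (numeric_type_ == NumericType::Float32) {
        // ... existing float path, including
        //     index->storage->add(n_train, train_dataset);
    }
    // ... existing paths for the other numeric types
}
// the HNSW graph structure itself is still copied regardless of skip_storage
```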
And then in write_index, we need to make sure the io_flags get passed along in the recursive call (faiss/faiss/impl/index_write.cpp, line 844 at commit 0d147a7):
write_index(idxmap->index, f);
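That is, the call would presumably become:

```cpp
// sketch: forward the caller's io_flags so IO_FLAG_SKIP_STORAGE reaches the wrapped inner index
write_index(idxmap->index, f, io_flags);
```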
For the binary case, the changes would be similar, and we'd also need to add support for reading the fourcc("null") header and returning nullptr, near this check (faiss/faiss/impl/index_read.cpp, line 2048 at commit 0d147a7):
if (h == fourcc("IBxF")) {
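Concretely, the binary read path would presumably need a case like the following near that check. This is a sketch, under the assumption that the binary writer emits the same fourcc("null") marker the float writer does when storage is skipped.

```cpp
// sketch: mirror the float read path -- a "null" header means the storage
// sub-index was skipped at write time, so return nullptr instead of failing
if (h == fourcc("null")) {
    return nullptr;
}
```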
I'm looking for feedback on this and would love to get a maintainer's opinion. If there is alignment on the interface changes and the general approach, I can go ahead and raise a PR.