Opening of vector files with ReadAdvice.RANDOM_PRELOAD #14348

Open
@viliam-durina

Description

Vector similarity search using HNSW accesses the vectors (the vec or veq files) very heavily during the search, even more than the HNSW graph itself (the vex file). If the vector files don't fit into the page cache, performance drops very significantly (around 100x in our particular case). Users typically configure their search servers with enough RAM to fit these files.

Lucene currently uses ReadAdvice.RANDOM when opening these files. I think it would be better to use ReadAdvice.RANDOM_PRELOAD, so that the files are loaded into the page cache when they are opened.
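
To illustrate what preloading means at the OS level, here is a minimal sketch using only the JDK. It is not Lucene's code: `PreloadSketch` and `openMapped` are hypothetical names, and the assumption is that preloading a memory-mapped file amounts to asking the OS to bring its pages into memory at open time, rather than faulting them in one by one during the first searches.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PreloadSketch {

    // Maps a file read-only. With preload == true, asks the OS to load the
    // mapped pages into physical memory right away (best effort only).
    // Note: a single mapping via FileChannel.map is limited to 2 GiB, so a
    // real implementation would have to map large files in chunks.
    static MappedByteBuffer openMapped(Path file, boolean preload) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            if (preload) {
                // Touches the pages up front; the OS may still evict them later.
                buf.load();
            }
            // The mapping stays valid after the channel is closed.
            return buf;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a vector data file.
        Path file = Files.createTempFile("vectors", ".bin");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[4096]);

        MappedByteBuffer vectors = openMapped(file, true);
        System.out.println("mapped bytes: " + vectors.capacity());
    }
}
```

Preloading only helps when the files actually fit into RAM. Otherwise the load evicts other pages and the random accesses still go to disk.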

If you agree, I can provide a PR.

Version and environment details

No response
