
Benchmarking Filtered Approximate Nearest Neighbor Search Algorithms on Transformer-based Embedding Vectors

This repository accompanies the paper "Benchmarking Filtered Approximate Nearest Neighbor Search Algorithms on Transformer-based Embedding Vectors", which presents a comprehensive evaluation of modern filtered approximate nearest neighbor search (FANNS) methods under EM, R, and EMIS filtering.

Reproducing Results on a Local Machine

1. Clone the Repository

git clone https://github.com/spcl/fanns-benchmark.git
cd fanns-benchmark

2. Select Dataset Scale

Open experiment_arxiv_dataset.py and edit line 17 to select the desired dataset scale:

  • small (1k database items, 10k queries)
  • medium (100k database items, 10k queries)
  • large (~2.7M database items, 10k queries)
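For example, if line 17 assigns the scale to a variable, the edit might look like the following sketch (the variable name below is hypothetical; check the file for the actual one):

dataset_size = "medium"  # hypothetical name; one of "small", "medium", "large"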

3. Start Docker Daemon

sudo systemctl start docker

4. Build Docker Image

sudo docker build -t <name_of_image> .

5. Run the Container

sudo docker run -v $(pwd):/workspace/fanns_benchmark -it <name_of_image>
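For example, using fanns-benchmark as the image name in steps 4 and 5 (any name works, as long as the same name is used in both commands):

sudo docker build -t fanns-benchmark .
sudo docker run -v $(pwd):/workspace/fanns_benchmark -it fanns-benchmark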

6. Run Experiments Inside the Container

  • Run all experiments and all algorithms:

    python3 experiment_arxiv_dataset.py
  • Run a specific experiment:

    Possible experiments: arxiv_em, arxiv_r, and arxiv_emis

    python3 experiment_arxiv_dataset.py <experiment>
  • Run a specific experiment with a specific algorithm:

    Supported algorithms per experiment:

    arxiv_em: ACORN, CAPS-kmeans, FDANN-stitched, NHQ-kgraph, NHQ-nsw, and UNG
    arxiv_r: ACORN and SeRF
    arxiv_emis: ACORN, FDANN-stitched, and UNG

    python3 experiment_arxiv_dataset.py <experiment> <algorithm>
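For example, to benchmark ACORN under EM filtering (assuming the algorithm names listed above are passed verbatim on the command line):

python3 experiment_arxiv_dataset.py arxiv_em ACORN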

Each run:

  • Performs a parameter search (unless cached) and logs the search to parameters/parameter_search_log_<experiment>_<algorithm>.jsonl.
  • Caches best parameters in parameters/parameter_search_cache_<experiment>_<algorithm>.jsonl.
  • Stores benchmark results in results/<experiment>_<algorithm>.json.
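The log and cache files are in JSON-lines format, so they can be pretty-printed directly with the standard library's json.tool (requires Python 3.8+; the file name below assumes an arxiv_em run with ACORN, following the naming patterns above):

python3 -m json.tool --json-lines parameters/parameter_search_log_arxiv_em_ACORN.jsonl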

7. Plot Results

python3 plots.py

This script reads all available result files in the results/ folder and saves visualizations in the plots/ folder.

Reproducing Results on a Compute Cluster

To run experiments on a compute cluster, the setup must be adapted to the specific environment. In general, the Docker image needs to be integrated with the cluster’s container system (e.g., SARUS), and the experiments are executed the same way as in the local setup.

We provide an example job script, job_experiment_arxiv_dataset.sh, which is tailored to a SLURM-based cluster using the SARUS container runtime.

Make sure to adjust the script to your cluster, including but not limited to:

  • Partition name (#SBATCH --partition=..., line 23)
  • Node exclusions (#SBATCH --exclude=..., line 24)
  • Code directory (CODEDIR=..., line 31)
  • Cache directory (CACHEDIR=..., line 34)
  • Dataset directory (DATASETDIR=..., line 36)

Each experiment–algorithm pair can be launched individually using:

EXPERIMENT=<experiment> ALGORITHM=<algorithm> ./job_experiment_arxiv_dataset.sh

Refer to step 6 of the local setup for valid <experiment> and <algorithm> values.
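For example, to launch the range-filtering experiment with SeRF:

EXPERIMENT=arxiv_r ALGORITHM=SeRF ./job_experiment_arxiv_dataset.sh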

Logs, parameter caches, and results are stored in the same folders (parameters/, results/) as in the local setup. Result plots can be generated in the same way using:

python3 plots.py

Runtime Estimates

  • Medium-scale dataset:
    On a consumer-grade laptop with 4 physical cores (8 threads), running all experiments with all algorithms takes approximately 48 hours sequentially.

  • Large-scale dataset:
    On a compute node with 36 physical cores (72 threads), running the full benchmark sequentially takes roughly 2000 hours.
    This can be parallelized across experiments and algorithms if multiple nodes are available.
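Since each experiment-algorithm pair is independent, the pairs can be launched side by side when multiple nodes are available. A minimal sketch using the env-var invocation from the cluster setup (whether each launch returns immediately, e.g., by calling sbatch inside the job script, depends on how the script is adapted to your cluster):

# one launch per experiment-algorithm pair (pairs from step 6 of the local setup)
for pair in arxiv_em:ACORN arxiv_em:CAPS-kmeans arxiv_em:FDANN-stitched \
            arxiv_em:NHQ-kgraph arxiv_em:NHQ-nsw arxiv_em:UNG \
            arxiv_r:ACORN arxiv_r:SeRF \
            arxiv_emis:ACORN arxiv_emis:FDANN-stitched arxiv_emis:UNG; do
  EXPERIMENT="${pair%%:*}" ALGORITHM="${pair##*:}" ./job_experiment_arxiv_dataset.sh
done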

Our Results

We provide the results of our experiments on the medium-scale and large-scale datasets, including:

  • Best parameters found:
    parameters_medium/, parameters_large/

  • Benchmark results:
    results_medium/, results_large/

  • Generated plots:
    plots_medium/, plots_large/

Citation

If you find this repository useful, please consider citing our FANNS benchmarking paper:

@misc{iff2025fannsbenchmark,
      title={Benchmarking Filtered Approximate Nearest Neighbor Search Algorithms on Transformer-based Embedding Vectors}, 
      author={Patrick Iff and Paul Bruegger and Marcin Chrapek and Maciej Besta and Torsten Hoefler},
      year={2025},
      eprint={2507.21989},
      archivePrefix={arXiv},
      primaryClass={cs.DB},
      url={https://arxiv.org/abs/2507.21989}, 
}
