@tteofili tteofili commented Nov 3, 2025

This adds PanamaVector support for asymmetric quantization with 2-bit indexed vectors and 4-bit queries in DiskBBQ (see #136989). On the benchmark below, recall stays within 0.01 of the baseline while latency drops by roughly 2.4x to 6.3x, with QPS rising by a similar factor.
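For readers unfamiliar with asymmetric bit-level dot products, here is a minimal scalar sketch of the quantity such a kernel computes. It assumes a bit-plane ("striped") layout in which bit b of every component is packed into its own byte array; that layout, the class name `AsymmetricBitDot`, and both method names are illustrative assumptions and are not taken from this PR. A Panama version would compute the same sum with wide vector AND and bit-count operations instead of the byte-at-a-time inner loop.

```java
final class AsymmetricBitDot {

    // Hypothetical helper: packs bit b of every value into planes[b],
    // one bit per component, least significant bit first within each byte.
    static byte[][] toBitPlanes(int[] values, int bits) {
        int bytesPerPlane = (values.length + 7) / 8;
        byte[][] planes = new byte[bits][bytesPerPlane];
        for (int i = 0; i < values.length; i++) {
            for (int b = 0; b < bits; b++) {
                if (((values[i] >> b) & 1) != 0) {
                    planes[b][i >> 3] |= (byte) (1 << (i & 7));
                }
            }
        }
        return planes;
    }

    // Dot product of a 2-bit document vector and a 4-bit query vector
    // (any plane counts work): sum over plane pairs (d, q) of
    // 2^(d+q) * popcount(docPlane[d] AND queryPlane[q]).
    static long dot(byte[][] docPlanes, byte[][] queryPlanes) {
        long sum = 0;
        for (int d = 0; d < docPlanes.length; d++) {
            for (int q = 0; q < queryPlanes.length; q++) {
                long pop = 0;
                for (int k = 0; k < docPlanes[d].length; k++) {
                    pop += Integer.bitCount((docPlanes[d][k] & queryPlanes[q][k]) & 0xFF);
                }
                sum += pop << (d + q);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] doc = {3, 0, 1, 2, 3, 1, 0, 2, 1};   // 2-bit values in [0, 3]
        int[] query = {15, 7, 0, 9, 4, 1, 8, 2, 5}; // 4-bit values in [0, 15]
        long naive = 0;
        for (int i = 0; i < doc.length; i++) {
            naive += (long) doc[i] * query[i];
        }
        long planes = dot(toBitPlanes(doc, 2), toBitPlanes(query, 4));
        System.out.println("naive=" + naive + " bitPlanes=" + planes);
    }
}
```

The sketch only needs AND and population count in the inner loop, which is the kind of operation that maps well onto SIMD lanes; whether the PR uses this exact decomposition is not stated in the description.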

baseline (PanamaVector disabled for 2-bit index to 4-bit query asymmetric quantization)

index_name       index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count     QPS  recall    visited  filter_selectivity
---------------  ----------  -------------------  -----------  ----------------  -------------  ------  ------  ---------  ------------------  
wiki1024en.docs         ivf                 1.00         1.13              3.91           3.46  884.96    0.81   12966.31                1.00
wiki1024en.docs         ivf                 5.00         3.87             12.15           3.14  258.40    0.88   52983.88                1.00
wiki1024en.docs         ivf                10.00         8.42             24.31           2.89  118.76    0.88  102677.22                1.00
wiki1024en.docs         ivf                30.00        21.28             65.92           3.10   46.99    0.88  302296.72                1.00
wiki1024en.docs         ivf                50.00        34.13            107.93           3.16   29.30    0.88  501898.25                1.00
wiki1024en.docs         ivf                70.00        46.41            149.06           3.21   21.55    0.88  701366.05                1.00
wiki1024en.docs         ivf               100.00        65.58            210.46           3.21   15.25    0.88  998000.00                1.00

candidate

index_name       index_type  visit_percentage(%)  latency(ms)  net_cpu_time(ms)  avg_cpu_count      QPS  recall    visited  filter_selectivity
---------------  ----------  -------------------  -----------  ----------------  -------------  -------  ------  ---------  ------------------  
wiki1024en.docs         ivf                 1.00         0.47              1.87           3.98  2127.66    0.82   13224.37                1.00
wiki1024en.docs         ivf                 5.00         0.74              2.56           3.46  1351.35    0.87   52628.29                1.00
wiki1024en.docs         ivf                10.00         1.34              4.50           3.36   746.27    0.88  102770.24                1.00
wiki1024en.docs         ivf                30.00         3.39             11.78           3.47   294.99    0.88  302209.45                1.00
wiki1024en.docs         ivf                50.00         5.54             19.23           3.47   180.51    0.88  501835.49                1.00
wiki1024en.docs         ivf                70.00         8.10             26.84           3.31   123.46    0.89  701308.09                1.00
wiki1024en.docs         ivf               100.00        11.40             38.40           3.37    87.72    0.89  997999.00                1.00
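To make the comparison between the two tables easier to read, the following small sketch computes the per-row latency ratio (baseline divided by candidate). The latency values are copied from the tables above; the class name `SpeedupTable` and the method `speedups` are hypothetical and exist only for this illustration.

```java
final class SpeedupTable {

    // visit_percentage(%) and latency(ms) columns copied from the two tables above.
    static final double[] VISIT_PCT = {1, 5, 10, 30, 50, 70, 100};
    static final double[] BASELINE_LATENCY_MS = {1.13, 3.87, 8.42, 21.28, 34.13, 46.41, 65.58};
    static final double[] CANDIDATE_LATENCY_MS = {0.47, 0.74, 1.34, 3.39, 5.54, 8.10, 11.40};

    // Returns baseline[i] / candidate[i] for each row.
    static double[] speedups(double[] baseline, double[] candidate) {
        if (baseline.length != candidate.length) {
            throw new IllegalArgumentException("row counts differ");
        }
        double[] result = new double[baseline.length];
        for (int i = 0; i < baseline.length; i++) {
            result[i] = baseline[i] / candidate[i];
        }
        return result;
    }

    public static void main(String[] args) {
        double[] s = speedups(BASELINE_LATENCY_MS, CANDIDATE_LATENCY_MS);
        for (int i = 0; i < s.length; i++) {
            System.out.printf("visit %.0f%%: %.2fx lower latency%n", VISIT_PCT[i], s[i]);
        }
    }
}
```

The ratio is smallest at 1% visit percentage and larger for the rows that visit more vectors, which is consistent with scoring taking up a larger share of query time as more vectors are visited.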

this is on top of some changes in #137510

@tteofili tteofili changed the title DiskBBQ - Panama support for 2 bit index to int4 query asymmetric quantization #137510 DiskBBQ - Panama support for 2 bit index to 4 bit query asymmetric quantization #137510 Nov 3, 2025
@tteofili tteofili marked this pull request as ready for review November 3, 2025 15:40
@elasticsearchmachine

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Nov 3, 2025
@tteofili tteofili changed the title DiskBBQ - Panama support for 2 bit index to 4 bit query asymmetric quantization #137510 DiskBBQ - Panama support for 2 bit index to 4 bit query asymmetric quantization Nov 3, 2025

@john-wagster john-wagster left a comment

lgtm

@tteofili tteofili merged commit e5a7abf into elastic:main Nov 3, 2025
34 checks passed

Labels

>non-issue :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v9.3.0

4 participants