Integrating Lucene's Better Binary Quantization #2838
Conversation
@adityamachiroutu as mentioned offline, can you double check the oversampling factor of Faiss BQ with a debugger (check the value of firstPassK)? I believe it's 5x, not 3x, when the vector dimension is less than 1000 - https://github.com/opensearch-project/k-NN/blob/main/src/main/java/org/opensearch/knn/index/mapper/CompressionLevel.java#L119
@naveentatikonda it's 5 for dimension < 1000 and 3 for >= 1000, as you mentioned. I checked with a debugger last month. We also have a unit test confirming this: https://github.com/opensearch-project/k-NN/blob/b7fc5dd98072ea157a9b301e7f93d79e97[…]java/org/opensearch/knn/index/mapper/CompressionLevelTests.java
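For reference, a minimal sketch of that oversampling behaviour, not the plugin's actual code; the class and method names are hypothetical, and only the 5x/3x factors and the 1000-dimension threshold come from this thread:

```java
// Hypothetical sketch of the first-pass oversampling discussed above; only the
// 5x/3x factors and the 1000-dimension cutoff are taken from this thread.
public final class OversampleSketch {

    // 5x oversampling below 1000 dimensions, 3x at or above (32x compression defaults).
    static int oversampleFactor(int dimension) {
        return dimension < 1000 ? 5 : 3;
    }

    // Number of candidates fetched in the first (quantized) pass before rescoring,
    // i.e. the value referred to as firstPassK in the comment above.
    static int firstPassK(int k, int dimension) {
        return k * oversampleFactor(dimension);
    }

    public static void main(String[] args) {
        System.out.println(firstPassK(100, 768));   // 500: 5x for dimension < 1000
        System.out.println(firstPassK(100, 1536));  // 300: 3x for dimension >= 1000
    }
}
```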
A few small comments and questions, thanks Aditya!
CHANGELOG.md (outdated)
## [Unreleased 3.3](https://github.com/opensearch-project/k-NN/compare/main...HEAD)

### Features
* Integrated Lucene's better binary quantization [#2838](https://github.com/opensearch-project/k-NN/pull/2838)
Nit: present tense
    }
}

@AwaitsFix(bugUrl = "https://github.com/opensearch-project/k-NN/issues/2805")
Is this meant to be here?
if (engine == KNNEngine.LUCENE) {
    if (params != null && params.containsKey(METHOD_ENCODER_PARAMETER)) {
        KNNBBQVectorsFormatParams bbqParams = new KNNBBQVectorsFormatParams(params, defaultMaxConnections, defaultBeamWidth);
        if (bbqParams.validate(params)) {
Can we please add a debug log like the one below?
),
knnBBQVectorsFormatParams -> new Lucene102HnswBinaryQuantizedVectorsFormat(
    knnBBQVectorsFormatParams.getMaxConnections(),
    knnBBQVectorsFormatParams.getBeamWidth(),
Is there no constructor parameter for bits like above?
No, Lucene does not have that constructor.
 * Check if BBQ is enabled
 * @return true if BBQ is enabled, false otherwise
 */
public boolean isBBQEnabled() {
What's the purpose of this method?
MethodComponentContext methodComponentContext = resolvedKNNMethodContext.getMethodComponentContext();
MethodComponentContext encoderComponentContext = new MethodComponentContext(SQ_ENCODER.getName(), new HashMap<>());

String encoderName = (resolvedCompressionLevel == CompressionLevel.x32)
Nit: since there are multiple clauses, let's make this a single if check on x32 compression.
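A rough sketch of the suggested restructuring, folding the encoder choice into a single x32 check; the types below are local stand-ins for illustration, not the plugin's MethodComponentContext/CompressionLevel classes:

```java
// Illustrative stand-ins only; the idea is a single if check on x32 compression
// that picks the BBQ encoder, falling back to the scalar-quantized encoder otherwise.
final class EncoderSelectionSketch {
    enum CompressionLevel { x8, x16, x32 }

    static final String SQ_ENCODER_NAME = "sq";
    static final String BBQ_ENCODER_NAME = "binary"; // encoder name from the PR description

    static String resolveEncoderName(CompressionLevel resolvedCompressionLevel) {
        // Single check, as suggested in the review comment above.
        if (resolvedCompressionLevel == CompressionLevel.x32) {
            return BBQ_ENCODER_NAME;
        }
        return SQ_ENCODER_NAME;
    }
}
```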
 * is invalid.
 */
public RescoreContext getDefaultRescoreContext(Mode mode, int dimension, Version version) {
    // TODO move this to separate class called resolver to resolve rescore context
Does your PR address this TODO? If not, can we please leave it in?
);

// Invalid compression
expectThrows(
Let's keep this in and change it to x16 so it still throws, with a comment explaining that x32 was added in this PR.
@adityamachiroutu during your tests on different datasets, was rescoring added? Or was this just bare-bones testing of the quantization techniques?
Yes, during my tests rescoring was added. For consistency, the same oversampling factor (5x for vectors with < 1000 dimensions, 3x for vectors with >= 1000 dimensions) was kept for the Faiss and Lucene runs.
KNNBBQVectorsFormatParams bbqParams = new KNNBBQVectorsFormatParams(params, defaultMaxConnections, defaultBeamWidth);
if (bbqParams.validate(params)) {
    log.debug(
        "Initialize KNN vector format for field [{}] with params [{}] = \"{}\", [{}] = \"{}\"",
Will the log statement have BBQ in it? The idea is for the log to distinguish whether the BBQ or ScalarQuantized format was instantiated.
    return false;
}

if (!(params.get(METHOD_ENCODER_PARAMETER) instanceof MethodComponentContext)) {
Nit: use an == false check; it's standard across OpenSearch:
if ((params.get(METHOD_ENCODER_PARAMETER) instanceof MethodComponentContext) == false)
// Add new method signature with KNNEngine parameter
public RescoreContext getDefaultRescoreContext(Mode mode, int dimension, Version version, KNNEngine engine) {
    // TODO move this to separate class called resolver to resolve rescore context
    if (modesForRescore.contains(mode)) {
We would also need a version check similar to the one here - it could be a restore scenario across versions where the encoder is not supported, unless we backport it.
Thanks for reviewing; addressed your comments.
qa/restart-upgrade/build.gradle (outdated)
    knn_bwc_version.startsWith("2.15.")) {
    filter {
        excludeTestsMatching "org.opensearch.knn.bwc.IndexingIT.testKNNIndexLuceneQuantization"
        excludeTestsMatching "org.opensearch.knn.bwc.IndexingIT.testKNNIndexLuceneBBQ"
Why are you trying to exclude this test only until version 2.15? It won't work until version 3.2, right?
private static final int NUM_DOCS = 10;
private static int QUERY_COUNT = 0;

private static final String ALGO = "hnsw";
remove this constant
}

public void testKNNIndexLuceneBBQ() throws Exception {
    waitForClusterHealthGreen(NODES_BWC_CLUSTER);
We don't need to add the same condition to validate the BWC version twice in the if and else blocks; probably add it here, after the cluster is green:
if (!isBBQEncoderSupported(getBWCVersion())) {
    logger.info("Skipping testKNNIndexLuceneBBQ as BBQ encoder is not supported in version: {}", getBWCVersion());
    return;
}
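The isBBQEncoderSupported helper referenced above isn't shown in the quoted diff; a minimal sketch of what it could look like, assuming it simply gates on the BWC version (the 3.3.0 cutoff is taken from the version discussion later in this thread, and the helper name/signature are an assumption):

```java
import org.opensearch.Version;

// Hypothetical helper assumed by the snippet above; the name and the 3.3.0 cutoff
// come from this review thread, not from the merged implementation.
public final class BbqBwcSupportSketch {
    static boolean isBBQEncoderSupported(Version bwcVersion) {
        // Lucene BBQ is only available once the BWC cluster is on 3.3.0 or later.
        return bwcVersion.onOrAfter(Version.V_3_3_0);
    }
}
```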
private static final Set<VectorDataType> SUPPORTED_DATA_TYPES = ImmutableSet.of(VectorDataType.FLOAT);

private final static MethodComponent METHOD_COMPONENT = MethodComponent.Builder.builder(ENCODER_BBQ)
    .addSupportedDataTypes(SUPPORTED_DATA_TYPES)
As discussed offline in the past, can you add support for a bits parameter and set the default to 1 bit (32x compression), so that if Lucene supports 2 and 4 bits in the future we can use this parameter?
For the naming convention, please keep it consistent with Faiss BQ.
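A hedged sketch of what the requested bits parameter could look like; the parameter name, the default of 1 bit, and the 1-bit-only restriction follow the comment above, while the resolution logic itself is a simplified stand-in rather than the plugin's MethodComponent/Parameter API:

```java
import java.util.Map;

// Simplified stand-in showing a "bits" encoder parameter that defaults to 1
// (32x compression) and rejects other values until Lucene supports 2 and 4 bits.
final class BbqBitsParamSketch {
    static final String BITS_PARAM = "bits";
    static final int DEFAULT_BITS = 1;

    static int resolveBits(Map<String, Object> encoderParams) {
        Object raw = encoderParams.getOrDefault(BITS_PARAM, DEFAULT_BITS);
        int bits = ((Number) raw).intValue();
        if (bits != DEFAULT_BITS) {
            // Only 1 bit is supported today; widen this check if 2/4 bits land in Lucene.
            throw new IllegalArgumentException("Unsupported bits value for BBQ encoder: " + bits);
        }
        return bits;
    }
}
```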
@adityamachiroutu can you also share the benchmarking results that you have without rescoring? Also, please share your findings once you identify the hotspots behind this huge spike in latencies, especially with the Cohere datasets (4x to 12x higher compared to Faiss). Thanks!
@adityamachiroutu can you add the configuration used for running these tests, specifically m and ef_construction?
RescoreContext getDefaultRescoreContext(Mode mode, int dimension) {
    return getDefaultRescoreContext(mode, dimension, Version.CURRENT);
// Special handling for Lucene BBQ (x32 compression)
if (this == x32 && engine == KNNEngine.LUCENE && version.onOrAfter(Version.V_3_2_0)) {
Should be after V_3_3_0, since this was not added as part of 3.2.
no, version.onOrAfter(Version.V_3_3_0)
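For clarity, a sketch of the corrected guard; the enum and method are illustrative stand-ins, and only the combined condition (x32 compression, Lucene engine, index created on or after 3.3.0) comes from this discussion:

```java
import org.opensearch.Version;

// Illustrative stand-in; only the condition itself reflects the review discussion.
final class BbqRescoreGuardSketch {
    enum Engine { LUCENE, FAISS }

    static boolean useBbqRescoreDefault(boolean isX32Compression, Engine engine, Version indexCreatedVersion) {
        // BBQ-specific rescore defaults apply only for x32 compression on the Lucene
        // engine, and only for indices created on or after 3.3.0.
        return isX32Compression && engine == Engine.LUCENE && indexCreatedVersion.onOrAfter(Version.V_3_3_0);
    }
}
```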
Description
Integrates Lucene's Better Binary Quantization (BBQ) into the OpenSearch k-NN plugin, enabling a memory-efficient encoding option for high-dimensional vector search. Compared to Faiss binary quantization, BBQ achieves higher recall while maintaining low memory usage and integrating well with the existing query and rescoring pipelines.
This integration exposes BBQ through the Lucene engine with a new "encoder": "binary" parameter in index mappings, leveraging Lucene's native vector quantization and storage formats. BBQ improves recall on large datasets with a modest trade-off in latency and throughput, as demonstrated in the benchmarks below. Users can adjust the oversample factor in queries to balance recall and performance. By default, the oversample factor is 3x for vectors with 1000 or more dimensions and 5x otherwise, matching the Faiss 32x compression defaults.
Alternatively, users can enable BBQ by setting the engine to Lucene and the compression level to 32x, in which case the BBQ encoder is selected automatically.
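For illustration, a hypothetical mapping along the lines described above, printed from a Java text block; the nested encoder object and the parameter values are assumptions based on how existing k-NN encoders are configured, not the final documented API:

```java
// Hypothetical mapping sketch; field and parameter names are assumptions based on
// the description above and the existing k-NN mapping shape.
public final class BbqMappingSketch {
    public static void main(String[] args) {
        String mapping = """
            {
              "mappings": {
                "properties": {
                  "my_vector": {
                    "type": "knn_vector",
                    "dimension": 768,
                    "method": {
                      "name": "hnsw",
                      "engine": "lucene",
                      "space_type": "l2",
                      "parameters": {
                        "m": 16,
                        "ef_construction": 100,
                        "encoder": { "name": "binary" }
                      }
                    }
                  }
                }
              }
            }
            """;
        System.out.println(mapping);
    }
}
```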
Benchmarking Results:
No Rescoring Tests Comparing FAISS BQ to Lucene BBQ
Config:
1 shard, 1 segment, 10 indexing clients, k=100, ef_search: 100, ef_construction: 100
Single Node Tests Comparing FAISS BQ (on disk) to Lucene BBQ
Config: ef_search: 256, ef_construction: 256, m: 16, 1 shard, 1 client, 1 segment
Related Issues
Resolves #2805
Check List
Commits are signed per the DCO using --signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.