lucene_snapshot: Update to new Lucene 10.3 postings format #128240

ChrisHegarty · 2025-05-21T11:55:51Z

There are a few different things going on in this PR, all of which are required to get the lucene_snapshot branch building again. The most substantial is the update to the new Lucene 10.3 postings format, Lucene103PostingsFormat.

The Lucene90BlockTreeTermsWriter class is used in the implementation of the 10.1 postings codec in Lucene. With the new 10.3 postings format that class is no longer needed, so it has been moved to a test-only location, in order to support backward compatibility testing.

In Elasticsearch we were using Lucene90BlockTreeTermsWriter (from Lucene) directly, through our copy of the Lucene 9.0 postings format, namely ES812PostingsFormat. If you look at ES812PostingsFormat, you should see that the imports now show that we're using our own copy of Lucene90BlockTreeTermsWriter.

Additionally, changes are required because of the removal of deprecated methods in IOContext, as well as the override of hints for direct IO.
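
To illustrate the hint override, here is a minimal, self-contained sketch of one plausible reading of it (compare the commit "Always keep DirectIOHint in the hints"): when a caller replaces the hints on a context, the direct IO hint is carried over if it was already present. The `Hint` enum, the `Context` record, and `overrideHints` are hypothetical stand-ins invented for this example. They are not Lucene's IOContext API and not the code in this PR.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

final class DirectIoHintsSketch {

    /** Hypothetical stand-in for a file-open hint; not a Lucene type. */
    enum Hint { DIRECT_IO, RANDOM_ACCESS, SEQUENTIAL }

    /** Hypothetical stand-in for an IO context that carries an immutable set of hints. */
    record Context(Set<Hint> hints) {
        Context {
            hints = Set.copyOf(hints); // defensive, immutable copy
        }
    }

    /**
     * Returns a new context whose hints are the given replacements,
     * plus DIRECT_IO if the original context already carried it.
     */
    static Context overrideHints(Context original, Hint... replacements) {
        Set<Hint> merged = new LinkedHashSet<>(Arrays.asList(replacements));
        if (original.hints().contains(Hint.DIRECT_IO)) {
            merged.add(Hint.DIRECT_IO); // never drop the direct IO hint
        }
        return new Context(merged);
    }

    public static void main(String[] args) {
        Context direct = new Context(Set.of(Hint.DIRECT_IO, Hint.SEQUENTIAL));
        Context overridden = overrideHints(direct, Hint.RANDOM_ACCESS);
        // The old SEQUENTIAL hint is replaced, the DIRECT_IO hint survives.
        System.out.println(overridden.hints().contains(Hint.DIRECT_IO));
        System.out.println(overridden.hints().contains(Hint.SEQUENTIAL));
    }
}
```

The point of the sketch is only the merge rule: replacing hints should not silently turn a direct IO read into a regular one.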

elasticsearchmachine · 2025-05-21T11:56:15Z

Pinging @elastic/es-search (Team:Search)

server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java

server/src/test/java/org/elasticsearch/search/query/QueryPhaseTests.java

server/src/main/java/org/elasticsearch/index/codec/PerFieldFormatSupplier.java

...main/java/org/elasticsearch/index/codec/vectors/es818/DirectIOLucene99FlatVectorsFormat.java

javanna

I left some comments, thanks @ChrisHegarty !

server/src/main/java/module-info.java

server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java

server/src/test/java/org/elasticsearch/search/query/QueryPhaseTests.java

javanna · 2025-05-21T13:26:41Z

server/src/test/java/org/elasticsearch/search/query/QueryPhaseTests.java

@@ -817,6 +817,13 @@ public void testNumericSortOptimization() throws Exception {

        Query q = LongPoint.newRangeQuery(fieldNameLong, startLongValue, startLongValue + numDocs);

+        // 0. test assertion - the query rewritten to a match all - https://github.com/apache/lucene/pull/14609/
+        // TODO: reflow total hits expectations


Does this refer to the other TODO below, or is it something else?

server/src/main/java/org/elasticsearch/index/codec/postings/Lucene90BlockTreeTermsWriter.java

server/src/main/java/org/elasticsearch/index/codec/postings/CompressionAlgorithm.java

...main/java/org/elasticsearch/index/codec/vectors/es818/DirectIOLucene99FlatVectorsFormat.java

server/src/main/java/org/elasticsearch/index/codec/PerFieldFormatSupplier.java

thecoop

Test comments can be addressed here, or after this is merged (to get the branch compiling again)

server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java

javanna

left a small comment, LGTM though

server/src/main/java/org/elasticsearch/index/codec/postings/Lucene90BlockTreeTermsWriter.java

ChrisHegarty · 2025-05-23T09:04:36Z

@elasticmachine run elasticsearch-ci/part-1

elasticsearchmachine · 2025-05-23T10:07:06Z

Pinging @elastic/es-storage-engine (Team:StorageEngine)

martijnvg

LGTM

server/src/main/java/org/elasticsearch/index/codec/Elasticsearch92Lucene103Codec.java

server/src/main/java/org/elasticsearch/index/codec/PerFieldFormatSupplier.java

ChrisHegarty · 2025-05-23T11:46:28Z

The CI is not completely green, but the failures seem unrelated. I'm going to merge this PR so that we can also sync main into the branch and unblock further changes.

lucene_snapshot: Update to new Lucene 10.3 postings format

60b9f59

ChrisHegarty added the >non-issue, :Search/Search, and Team:Search labels May 21, 2025

ChrisHegarty commented May 21, 2025

server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java

ChrisHegarty commented May 21, 2025

server/src/test/java/org/elasticsearch/search/query/QueryPhaseTests.java

ChrisHegarty commented May 21, 2025

server/src/main/java/org/elasticsearch/index/codec/PerFieldFormatSupplier.java

thecoop reviewed May 21, 2025

...main/java/org/elasticsearch/index/codec/vectors/es818/DirectIOLucene99FlatVectorsFormat.java

javanna reviewed May 21, 2025

ChrisHegarty added 2 commits May 21, 2025 17:21

itr

4e691db

fix hints

efffda5

thecoop reviewed May 22, 2025

...main/java/org/elasticsearch/index/codec/vectors/es818/DirectIOLucene99FlatVectorsFormat.java

ChrisHegarty and others added 7 commits May 22, 2025 12:35

Merge branch 'lucene_snapshot' into new_103_postings_format

288f4f6

fix direct io hints - again

1478c8a

rename codec

351ba80

Always keep DirectIOHint in the hints

737d802

remove compression and constant copied code

12d25c9

Use the new Elasticsearch92Lucene103Codec

4c2c3d9

fix default postings format - aligned with changes in main

467de1b

ChrisHegarty commented May 22, 2025

server/src/main/java/org/elasticsearch/index/codec/PerFieldFormatSupplier.java

thecoop approved these changes May 22, 2025

fix test

e5a37db

ChrisHegarty commented May 22, 2025

server/src/test/java/org/elasticsearch/index/mapper/DateFieldTypeTests.java

fix test

f1c4a51

javanna approved these changes May 23, 2025

server/src/main/java/org/elasticsearch/index/codec/postings/Lucene90BlockTreeTermsWriter.java

comment

afa5b34

martijnvg added the :StorageEngine/Codec label May 23, 2025

elasticsearchmachine added the Team:StorageEngine label May 23, 2025

martijnvg approved these changes May 23, 2025

server/src/main/java/org/elasticsearch/index/codec/Elasticsearch92Lucene103Codec.java

server/src/main/java/org/elasticsearch/index/codec/PerFieldFormatSupplier.java

ChrisHegarty added 2 commits May 23, 2025 11:26

another test fix

c363959

DEFAULT_POSTINGS_FORMAT

34e94cc

ChrisHegarty merged commit 9b12969 into elastic:lucene_snapshot May 23, 2025
13 of 19 checks passed

ChrisHegarty deleted the new_103_postings_format branch May 23, 2025 11:48

ChrisHegarty added the lucene_10_3_dev label May 29, 2025
