Member

@nik9000 nik9000 commented Oct 31, 2025

Adds special-purpose `BlockLoader` implementations for the `MV_MIN` and `MV_MAX` functions for `keyword` fields with doc values. These are a no-op for single-valued keywords but should be much faster for multivalued keywords.

These aren't plugged in yet. We can plug them in and performance test them in #137382. And they give us two more functions we can use to demonstrate #137382.
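
To illustrate why a dedicated loader can be cheaper for multivalued keywords, here is a minimal sketch and not the code in this PR. It assumes each document's ordinals arrive in ascending order and that ordinals follow the sort order of the terms, which is how Lucene's `SortedSetDocValues` behaves. Under that assumption the minimum is the first ordinal and the maximum is the last, so only one term per document needs to be looked up. The class `MvMinMaxSketch`, the `TERMS` array, and the use of `String` and `int[]` are hypothetical stand-ins for the real doc values types.

```java
import java.util.Arrays;

final class MvMinMaxSketch {
    // Hypothetical term dictionary: terms are sorted, and the array index is the ordinal.
    static final String[] TERMS = {"apple", "banana", "cherry", "date"};

    // MV_MIN: with ascending ordinals, the first ordinal is the smallest term.
    public static String mvMin(int[] sortedOrds) {
        if (sortedOrds.length == 0) {
            return null; // no values for this document
        }
        return TERMS[sortedOrds[0]];
    }

    // MV_MAX: with ascending ordinals, the last ordinal is the largest term.
    public static String mvMax(int[] sortedOrds) {
        if (sortedOrds.length == 0) {
            return null; // no values for this document
        }
        return TERMS[sortedOrds[sortedOrds.length - 1]];
    }

    public static void main(String[] args) {
        // Three hypothetical documents: single-valued, multivalued, and empty.
        int[][] docs = {{1}, {0, 2, 3}, {}};
        for (int[] ords : docs) {
            System.out.println(Arrays.toString(ords) + " -> min=" + mvMin(ords) + ", max=" + mvMax(ords));
        }
    }
}
```

For a single-valued document the first and last ordinal are the same, which matches the description's note that these loaders are a no-op in that case.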

@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) Oct 31, 2025
@nik9000 nik9000 requested a review from dnhatn October 31, 2025 17:45
Member

@dnhatn dnhatn left a comment

Looks great! Thank you, Nik!

if (docs.count() - offset == 1) {
    return readSingleDoc(factory, docs.get(offset));
}
try (var builder = factory.sortedSetOrdinalsBuilder(ordinals, docs.count() - offset)) {
Member

We should use singletonOrdinalsBuilder since each row should have at most one value. However, we are not using it because it requires SortedDocValues. This is not an issue here, but if this pattern recurs, we might consider relaxing singletonOrdinalsBuilder to accept an interface that supports looking up terms by ordinal.
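
One way to picture the relaxation suggested above is the sketch below. It is a hypothetical illustration, not the existing `singletonOrdinalsBuilder`: the names `SingletonOrdsSketch` and `TermByOrd` are invented here, and `String` and `int` stand in for the real term and ordinal types. The point is only that a builder holding at most one ordinal per row needs nothing from its source beyond a way to look up a term by ordinal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder that stores at most one ordinal per row; -1 marks a row with no value.
final class SingletonOrdsSketch {
    // Hypothetical: the only capability the builder needs from its source.
    public interface TermByOrd {
        String lookupOrd(int ord);
    }

    private final TermByOrd terms;
    private final int[] ords;
    private int count;

    public SingletonOrdsSketch(TermByOrd terms, int rows) {
        this.terms = terms;
        this.ords = new int[rows];
    }

    public SingletonOrdsSketch appendOrd(int ord) {
        ords[count++] = ord;
        return this;
    }

    public SingletonOrdsSketch appendNull() {
        ords[count++] = -1;
        return this;
    }

    // Resolves every stored ordinal to its term; rows without a value become null.
    public List<String> build() {
        List<String> out = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            out.add(ords[i] < 0 ? null : terms.lookupOrd(ords[i]));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] dictionary = {"apple", "banana", "cherry"};
        // Any source that can map an ordinal to a term will do, here a plain array.
        List<String> block = new SingletonOrdsSketch(ord -> dictionary[ord], 3)
            .appendOrd(2)
            .appendNull()
            .appendOrd(0)
            .build();
        System.out.println(block);
    }
}
```

Because the builder depends only on the lookup interface, a sorted-set source that has already been reduced to one ordinal per row could feed it without being wrapped as a different doc values type.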

Member Author

👍


@Override
protected AllReader sortedSetReader(SortedSetDocValues docValues) {
    return new MvMaxSortedSet(docValues);
Member

I considered wrapping SortedSetDocValues as SortedDocValues and then calling Singleton, but I think the approach in this PR is safer.

Member Author

It's just a little more paranoid, yeah.

/**
 * Loads {@code keyword} style fields that are stored as a lookup table.
 */
abstract class AbstractBytesRefsFromOrdsBlockLoader extends BlockDocValuesReader.DocValuesBlockLoader {
Member

Nice!


@Override
protected AllReader sortedReader(SortedNumericDocValues docValues) {
    return new MvMinSorted(docValues);
Member

nit: MvMaxSorted?

Member Author

👍

return "IntsFromDocValues[" + fieldName + "]";
}

public static class MvMinSorted extends BlockDocValuesReader {
Member

nit: MvMaxSorted?

Member Author

👍

@nik9000 nik9000 enabled auto-merge (squash) November 3, 2025 18:25
Labels

:Analytics/ES|QL (AKA ESQL), >non-issue, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v9.3.0
