Enable a doc values sparse index on the timestamp field #121673
    final IndexSortConfig indexSortConfig,
    final boolean hasDocValues
) {
    if (index.isConfigured()) {
This does not work when `index: true` is set explicitly, because `isSet` is true but `getValue() == getDefaultValue()`. As in the other PR, I think the implementation of `isConfigured` is not correct.
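To make the concern concrete, here is a minimal sketch of the two checks. `ParamSketch` is a simplified, hypothetical stand-in written for this comment, not the actual Elasticsearch parameter class, and the `isConfigured` semantics shown (explicitly set and different from the default) are an assumption based on the description above.

```java
import java.util.Objects;

// Hypothetical stand-in for a mapper parameter; NOT the Elasticsearch class.
final class ParamSketch<T> {
    private final T defaultValue;
    private T value;
    private boolean isSet;

    ParamSketch(T defaultValue) {
        this.defaultValue = defaultValue;
        this.value = defaultValue;
    }

    // Called when the mapping mentions the parameter explicitly.
    void setValue(T newValue) {
        this.value = newValue;
        this.isSet = true;
    }

    T getValue() {
        return value;
    }

    T getDefaultValue() {
        return defaultValue;
    }

    // True whenever the parameter appeared in the mapping at all.
    boolean isSet() {
        return isSet;
    }

    // Assumed semantics: explicitly set AND different from the default value.
    boolean isConfigured() {
        return isSet && Objects.equals(value, defaultValue) == false;
    }

    public static void main(String[] args) {
        ParamSketch<Boolean> index = new ParamSketch<>(true); // default: index = true
        index.setValue(true);                                 // mapping says "index": true
        System.out.println("isSet=" + index.isSet());               // isSet=true
        System.out.println("isConfigured=" + index.isConfigured()); // isConfigured=false
    }
}
```

Under these assumed semantics, an explicit `index: true` is indistinguishable from "not specified" when only `isConfigured()` is consulted.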
In the case of `@timestamp`, there are two possibilities for `isConfigured`:
- `isConfigured()` is `true`, which means the user explicitly set the `index` value to `true` or `false`. In that case, no matter the value of `index`, the sparse doc values index should be disabled and we would use the inverted index.
- `isConfigured()` is `false`: this means the `index` parameter is not set explicitly, which means that, for LogsDB, under certain conditions, we can use the sparse index instead of the inverted index.
Maybe we should then just check for `isSet()`?
If we use `isSet`, then the other test fails, for instance `testFieldTypeWithSkipDocValues_LogsDBMode`.
@@ -170,6 +185,9 @@ public void doValidate(MappingLookup lookup) {
    configuredSettings.remove("meta");
    configuredSettings.remove("format");
    configuredSettings.remove("locale");
    if (isIndexParamExplicitOverrideAllowed()) {
This is why I had to add the indexSettings
    && IndexMode.LOGSDB.equals(indexMode)
    && hasDocValues
    && indexSortConfig != null
    && indexSortConfig.hasPrimarySortOnField(DataStreamTimestampFieldMapper.DEFAULT_PATH)
I think we should also allow this when the timestamp is a secondary sort field?
Yeah, I left it like that for the moment because it was not really an issue with the tests I have right now. Anyway, I will replace that method with something like `isSortedOnTimestamp`.
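As a rough illustration of the difference between the current primary-sort check and a relaxed one, the sketch below models the index sort as a plain list of field names. Everything here is hypothetical: the class, both method bodies, and the example sort order are assumptions for illustration, and the real check would go through `IndexSortConfig` rather than a list.

```java
import java.util.List;

// Hypothetical sketch; the real code would consult IndexSortConfig, not a List.
final class TimestampSortSketch {
    // Assumed to match DataStreamTimestampFieldMapper.DEFAULT_PATH.
    static final String TIMESTAMP = "@timestamp";

    // Strict variant: the timestamp must be the first (primary) sort field.
    static boolean hasPrimarySortOnTimestamp(List<String> sortFields) {
        return sortFields.isEmpty() == false && TIMESTAMP.equals(sortFields.get(0));
    }

    // Relaxed variant: the timestamp may appear at any position in the sort.
    static boolean isSortedOnTimestamp(List<String> sortFields) {
        return sortFields.contains(TIMESTAMP);
    }

    public static void main(String[] args) {
        // Example sort with the timestamp as the secondary field.
        List<String> sort = List.of("host.name", "@timestamp");
        System.out.println(hasPrimarySortOnTimestamp(sort)); // false
        System.out.println(isSortedOnTimestamp(sort));       // true
    }
}
```

With a sort like the one in the example, only the relaxed variant would let the sparse index be enabled.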
Can we extend this so that this is also enabled for TSDB?
Can we extend this to also work for when
I think we can do this. The scope of this PR is to enable the sparse index in place of the indexed data structures, so that we can analyze the results in the elastic/logs nightly benchmark. Note that everything is behind a feature flag. We can do this in a follow-up after we have analyzed the nightly benchmark results. For TSDB, we will likely have a separate effort to check how using a sparse index affects querying (also on dimension fields).
This PR introduces support for a sparse doc values index for the `@timestamp` field in `DateFieldMapper` when specific conditions are met:
- [...] `@timestamp` and mapped as a date field.
- [...] `index: false`).

When all the conditions above hold true, we:
- [...] `@timestamp` field, dropping the inverted index in favor of the sparse doc values index.

Some queries might experience slower performance as a result of using a doc values sparse index instead of an inverted index.

Disabling the inverted index on the `@timestamp` field while enabling the sparse doc values index is expected to: