Description
Currently there is a feature flag that controls whether doc value skippers are enabled on the host.name and @timestamp fields in logsdb index mode, and on the _tsid and @timestamp fields in time series index mode.
Initial benchmark results showed that filtering on the @timestamp field sometimes became significantly slower (up to 3 times). This is because the BKD tree (points) was swapped for a doc value skipper, and the default query logic doesn't always perform well when the timestamp field is the secondary index sort field. The feature flag has been temporarily disabled.
This issue is about figuring out how to improve filtering by timestamp when doc value skippers are enabled. The performance of filtering by timestamp will likely not match what the BKD tree provides, but there may be ways to mitigate some of the performance drop.
Note that replacing the BKD tree with doc value skippers is a trade-off. By not storing the BKD tree, we reduce the storage and indexing footprint in exchange for slower timestamp filtering.
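To make the slowdown more concrete, here is a toy model of one possible intuition, not the actual Lucene implementation. It treats a skipper as a per-block [min, max] summary over doc IDs and counts how many blocks a timestamp range filter has to open. The class name, the block size, and the two synthetic layouts are all hypothetical and chosen only for illustration.

```java
/** Toy model (hypothetical): a skipper as one [min, max] summary per fixed-size block of doc IDs. */
final class SkipperSketch {

    /** Hypothetical block size; real skippers choose their own intervals. */
    static final int BLOCK_SIZE = 4;

    /** Counts the blocks whose [min, max] overlaps [lo, hi], i.e. blocks that cannot be skipped. */
    static int blocksVisited(long[] timestampsInDocOrder, long lo, long hi) {
        int visited = 0;
        for (int start = 0; start < timestampsInDocOrder.length; start += BLOCK_SIZE) {
            int end = Math.min(start + BLOCK_SIZE, timestampsInDocOrder.length);
            long min = Long.MAX_VALUE;
            long max = Long.MIN_VALUE;
            for (int i = start; i < end; i++) {
                min = Math.min(min, timestampsInDocOrder[i]);
                max = Math.max(max, timestampsInDocOrder[i]);
            }
            if (max >= lo && min <= hi) {
                visited++;
            }
        }
        return visited;
    }

    /** Doc order for an index sorted by (host, timestamp): every host contributes its own sorted run. */
    static long[] secondarySortLayout(int hosts, int docsPerHost) {
        long[] out = new long[hosts * docsPerHost];
        for (int h = 0; h < hosts; h++) {
            for (int t = 0; t < docsPerHost; t++) {
                out[h * docsPerHost + t] = t;
            }
        }
        return out;
    }

    /** Doc order for an index sorted by timestamp only: equal timestamps are adjacent. */
    static long[] primarySortLayout(int hosts, int docsPerHost) {
        long[] out = new long[hosts * docsPerHost];
        for (int i = 0; i < out.length; i++) {
            out[i] = i / hosts;
        }
        return out;
    }

    public static void main(String[] args) {
        long[] secondary = secondarySortLayout(8, 8);
        long[] primary = primarySortLayout(8, 8);
        System.out.println("secondary sort, blocks visited: " + blocksVisited(secondary, 3, 3));
        System.out.println("primary sort, blocks visited:   " + blocksVisited(primary, 3, 3));
    }
}
```

With 8 hosts and 8 timestamps per host, the range [3, 3] overlaps 8 blocks in the (host, timestamp) layout but only 2 in the timestamp-only layout, even though both contain the same 8 matching documents. The matching documents are spread across one small run per host, so far fewer blocks can be skipped.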
Tasks:
- Investigate special query logic for filtering by @timestamp for logsdb and time_series index modes (when the @timestamp field is always the secondary index sort field). (Re-enable use_doc_values_skipper and add specialized lucene query for @timestamp field filtering. #127260)
  - Build specialized Lucene range query for filtering by timestamp field, that is aware that the @timestamp field is always the secondary index sort field. See TimestampQuery.
  - Build specialized field comparator that makes use of doc value skippers. See TimestampComparator. This is required for queries that sort by timestamp field.
  - Make date_histogram aggregation (link) and filter-by-filter optimization (link) aware of doc value skippers.
  - ...
- Enable feature flag by default.
- Remove feature flag.
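
As a rough illustration of what "aware that @timestamp is the secondary index sort field" could mean for a range filter, the sketch below exploits the fact that timestamps are sorted within each run of equal primary sort values. It is not the TimestampQuery implementation. The class and method names are hypothetical, the run boundaries are assumed to be known up front, and timestamps are assumed to be ascending within a run (the real index modes may sort descending, which would flip the comparisons).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: range filtering over runs that are each sorted by timestamp. */
final class SecondarySortRangeSketch {

    /**
     * Returns half-open doc ID ranges {from, to} whose timestamps fall in [lo, hi].
     * runStarts[i] is the first doc ID of run i; the last entry equals the doc count.
     */
    static List<int[]> matchingRanges(long[] timestamps, int[] runStarts, long lo, long hi) {
        List<int[]> result = new ArrayList<>();
        for (int r = 0; r + 1 < runStarts.length; r++) {
            int runEnd = runStarts[r + 1];
            // Two binary searches per run replace a scan over every doc in the run.
            int from = firstIndexAtLeast(timestamps, runStarts[r], runEnd, lo);
            int to = firstIndexGreaterThan(timestamps, from, runEnd, hi);
            if (from < to) {
                result.add(new int[] {from, to});
            }
        }
        return result;
    }

    /** First index in [from, to) with value >= target, or 'to' if there is none. */
    static int firstIndexAtLeast(long[] a, int from, int to, long target) {
        while (from < to) {
            int mid = (from + to) >>> 1;
            if (a[mid] < target) {
                from = mid + 1;
            } else {
                to = mid;
            }
        }
        return from;
    }

    /** First index in [from, to) with value > target, or 'to' if there is none. */
    static int firstIndexGreaterThan(long[] a, int from, int to, long target) {
        while (from < to) {
            int mid = (from + to) >>> 1;
            if (a[mid] <= target) {
                from = mid + 1;
            } else {
                to = mid;
            }
        }
        return from;
    }

    public static void main(String[] args) {
        long[] timestamps = {1, 3, 5, 7, 2, 4, 6, 8}; // two runs of four docs each
        int[] runStarts = {0, 4, 8};
        for (int[] range : matchingRanges(timestamps, runStarts, 3, 6)) {
            System.out.println("[" + range[0] + ", " + range[1] + ")");
        }
    }
}
```

In a real index the run boundaries would have to be derived from the primary sort field (host.name or _tsid), and the lookups would go through doc values rather than an in-memory array, so this only shows the shape of the idea.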