Skip to content

Blogpost for star-tree index #3735

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
@@ -0,0 +1,101 @@
---
layout: post
title: "Super Fast Aggregations in OpenSearch with Star tree index"
authors:
- bharath-techie
- sandeshkr419
- backslasht
date: 2025-04-10
has_science_table: true
categories:
- technical-posts
meta_keywords: star tree,indexing,aggregations,materialized views,startree
meta_description: Learn how star-tree index improves the performance of aggregations in OpenSearch.
permalink: TODO
---
### Background
Aggregations in OpenSearch can help summarize and analyse the data in real time and are also used widely in OpenSearch Dashboards to create charts and visualization dashboards.

There are different types of aggregations such as :

* Metric aggregations - Calculate metrics such as sum, min, max, avg etc on numeric fields.

Check warning on line 21 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.LatinismsElimination] Using 'etc' is unnecessary. Remove. Raw Output: {"message": "[OpenSearch.LatinismsElimination] Using 'etc' is unnecessary. Remove.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 21, "column": 70}}}, "severity": "WARNING"}
* Bucket aggregations - Sort query results into groups based on some criteria. Eg: Date histograms

Aggregations present following performance challenges compared to standard queries:

* Scalability: Query latency directly depends on the number of matching documents
* Resource consumption: Consume more CPU cycles, memory and disk utilization

For example, in application logs:

* Aggregations on success status codes (high document count) have significantly higher latency
* Aggregations on failure status codes (low document count) gets processed much faster

| Query | No. of documents | Latency |
|-------|------------------|----------|
| Metric aggregations for status = 200 | 200 Mil documents | 4.2 seconds |
| Metric aggregations for status = 400 | 3000 documents | 5 milliseconds |

The latency can be reduced to constant time milliseconds with star-tree index, an experimental feature introduced in OpenSearch 2.18

### Star-tree index
Star-tree index (STIX) inspired from Apache Pinot’s star-tree index is the first multi-field index that improves the performance of aggregations in OpenSearch. STIX pre-computes the aggregations for the configured metrics for all combinations of the configured dimensions during indexing time.
![High Level Architecture](/assets/media/blog-images/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index/star-tree-structure.png)
STIX consists of star-tree which stores the unique values of the dimensions in the nodes and columnar doc_values data which stores the aggregations of configured metrics for the configured dimensions. The star-tree structure helps faster traversal for the aggregated values across different dimensions.
Refer to star-tree index [structure](https://opensearch.org/docs/latest/search-plugins/star-tree-index/#star-tree-index-structure) to know more about the internals of STIX.

### Why to use star-tree index
#### Predictable latency
Currently in OpenSearch, aggregation query latency scales linearly with the number of documents. STIX provides faster and more predictable query latency since it pre-computes the resultant aggregations.

| Query | No. of documents | Plain query latency | Star tree query latency |
|-------|------------------|---------------------|------------------------|
| Avg(duration) for status = 200 | 200 Mil documents | 4.2 seconds | 6.3 milliseconds |
| Avg(duration) for status = 400 | 3000 documents | 5 milliseconds | 6.5 milliseconds |

#### Multi-aggregations
If there are multiple aggregations on different fields, each field for each aggregation is processed separately and it leads to higher query latencies. STIX whereas with native multi-field support can especially be more performant in such scenarios since it does pre-processes the queries across different aggregations to avoid multiple star-tree traversals.

#### Faster Throughput
For slow complex operations like date-histograms with sub-aggregations on large datasets, the throughput and latency improvement is also quite significant as seen below.

| Query | No. of documents | Default query latency | Star tree query latency [N = 10000] | Default Query Throughput | Star Tree Query Throughput |

Check failure on line 62 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.TableHeadings] 'Default Query Throughput' is a table heading and should be in sentence case. Raw Output: {"message": "[OpenSearch.TableHeadings] 'Default Query Throughput' is a table heading and should be in sentence case.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 62, "column": 92}}}, "severity": "ERROR"}

Check failure on line 62 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.TableHeadings] 'Star Tree Query Throughput' is a table heading and should be in sentence case. Raw Output: {"message": "[OpenSearch.TableHeadings] 'Star Tree Query Throughput' is a table heading and should be in sentence case.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 62, "column": 119}}}, "severity": "ERROR"}
|-------|------------------|----------------------|-------------------------------------|------------------------|--------------------------|
| Yearly Date histogram on sum of tip amount for passenger count 1 | 120 Mil documents | 13 seconds | 94 miliseconds | 0.08 | 2.01 |

Check failure on line 64 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: miliseconds. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: miliseconds. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 64, "column": 106}}}, "severity": "ERROR"}
| Yearly Date histogram on sum of tip amount for passenger count 1 to 2 | 140 Mil documents | 15 seconds | 114 miloseconds | 0.07 | 2.01 |

Check failure on line 65 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: miloseconds. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: miloseconds. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 65, "column": 112}}}, "severity": "ERROR"}
| Yearly Date histogram on sum of tip amount for passenger count 1 to 5 | 160 Mil documents | 17 seconds | 160 miliseconds | 0.06 | 2.01 |

Check failure on line 66 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.Spelling] Error: miliseconds. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks. Raw Output: {"message": "[OpenSearch.Spelling] Error: miliseconds. If you are referencing a setting, variable, format, function, or repository, surround it with tic marks.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 66, "column": 112}}}, "severity": "ERROR"}

#### Flexibility
STIX can be configured with various [parameters](https://opensearch.org/docs/latest/field-types/supported-field-types/star-tree/#star-tree-index-configuration-options) to tune for storage overhead or performance.
For example, `max_leaf_docs` parameter can help in tuning the storage/performance of STIX. Higher number leads to better storage efficiency but lower query performance and vice versa.

Check warning on line 70 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [OpenSearch.LatinismsSubstitution] Use 'the other way around' instead of 'vice versa'. Raw Output: {"message": "[OpenSearch.LatinismsSubstitution] Use 'the other way around' instead of 'vice versa'.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 70, "column": 173}}}, "severity": "WARNING"}

| Query | No. of documents | Default query latency | Star tree query latency [N = 100] | Star tree query latency [N = 10000] |
|-------|------------------|----------------------|-----------------------------------|-----------------------------------|
| Metric aggregation for status = 200 | 200 Mil documents | 4.2 seconds | 2.5 milliseconds | 6.3 milliseconds |
| Metric aggregation for status = 400 | 3000 documents | 5 milliseconds | 2.7 milliseconds | 8.5 milliseconds |

#### Real time data
Features such as Index roll-up and transform lets users create a summarized view of users' data centered around certain fields but these solutions are not real time and affects live indexing performance.

STIX is completely real time, has low indexing overhead and queries work seamlessly without any user intervention to modify their queries. OpenSearch internally identifies which queries can be resolved faster using star-tree and

### Limitations
Even though STIX provides multiple benefits, it has it's limitations, they are:

* A star-tree index should only be enabled on indexes whose data is not updated or deleted because updates and deletions are not accounted presently in a star-tree index.
* Since STIX is created in real time as part of regular refresh/flush/merge flows of indexing, there is a cost attached to the write throughput of indexing. [ Benchmark numbers coming soon ]
* After a star-tree index is enabled, it cannot be disabled. In order to disable a star-tree index, the data in the index must be re-indexed without the star-tree mapping. Furthermore, changing a star-tree configuration will also require a re-index operation. However, search using star-tree can be disabled by setting `indices.composite_index.star_tree.enabled` to `false`.

Refer to [limitations](https://opensearch.org/docs/latest/search-plugins/star-tree-index/#limitations) to know more.

### How to use star-tree index
* You have to specify the STIX [mapping](https://opensearch.org/docs/latest/field-types/supported-field-types/star-tree/#star-tree-index-mappings) during index creation based on the queries and aggregations they’d like to optimize. Refer to [documentation](https://opensearch.org/docs/latest/field-types/supported-field-types/star-tree/) for detailed information.
* No changes are required in the search query syntax or the request parameters. Currently, only certain [aggregation](https://opensearch.org/docs/latest/search-plugins/star-tree-index/#supported-queries-and-aggregations) queries are resolved using star-tree index as of 2.19. OpenSearch internally identifies which queries can be resolved faster using STIX in real-time.

Apart from configuration during index creation, there is no additional maintenance or changes needed for STIX . Refer to [Search Plan](https://github.com/opensearch-project/OpenSearch/issues/15257) to see which aggregations will be supported in future.

### Conclusion
STIX can be used to massively improve the performance of aggregations as seen above.
Support for more aggregations and queries such as boolean query, date range query, nested aggregations and terms aggregations are coming very soon.

Check failure on line 99 in _posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md

View workflow job for this annotation

GitHub Actions / style-job

[vale] reported by reviewdog 🐶 [Vale.Terms] Use 'Boolean' instead of 'boolean'. Raw Output: {"message": "[Vale.Terms] Use 'Boolean' instead of 'boolean'.", "location": {"path": "_posts/2025-04-10-Super-Fast-Aggregations-in-OpenSearch-with-Star-tree-index.md", "range": {"start": {"line": 99, "column": 51}}}, "severity": "ERROR"}


Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading