_posts/2025-03-31-zscore-hybrid-search.md
+29-23
@@ -4,6 +4,7 @@ title: Z Score Normalization Technique for Hybrid Search
4
4
authors:
5
5
- kazabdu
6
6
- gaievski
7
+
- minalsha
7
8
date: 2025-03-31
8
9
has_science_table: true
9
10
categories:
@@ -13,7 +14,7 @@ meta_description: Learn about z score normalization using the Neural Search plug
13
14
---
14
15
15
16
In the world of search engines and machine learning, data normalization plays a crucial role in ensuring fair and accurate comparisons between different features or scores.
16
-
Hybrid query uses multiple normalization techniques for preparing final results, main two types are score based normalization and rank base combination. In score base normalization, min-max normalization doesn’t work well with outliers (Outliers are data points that significantly differ from other observations in a dataset.
17
+
Hybrid queries use normalization techniques to prepare final results. The two main types are score-based normalization and rank-based combination. In score-based normalization, min-max normalization (the default normalization technique) doesn’t work well with outliers (Outliers are data points that differ significantly from other observations in a dataset.
17
18
In the context of normalization techniques like min-max scaling and z-score (standard score) normalization, outliers can have a substantial impact on the results). In this blog post, we introduce another normalization technique, z-score normalization, which was added in the OpenSearch 3.0.0-beta release.
18
19
Let's dive into what Z-score normalization is, why it's important, and how it's being used in OpenSearch.
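To make the outlier discussion above concrete, here is a minimal Python sketch of the two textbook formulas applied to one list of scores. The score values are hypothetical, the use of the population standard deviation is an assumption, and the Neural Search plugin's actual implementation may handle edge cases differently.

```python
from statistics import mean, pstdev


def min_max(scores):
    """Rescale scores linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption for this sketch: identical scores all map to 1.0.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def z_score(scores):
    """Express each score as the number of standard deviations from the mean."""
    mu = mean(scores)
    sigma = pstdev(scores)  # population standard deviation (an assumption)
    if sigma == 0:
        # Assumption for this sketch: identical scores all map to 0.0.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


# Hypothetical subquery scores: four similar values and one outlier.
scores = [1.0, 1.2, 1.4, 1.6, 25.0]

mm = min_max(scores)
zs = z_score(scores)

for raw, m, z in zip(scores, mm, zs):
    print(f"raw={raw:5.1f}  min-max={m:.3f}  z-score={z:+.3f}")
```

With min-max, the outlier alone defines the top of the range, so the four ordinary scores land within 0.025 of zero. With z-score, the output is not bounded to a fixed range: each value is measured against the mean and spread of the whole list, so the normalized scores have mean 0 and standard deviation 1.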
19
20
@@ -54,9 +55,6 @@ PUT /_search/pipeline/z_score-pipeline
54
55
"normalization-processor": {
55
56
"normalization": {
56
57
"technique": "z_score"
57
-
},
58
-
"combination": {
59
-
"technique": "arithmetic_mean"
60
58
}
61
59
}
62
60
}
@@ -85,7 +83,19 @@ POST my_index/_search?search_pipeline=z_score-pipeline
85
83
86
84
## Benchmarking Z Score performance
87
85
88
-
Benchmark experiments were conducted using an OpenSearch cluster consisting of a single r6g.8xlarge instance as the coordinator node, along with three r6g.8xlarge instances as data nodes. To assess Z Score’s performance comprehensively, we measured three key metrics across four distinct datasets. For information about the datasets used, see [Datasets](https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/).
86
+
Benchmark experiments were conducted using an OpenSearch cluster consisting of a single r6g.8xlarge instance as the coordinator node, along with three r6g.8xlarge instances as data nodes. To assess Z Score’s performance comprehensively, we measured two key metrics across four distinct datasets. For information about the datasets used, see [Datasets](https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/).
87
+
88
+
### Sample queries and passages
89
+
90
+
The following table provides sample queries and passages for each dataset.
91
+
92
+
|Dataset |Sample query |Sample passage |
93
+
|:--- |:--- |:--- |
94
+
|Scidocs |CFD Analysis of Convective Heat Transfer Coefficient on External Surfaces of Buildings |`This paper provides an overview of the application of CFD in building performance simulation for the outdoor environment, focused on four topics...`|
95
+
|FiQA |“Business day” and “due date” for bills |`I don't believe Saturday is a business day either. When I deposit a check at a bank's drive-in after 4pm Friday, the receipt tells me it will credit as if I deposited on Monday. If a business' computer doesn't adjust their billing to have a weekday due date ... `|
96
+
|nq |what is non controlling interest on balance sheet |`In accounting, minority interest (or non-controlling interest) is the portion of a subsidiary corporation's stock that is not owned by the parent corporation. The magnitude of the minority interest in the subsidiary company is generally less than 50% of outstanding shares, or the corporation would generally cease to be a subsidiary of the parent`|
97
+
|ArguAna |Poaching is becoming more advanced A stronger, militarised approach is needed as poaching is becoming ... |`Tougher protection of Africa’s nature reserves will only result in more bloodshed. Every time the military upgrade their weaponry, tactics and logistic, the poachers improve their own methods to counter ...`|
98
+
89
99
90
100
Search relevance was quantified using the industry-standard Normalized Discounted Cumulative Gain at rank 10 (NDCG@10). We also tracked system performance using search latency measurements. This setup provided a strong foundation for evaluating both search quality and operational efficiency.
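For readers unfamiliar with the metric, the following is a minimal sketch of how NDCG@10 can be computed for a single query. It assumes linear gain and a log2 rank discount, and the function names and relevance grades are illustrative. It is not the benchmark harness used for these experiments.

```python
import math


def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(ranked_relevances, all_relevances, k=10):
    """DCG of the returned ranking divided by the DCG of the ideal ranking.

    ranked_relevances: relevance grades of the returned documents, in rank order.
    all_relevances: grades of every judged document for the query.
    """
    ideal_dcg = dcg_at_k(sorted(all_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents were judged for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Hypothetical relevance grades for one query.
judged = [3, 2, 1]
print(ndcg_at_k([3, 2, 1], judged))  # ideal ordering
print(ndcg_at_k([1, 2, 3], judged))  # reversed ordering scores lower
```

A dataset-level NDCG@10 is typically the mean of this per-query value over all queries in the dataset.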
91
101
@@ -94,11 +104,11 @@ Search relevance was quantified using the industry-standard Normalized Discounte
94
104
95
105
|dataset |Hybrid (min max) |Hybrid (z score) |Percent diff |
96
106
|--- |--- |--- |--- |
97
-
|scidocs |0.1591 |0.1633 |+2..45% |
98
-
|fiqa |0.2747 |0.2768 |0.77% |
99
-
|nq |0.3665 |0.374 |2.05% |
100
-
|arguana |0.4507 |0.467 ||
101
-
|||Average |1.765|
107
+
|scidocs |0.1591 |0.1633 |+2.45% |
108
+
|fiqa |0.2747 |0.2768 |+0.77% |
109
+
|nq |0.3665 |0.374 |+2.05% |
110
+
|arguana |0.4507 |0.467 |+3.62%|
111
+
|||Average |2.22%|
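The arithmetic behind the updated table can be checked with a short Python sketch. It uses only the values listed in the table and assumes that "Percent diff" means the relative change from the min-max score to the z-score score. Recomputing a row from the rounded NDCG values shown here may not match the published figure to the last digit.

```python
# Percent differences as listed in the table above.
percent_diffs = {"scidocs": 2.45, "fiqa": 0.77, "nq": 2.05, "arguana": 3.62}

# The "Average" row is the mean of the four per-dataset values.
average = sum(percent_diffs.values()) / len(percent_diffs)
print(f"average: {average:.2f}%")

# Relative change for the arguana row, from the NDCG@10 values in the table.
min_max_ndcg, z_score_ndcg = 0.4507, 0.467
arguana_diff = (z_score_ndcg - min_max_ndcg) / min_max_ndcg * 100
print(f"arguana: {arguana_diff:+.2f}%")
```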
102
112
103
113
### Search latency
104
114
@@ -114,8 +124,8 @@ Our benchmark experiments highlight the following advantages and trade-offs of Z
114
124
115
125
**Search quality (measured using NDCG@10 across four datasets)**:
116
126
117
-
* Z-score normalization shows a modest improvement in search quality, with an average increase of 1.765% in NDCG@10 scores.
118
-
* This suggests that Z-score normalization may provide slightly better relevance in search results compared to min-max normalization.
127
+
* Z-score normalization shows a modest improvement in search quality, with an average increase of 2.22% in NDCG@10 scores.
128
+
* This suggests that Z-score normalization may provide slightly better relevance in search results compared to min-max normalization, the default technique.
119
129
120
130
121
131
**Latency impact**:
@@ -124,19 +134,15 @@ Our benchmark experiments highlight the following advantages and trade-offs of Z
124
134
125
135
|Latency percentile |Percent difference |
126
136
|--- |--- |
127
-
|
128
-
p50 |0.72% |
129
-
|--- |--- |
130
-
|
131
-
p90 |0.50% |
132
-
|
133
-
p99 |0.64% |
137
+
|p50 |0.72% |
138
+
|p90 |0.50% |
139
+
|p99 |0.64% |
134
140
135
141
* The positive percentages indicate that Z-score normalization has slightly higher latency compared to min-max normalization, but the differences are minimal (less than 1% on average).
136
142
137
143
**Trade-offs**:
138
144
139
-
* There's a slight trade-off between search quality and latency. Z-score normalization offers a small improvement in search relevance (1.765% increase in NDCG@10) at the cost of a marginal increase in latency (0.50% to 0.72% across different percentiles).
145
+
* There's a slight trade-off between search quality and latency. Z-score normalization offers an improvement in search relevance (2.22% increase in NDCG@10) at the cost of a marginal increase in latency (0.50% to 0.72% across different percentiles).
140
146
141
147
**Consistency**:
142
148
@@ -147,19 +153,19 @@ p99 |0.64% |
147
153
**Overall assessment**:
148
154
149
155
* Z-score normalization provides a modest improvement in search quality with a negligible impact on latency.
150
-
* The choice between Z-score and min-max normalization may depend on specific use cases, with Z-score potentially being preferred when even small improvements in search relevance are valuable and the slight latency increase is acceptable.
156
+
* The choice between Z-score and min-max normalization may depend on specific use cases, with Z-score potentially being preferred when improvements in search relevance are valuable and the slight latency increase is acceptable.
151
157
152
158
These findings suggest that Z-score normalization could be a viable alternative to min-max normalization in hybrid search approaches, particularly in scenarios where optimizing search relevance is a priority and the system can tolerate minimal latency increases.
153
159
154
160
155
161
156
162
## What’s next?
157
163
158
-
We are also expanding OpenSearch’s hybrid search capabilities beyond z score by planning the following improvements to our normalization framework:
164
+
We are also expanding OpenSearch’s hybrid search capabilities beyond z score by planning the following enhancements to our normalization framework:
159
165
160
166
**Custom normalization functions**: Enables you to define your own normalization logic and allows fine-tuning of search result rankings. For more information, see [this issue](https://github.com/opensearch-project/neural-search/issues/994).
161
167
162
-
These improvements will provide more control over search result ranking while ensuring reliable and consistent hybrid search outcomes. Stay tuned for more information!
168
+
These enhancements will provide more control over search result ranking while ensuring reliable and consistent hybrid search outcomes. Stay tuned for more information!