Add streaming search with configurable scoring modes #19176

atris · 2025-08-28T21:59:37Z

Summary

Implement streaming search on the coordinator, emitting early partial results from the query phase with optional scoring. The change introduces request flags and a mode selector, integrates streaming into the existing SearchAction path, and adds a reproducible TTFB benchmark. When streaming is not used, behavior is unchanged.

Motivation

• Reduce time-to-first-byte (TTFB) at the coordinator by not waiting for all shards to complete query phase before starting fetch-eligible work.
• Provide mode-specific controls for batching and scoring, with safe defaults.
• Keep backward compatibility on the transport wire and preserve REST semantics.

Design and Scope

• Request flags and mode
• SearchRequest gains version-gated fields (V_3_3_0): streamingScoring (boolean) and streamingSearchMode (string).
• REST: stream=true enables streaming; optional stream_scoring_mode and streaming_mode select behavior.
• No change to default behavior; streaming is opt‑in.
• Coordinator streaming
• Streaming is integrated into TransportSearchAction (SearchAction). No separate transport action is required.
• SearchPhaseController.newSearchPhaseResults(...) returns either the existing QueryPhaseResultConsumer or a StreamQueryPhaseResultConsumer based on the request mode.
• StreamQueryPhaseResultConsumer controls partial reduce cadence via mode-specific multipliers and emits TopDocs-aware partials to the progress listener.
• Partial reduce notifications
• SearchProgressListener gains a TopDocs-aware hook with a compatibility fallback:
• onPartialReduceWithTopDocs(…) → defaults to onPartialReduce(…).
• notifyPartialReduceWithTopDocs(…) invokes the hook safely.
• Existing listeners are unaffected.
• Query execution
• For streaming queries, the QueryPhase routes to streaming collector contexts based on StreamingSearchMode:
• NO_SCORING: unsorted documents, fastest emission.
• SCORED_UNSORTED: scored documents without sort.
• SCORED_SORTED: scored, sorted via Lucene’s top-N collectors.
• CONFIDENCE_BASED: early emission guided by simple Hoeffding-style bounds.
• Collector batch size is bounded and read via SearchContext.getStreamingBatchSize(); partial batches are emitted to the stream channel when available.
• Transport integration
• Both the classic and stream transport handlers are registered:
• Classic: SearchTransportService.registerRequestHandler(…).
• Stream (if available): StreamSearchTransportService.registerStreamRequestHandler(…).
• The streaming transport path is selected only for streaming requests, and thread pools are chosen accordingly.
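
To illustrate the compatibility fallback described under "Partial reduce notifications", here is a minimal sketch, not the code from this PR. ProgressListenerSketch is a hypothetical, simplified stand-in for SearchProgressListener; the parameter lists are assumptions, and reading "invokes the hook safely" as "catches listener exceptions" is also an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for SearchProgressListener.
// Parameter lists are assumptions made for illustration only.
interface ProgressListenerSketch {
    // Existing hook that every listener already implements.
    void onPartialReduce(int shardsReduced, long totalHits);

    // New TopDocs-aware hook. The default body forwards to the existing hook,
    // so listeners written before the change keep compiling and running.
    default void onPartialReduceWithTopDocs(int shardsReduced, long totalHits, List<Integer> topDocIds) {
        onPartialReduce(shardsReduced, totalHits);
    }
}

final class ListenerFallbackSketch {
    // A listener that predates the change: it only knows the old hook.
    static final class LegacyListener implements ProgressListenerSketch {
        final List<String> events = new ArrayList<>();

        @Override
        public void onPartialReduce(int shardsReduced, long totalHits) {
            events.add("partial:" + shardsReduced + ":" + totalHits);
        }
    }

    // A streaming-aware listener that overrides the new hook.
    static final class TopDocsListener implements ProgressListenerSketch {
        final List<String> events = new ArrayList<>();

        @Override
        public void onPartialReduce(int shardsReduced, long totalHits) {
            events.add("partial:" + shardsReduced + ":" + totalHits);
        }

        @Override
        public void onPartialReduceWithTopDocs(int shardsReduced, long totalHits, List<Integer> topDocIds) {
            events.add("topdocs:" + topDocIds);
        }
    }

    // Assumed meaning of "invokes the hook safely": a failing listener must
    // not abort the reduce, so runtime exceptions are contained here.
    static void notifyPartialReduceWithTopDocs(
        ProgressListenerSketch listener, int shardsReduced, long totalHits, List<Integer> topDocIds) {
        try {
            listener.onPartialReduceWithTopDocs(shardsReduced, totalHits, topDocIds);
        } catch (RuntimeException e) {
            // A real implementation would log the failure; omitted in this sketch.
        }
    }
}
```

With this shape, a LegacyListener still receives its usual onPartialReduce callback when the coordinator calls the TopDocs-aware notifier, while a TopDocsListener sees the richer payload.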

Settings and Controls

• Dynamic cluster settings for streaming are added (StreamingSearchSettings, node-scoped, dynamic). Examples:
• search.streaming.batch_size
• Mode-specific reduce multipliers, emission interval, and minimal doc thresholds
• Circuit breaker and limits for buffering in streaming code paths
• Defaults are conservative. The feature remains opt-in via request flags; settings do not change behavior unless the request is streaming.

Wire Compatibility and API

• Transport wire BWC
• New SearchRequest and ShardSearchRequest fields are gated by Version.V_3_3_0 on read/write. Older peers neither write nor read these fields.
• Public API
• No breaking changes to REST endpoints.
• SearchProgressListener adds new methods with safe defaults; existing code continues to compile and run.
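
The version gating described above can be sketched with plain java.io streams. This is an illustration under stated assumptions, not the PR's serialization code: VersionGatedRequestSketch, the integer version id 30300 standing in for Version.V_3_3_0, and the legacy size field are all hypothetical; the real classes use OpenSearch's StreamInput/StreamOutput.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical request type; only the gating pattern is the point.
final class VersionGatedRequestSketch {
    // Assumed numeric id standing in for Version.V_3_3_0.
    static final int V_3_3_0 = 30300;

    final int size;                   // stands in for any pre-existing field
    final boolean streamingScoring;   // new field, gated
    final String streamingSearchMode; // new field, gated, may be null

    VersionGatedRequestSketch(int size, boolean streamingScoring, String streamingSearchMode) {
        this.size = size;
        this.streamingScoring = streamingScoring;
        this.streamingSearchMode = streamingSearchMode;
    }

    // Write the new fields only when the peer is new enough to read them.
    void writeTo(DataOutputStream out, int peerVersion) throws IOException {
        out.writeInt(size);
        if (peerVersion >= V_3_3_0) {
            out.writeBoolean(streamingScoring);
            out.writeBoolean(streamingSearchMode != null);
            if (streamingSearchMode != null) {
                out.writeUTF(streamingSearchMode);
            }
        }
    }

    // Read with the same gate; older peers yield the non-streaming defaults.
    static VersionGatedRequestSketch readFrom(DataInputStream in, int peerVersion) throws IOException {
        int size = in.readInt();
        boolean scoring = false;
        String mode = null;
        if (peerVersion >= V_3_3_0) {
            scoring = in.readBoolean();
            if (in.readBoolean()) {
                mode = in.readUTF();
            }
        }
        return new VersionGatedRequestSketch(size, scoring, mode);
    }

    // Serialize and deserialize as if talking to a peer on peerVersion.
    static VersionGatedRequestSketch roundTrip(VersionGatedRequestSketch request, int peerVersion) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                request.writeTo(out, peerVersion);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readFrom(in, peerVersion);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the read and write sides apply the same version check, a round trip against an older peer silently drops the streaming fields and falls back to the defaults, which matches the "behavior unchanged" claim for mixed-version clusters.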

Tests and Benchmark

• Unit tests:
• Stream consumer batch sizing and dynamic settings effects.
• Hoeffding bounds behavior.
• Integration tests:
• Basic streaming search workflows.
• Streaming aggregations with and without sub-aggregations.
• Mode coverage (NO_SCORING, SCORED_UNSORTED, SCORED_SORTED, CONFIDENCE_BASED).
• Benchmark:
• StreamingPerformanceBenchmarkTests: measures coordinator-side TTFB (time to first partial reduce) vs. classic full reduce for a large query.
• Logger-only reporting; no REST streaming is introduced.

Non-Goals / Limitations

• This change does not implement HTTP/REST streaming of partial responses.
• The SearchResponse partial/sequence metadata used internally by the streaming listener is not serialized on the wire and does not alter REST payloads.
• Confidence-based mode uses a conservative and simple bound; it is adequate for early gating but not a full ranking stability analysis.
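
For readers unfamiliar with the bound, the following is a minimal sketch of a Hoeffding-style gate, assuming scores are bounded within a known range. It uses the standard two-sided Hoeffding inequality, which gives ε = range · sqrt(ln(2/δ) / (2n)) for n samples and failure probability δ. HoeffdingSketch and its method names are hypothetical and are not the HoeffdingBounds class from this PR; how the PR applies the bound to emission is not shown here.

```java
// Hypothetical helper illustrating the textbook Hoeffding bound.
final class HoeffdingSketch {
    private HoeffdingSketch() {}

    // Half-width of the confidence interval around the sample mean of n
    // observations that each lie in an interval of width `range`, holding
    // with probability at least 1 - delta.
    static double epsilon(long n, double range, double delta) {
        if (n <= 0 || range <= 0 || delta <= 0 || delta >= 1) {
            throw new IllegalArgumentException("need n > 0, range > 0, 0 < delta < 1");
        }
        return range * Math.sqrt(Math.log(2.0 / delta) / (2.0 * n));
    }

    // Assumed gating rule for illustration: allow early emission once the
    // interval is no wider than the caller's tolerance.
    static boolean canEmit(long n, double range, double delta, double tolerance) {
        return n > 0 && epsilon(n, range, delta) <= tolerance;
    }
}
```

The interval shrinks with the square root of the sample count, so quadrupling the number of collected documents halves ε. That slow convergence is one reason such a bound suits coarse early gating better than a full ranking stability analysis.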

Backward Compatibility and Risk

• Default behavior unchanged unless streaming flags are provided.
• Wire BWC ensured via version gating; JApiCmp passes.
• Aggregation partial reductions are unaffected; for TopDocs partials we call the new TopDocs-aware hook, otherwise we continue to notify via the existing method.

Operational Notes

• Streaming is disabled by default and must be explicitly requested with stream=true (REST) or by setting SearchRequest flags programmatically.
• Mode selection allows tuning for latency vs. coordination cost.
• Dynamic settings enable safe runtime tuning if necessary.

If reviewers prefer, I can split the settings and the confidence-based collector into a follow-up to further reduce the initial surface.

github-actions · 2025-08-28T22:22:05Z

❌ Gradle check result for 5554606: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2025-08-29T02:31:11Z

❌ Gradle check result for 5554606: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2025-08-29T03:10:50Z

❌ Gradle check result for 5554606: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Introduces streaming search infrastructure that enables progressive emission of search results with three configurable scoring modes. The implementation extends the existing streaming transport layer to support partial result computation at the coordinator level.

Scoring modes:
- NO_SCORING: Immediate result emission without confidence requirements
- CONFIDENCE_BASED: Statistical emission using Hoeffding inequality bounds
- FULL_SCORING: Complete scoring before result emission

The implementation leverages OpenSearch's inter-node streaming capabilities to reduce query latency through early result emission. Partial reductions are triggered based on the selected scoring mode, with results accumulated at the coordinator before final response generation.

Key changes:
- Add HoeffdingBounds for statistical confidence calculation
- Extend QueryPhaseResultConsumer to support streaming reduction
- Add StreamingScoringCollector wrapping TopScoreDocCollector
- Integrate streaming scorer selection in QueryPhase
- Add REST parameter stream_scoring_mode for mode selection
- Include streaming metadata in SearchResponse

The current implementation operates within architectural constraints where streaming is limited to inter-node communication. Client-facing streaming will be addressed in a follow-up contribution.

Addresses opensearch-project#18725

Signed-off-by: Atri Sharma <[email protected]>

Signed-off-by: Atri Sharma <[email protected]>

github-actions · 2025-09-27T19:21:05Z

❌ Gradle check result for 6a4d92e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2025-09-27T19:35:41Z

❌ Gradle check result for 6a4d92e: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Atri Sharma <[email protected]>

github-actions · 2025-09-27T20:20:49Z

❌ Gradle check result for df7ad7b: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Atri Sharma <[email protected]>

github-actions · 2025-10-09T05:22:45Z

❌ Gradle check result for c084b56: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Atri Sharma <[email protected]>

github-actions · 2025-10-09T06:05:12Z

❌ Gradle check result for ad9c30d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

github-actions · 2025-10-09T08:11:03Z

❌ Gradle check result for ad9c30d: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

Signed-off-by: Atri Sharma <[email protected]>

github-actions · 2025-10-10T21:44:15Z

❌ Gradle check result for b4b16b0: null

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

rishabhmaurya · 2025-10-11T01:23:01Z

Thank you @atris for working on and owning this, as it is a big feature.

I have some very high-level comments:

This is a huge PR and a bit hard to review, so we need to think about splitting it. Maybe start with the first requirement, as mentioned here - [RFC] New search streaming API #18725 (comment)? That is where streaming search will be most useful. And start with FlushMode as PER_SEGMENT to simplify it further.
Did you get a chance to gather some benchmark numbers? This feature is mostly governed by two angles: 1) TTFB and 2) overall resource consumption per query when fetching all pages, compared to the traditional approach. Without these numbers it would be hard to justify this effort. Theoretically it should help with both, but some initial numbers while you work on cleaning up and refactoring the rest of the PR would be much appreciated.

Once you split the PR, it would be great if we clearly describe the changes introduced to make it easier for others to pitch in here.

We should open an RFC for client support too, as this feature is not useful without client support.

rishabhmaurya · 2025-10-11T01:31:57Z

server/src/main/java/org/opensearch/search/builder/StreamingSearchParameters.java

+ * Streaming search parameters for configuring progressive result emission.
+ * These parameters control how and when intermediate results are sent.
+ */
+public class StreamingSearchParameters implements Writeable, ToXContent {


We can get rid of most of these params if we split this PR and start with the simple case, i.e. no scoring or sorting.

rishabhmaurya · 2025-10-11T01:33:26Z

server/src/main/java/org/opensearch/search/internal/ShardSearchRequest.java

        keepAlive = in.readOptionalTimeValue();
        originalIndices = OriginalIndices.readOriginalIndices(in);
        assert keepAlive == null || readerId != null : "readerId: " + readerId + " keepAlive: " + keepAlive;
+        // Read streaming fields - gated on version for BWC


We already have isStreamingSearch(); why are we introducing a new one here?

rishabhmaurya · 2025-10-11T01:33:53Z

server/src/main/java/org/opensearch/search/query/BoundProvider.java

+import org.apache.lucene.search.TopDocs;
+import org.opensearch.common.annotation.ExperimentalApi;
+
+/**


let's move it to next PR

rishabhmaurya · 2025-10-11T01:36:27Z

server/src/main/java/org/opensearch/search/SearchService.java

                    ? (Exception) exception.getCause()
                    : new OpenSearchException(exception.getCause());
            }
+            if (isStreamSearch && logger.isTraceEnabled()) {


you don't have to check logger.isTraceEnabled()

rishabhmaurya · 2025-10-11T01:43:27Z

server/src/main/java/org/opensearch/search/query/StreamingUnsortedCollectorContext.java

+
+                    currentBatch.add(scoreDoc);
+                    if (currentBatch.size() >= batchSize) {
+                        emitCurrentBatch(false);


Can you check how we currently emit batches for aggregation cases?
Let's try to be consistent with it. If we want to improve the emission-handling logic, let's do it for both.
We also introduced FlushMode, which currently defaults to PER_SEGMENT. Maybe we can start with PER_SEGMENT and later make it intra-segment, i.e. based on batch size. Let me know what you think.
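
The difference between the two emission strategies discussed in this comment can be sketched as follows. This is an illustration only: BatchEmitterSketch is hypothetical, PER_SEGMENT is the mode named in the comment, and PER_BATCH is an invented label for the batch-size-driven (intra-segment) behavior seen in the diff above; the real FlushMode enum may define different constants.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical collector-side buffer contrasting two flush strategies.
final class BatchEmitterSketch {
    // PER_SEGMENT comes from the review comment; PER_BATCH is an assumed
    // name for flushing whenever the buffer reaches batchSize.
    enum FlushMode { PER_SEGMENT, PER_BATCH }

    private final FlushMode mode;
    private final int batchSize;
    private final Consumer<List<Integer>> sink;
    private final List<Integer> current = new ArrayList<>();

    BatchEmitterSketch(FlushMode mode, int batchSize, Consumer<List<Integer>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.mode = mode;
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called once per matching document.
    void collect(int docId) {
        current.add(docId);
        if (mode == FlushMode.PER_BATCH && current.size() >= batchSize) {
            flush();
        }
    }

    // Called when a segment has been fully collected.
    void finishSegment() {
        if (mode == FlushMode.PER_SEGMENT) {
            flush();
        }
    }

    // Called at the end of the shard-level query phase to emit any remainder.
    void finish() {
        flush();
    }

    private void flush() {
        if (!current.isEmpty()) {
            sink.accept(List.copyOf(current)); // copy so later clears do not affect emitted batches
            current.clear();
        }
    }
}
```

With two segments holding documents 1 to 3 and 4 to 5 and a batch size of 2, the per-segment mode emits one batch per segment, while the batch-size mode emits batches that can span segment boundaries.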

Signed-off-by: Atri Sharma <[email protected]>

github-actions · 2025-10-11T20:44:06Z

❌ Gradle check result for cbf228d: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

atris requested review from a team, Bukhtawar, CEHENKLE, Rishikesh1159, VachaShah, anasalkouz, andrross, ashking94, cwperks, dbwiddis, gbbafna, jed326, kotwanikunal, mch2, msfroh, owaiskazi19, reta, sachinpkale, saratvemulapalli, shwetathareja and sohami as code owners August 28, 2025 21:59

atris mentioned this pull request Aug 28, 2025

[DRAFT] Add streaming search with configurable scoring modes #19160

Closed

atris closed this Aug 29, 2025

atris reopened this Aug 29, 2025

atris closed this Aug 29, 2025

atris reopened this Aug 29, 2025

More cleanup

6a4d92e

Signed-off-by: Atri Sharma <[email protected]>

atris closed this Sep 27, 2025

github-project-automation bot moved this from In-Review to Done in Performance Roadmap Sep 27, 2025

atris reopened this Sep 27, 2025

github-project-automation bot moved this from Done to In Progress in Performance Roadmap Sep 27, 2025

Fix forbidden API issue

df7ad7b

Signed-off-by: Atri Sharma <[email protected]>

opensearch-ci-bot mentioned this pull request Sep 27, 2025

[AUTOCUT] Gradle Check Flaky Test Report for IngestFromKinesisIT #17678

Open

Merge branch 'main' into streaming-scoring-clean

c084b56

Signed-off-by: Atri Sharma <[email protected]>

Fix build issues

ad9c30d

Signed-off-by: Atri Sharma <[email protected]>

atris closed this Oct 9, 2025

github-project-automation bot moved this from In Progress to Done in Performance Roadmap Oct 9, 2025

atris reopened this Oct 9, 2025

github-project-automation bot moved this from Done to In Progress in Performance Roadmap Oct 9, 2025

More shenanigans

b4b16b0

Signed-off-by: Atri Sharma <[email protected]>

atris requested a review from peternied as a code owner October 10, 2025 19:41

rishabhmaurya reviewed Oct 11, 2025

Remove confidence based streaming

cbf228d

Signed-off-by: Atri Sharma <[email protected]>
