Add streaming search with configurable scoring modes #19176
❌ Gradle check result for 5554606: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Introduces streaming search infrastructure that enables progressive emission of search results with three configurable scoring modes. The implementation extends the existing streaming transport layer to support partial result computation at the coordinator level.

Scoring modes:
- NO_SCORING: Immediate result emission without confidence requirements
- CONFIDENCE_BASED: Statistical emission using Hoeffding inequality bounds
- FULL_SCORING: Complete scoring before result emission

The implementation leverages OpenSearch's inter-node streaming capabilities to reduce query latency through early result emission. Partial reductions are triggered based on the selected scoring mode, with results accumulated at the coordinator before final response generation.

Key changes:
- Add HoeffdingBounds for statistical confidence calculation
- Extend QueryPhaseResultConsumer to support streaming reduction
- Add StreamingScoringCollector wrapping TopScoreDocCollector
- Integrate streaming scorer selection in QueryPhase
- Add REST parameter stream_scoring_mode for mode selection
- Include streaming metadata in SearchResponse

The current implementation operates within architectural constraints where streaming is limited to inter-node communication. Client-facing streaming will be addressed in a follow-up contribution.

Addresses opensearch-project#18725

Signed-off-by: Atri Sharma <[email protected]>
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for 6a4d92e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for df7ad7b: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for c084b56: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for ad9c30d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
❌ Gradle check result for ad9c30d: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for b4b16b0: null Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Thank you @atris for working on and owning this, as it is a big feature. I have some very high-level comments:
Once you split the PR, it would be great to clearly describe the changes introduced, to make it easier for others to pitch in here. We should open an RFC for client support too, as this feature is not useful without client support.
* Streaming search parameters for configuring progressive result emission.
* These parameters control how and when intermediate results are sent.
*/
public class StreamingSearchParameters implements Writeable, ToXContent {
We can get rid of most of these params if we split this PR and start with the simple case, i.e. no scoring or sorting.
keepAlive = in.readOptionalTimeValue();
originalIndices = OriginalIndices.readOriginalIndices(in);
assert keepAlive == null || readerId != null : "readerId: " + readerId + " keepAlive: " + keepAlive;
// Read streaming fields - gated on version for BWC
We already have isStreamingSearch(); why are we introducing a new one here?
import org.apache.lucene.search.TopDocs;
import org.opensearch.common.annotation.ExperimentalApi;

/**
let's move it to next PR
? (Exception) exception.getCause()
: new OpenSearchException(exception.getCause());
}
if (isStreamSearch && logger.isTraceEnabled()) {
you don't have to check logger.isTraceEnabled()
currentBatch.add(scoreDoc);
if (currentBatch.size() >= batchSize) {
emitCurrentBatch(false);
Can you check how we currently emit batches for aggregation cases? Let's try to be consistent with it. If we want to improve the emission logic, let's do it for both.
We also introduced the FlushMode, which currently defaults to PER_SEGMENT. Maybe we can start with PER_SEGMENT and later make it intra-segment, i.e. based on batch size. Let me know what you think.
Signed-off-by: Atri Sharma <[email protected]>
❌ Gradle check result for cbf228d: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Summary
Implement streaming search on the coordinator, emitting early partial results from the query phase with optional scoring. The change introduces request flags and a mode selector, integrates streaming into the existing SearchAction path, and adds a reproducible TTFB benchmark. When streaming is not used, behavior is unchanged.
Motivation
• Reduce time-to-first-byte (TTFB) at the coordinator by not waiting for all shards to complete query phase before starting fetch-eligible work.
• Provide mode-specific controls for batching and scoring, with safe defaults.
• Keep backward compatibility on the transport wire and preserve REST semantics.
Design and Scope
• Request flags and mode
• SearchRequest gains version-gated fields (V_3_3_0): streamingScoring (boolean) and streamingSearchMode (string).
• REST: stream=true enables streaming; optional stream_scoring_mode and streaming_mode select behavior.
• No change to default behavior; streaming is opt‑in.
• Coordinator streaming
• Streaming is integrated into TransportSearchAction (SearchAction). No separate transport action is required.
• SearchPhaseController.newSearchPhaseResults(...) returns either the existing QueryPhaseResultConsumer or a StreamQueryPhaseResultConsumer based on the request mode.
• StreamQueryPhaseResultConsumer controls partial reduce cadence via mode-specific multipliers and emits TopDocs-aware partials to the progress listener.
• Partial reduce notifications
• SearchProgressListener gains a TopDocs-aware hook with a compatibility fallback:
• onPartialReduceWithTopDocs(…) → defaults to onPartialReduce(…).
• notifyPartialReduceWithTopDocs(…) invokes the hook safely.
• Existing listeners are unaffected.
• Query execution
• For streaming queries, the QueryPhase routes to streaming collector contexts based on StreamingSearchMode:
• NO_SCORING: unsorted documents, fastest emission.
• SCORED_UNSORTED: scored documents without sort.
• SCORED_SORTED: scored, sorted via Lucene’s top-N collectors.
• CONFIDENCE_BASED: early emission guided by simple Hoeffding-style bounds.
• Collector batch size is bounded and read via SearchContext.getStreamingBatchSize(); partial batches are emitted to the stream channel when available.
• Transport integration
• Both the classic and stream transport handlers are registered:
• Classic: SearchTransportService.registerRequestHandler(…).
• Stream (if available): StreamSearchTransportService.registerStreamRequestHandler(…).
• The streaming transport path is selected only for streaming requests, and its thread pools are chosen accordingly.
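The compatibility fallback described under "Partial reduce notifications" can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: ProgressListener, LegacyListener, and TopDocsListener are hypothetical stand-ins for SearchProgressListener and its subclasses, the parameter lists are simplified, and treating "safely" as "catch and report listener exceptions" is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

public class PartialReduceHookSketch {

    /** Hypothetical stand-in for SearchProgressListener; not the real OpenSearch class. */
    static class ProgressListener {
        /** Pre-existing hook: receives only aggregate information about the partial reduce. */
        protected void onPartialReduce(int reducePhase, long totalHits) {}

        /** New hook: the default ignores the doc ids and delegates to the pre-existing hook. */
        protected void onPartialReduceWithTopDocs(int reducePhase, long totalHits, List<Integer> topDocIds) {
            onPartialReduce(reducePhase, totalHits);
        }

        /** Assumption: "safely" means a throwing listener is reported but does not fail the search. */
        final void notifyPartialReduceWithTopDocs(int reducePhase, long totalHits, List<Integer> topDocIds) {
            try {
                onPartialReduceWithTopDocs(reducePhase, totalHits, topDocIds);
            } catch (RuntimeException e) {
                System.err.println("partial reduce listener failed: " + e.getMessage());
            }
        }
    }

    /** A listener written before the new hook existed; it only overrides the old method. */
    static class LegacyListener extends ProgressListener {
        final List<String> events = new ArrayList<>();

        @Override
        protected void onPartialReduce(int reducePhase, long totalHits) {
            events.add("reduce#" + reducePhase + " hits=" + totalHits);
        }
    }

    /** A streaming-aware listener that consumes the partial top docs directly. */
    static class TopDocsListener extends ProgressListener {
        final List<Integer> seen = new ArrayList<>();

        @Override
        protected void onPartialReduceWithTopDocs(int reducePhase, long totalHits, List<Integer> topDocIds) {
            seen.addAll(topDocIds);
        }
    }

    public static void main(String[] args) {
        LegacyListener legacy = new LegacyListener();
        TopDocsListener streaming = new TopDocsListener();
        // Both listeners are notified through the same entry point.
        legacy.notifyPartialReduceWithTopDocs(1, 42L, List.of(3, 7));
        streaming.notifyPartialReduceWithTopDocs(1, 42L, List.of(3, 7));
        System.out.println(legacy.events);
        System.out.println(streaming.seen);
    }
}
```

The point of the pattern is that a listener compiled against the old interface still receives every partial reduce through its existing method, while new listeners can opt in to the richer callback.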
Settings and Controls
• Dynamic cluster settings for streaming are added (StreamingSearchSettings, node-scoped, dynamic). Examples:
• search.streaming.batch_size
• Mode-specific reduce multipliers, emission interval, and minimal doc thresholds
• Circuit breaker and limits for buffering in streaming code paths
• Defaults are conservative. The feature remains opt-in via request flags; settings do not change behavior unless the request is streaming.
Wire Compatibility and API
• Transport wire BWC
• New SearchRequest and ShardSearchRequest fields are gated by Version.V_3_3_0 on read/write. Older peers neither write nor read these fields.
• Public API
• No breaking changes to REST endpoints.
• SearchProgressListener adds new methods with safe defaults; existing code continues to compile and run.
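To illustrate the version gating described above, here is a minimal sketch using plain java.io streams. It is not the PR's serialization code: the Request class, the numeric version ids, and the field encoding are hypothetical. In OpenSearch itself the equivalent check is typically written against the stream's version (for example out.getVersion().onOrAfter(Version.V_3_3_0)), which this sketch only imitates with an integer comparison.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class VersionGatedFieldsSketch {
    // Hypothetical numeric version ids, used only for this illustration.
    static final int V_3_2_0 = 30200;
    static final int V_3_3_0 = 30300;

    /** Hypothetical stand-in for a request carrying the two new streaming fields. */
    static class Request {
        String query = "";
        boolean streamingScoring = false;   // default when the peer is too old to send it
        String streamingSearchMode = null;  // null means "not set"

        void writeTo(DataOutputStream out, int peerVersion) throws IOException {
            out.writeUTF(query);
            // New fields are written only to peers that know how to read them.
            if (peerVersion >= V_3_3_0) {
                out.writeBoolean(streamingScoring);
                out.writeBoolean(streamingSearchMode != null);
                if (streamingSearchMode != null) {
                    out.writeUTF(streamingSearchMode);
                }
            }
        }

        static Request readFrom(DataInputStream in, int peerVersion) throws IOException {
            Request r = new Request();
            r.query = in.readUTF();
            // The read side applies the same gate, so the byte layout always matches.
            if (peerVersion >= V_3_3_0) {
                r.streamingScoring = in.readBoolean();
                if (in.readBoolean()) {
                    r.streamingSearchMode = in.readUTF();
                }
            }
            return r;
        }
    }

    /** Serializes and deserializes a request as if talking to a peer on the given version. */
    static Request roundTrip(Request r, int peerVersion) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            r.writeTo(out, peerVersion);
        }
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return Request.readFrom(in, peerVersion);
        }
    }

    public static void main(String[] args) throws IOException {
        Request r = new Request();
        r.query = "title:streaming";
        r.streamingScoring = true;
        r.streamingSearchMode = "NO_SCORING";
        System.out.println(roundTrip(r, V_3_3_0).streamingSearchMode);
        System.out.println(roundTrip(r, V_3_2_0).streamingSearchMode);
    }
}
```

With an older peer version the streaming fields are never put on the wire, so the receiving side falls back to its defaults and the request is handled as a classic search.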
Tests and Benchmark
• Unit tests:
• Stream consumer batch sizing and dynamic settings effects.
• Hoeffding bounds behavior.
• Integration tests:
• Basic streaming search workflows.
• Streaming aggregations with and without sub-aggregations.
• Mode coverage (NO_SCORING, SCORED_UNSORTED, SCORED_SORTED, CONFIDENCE_BASED).
• Benchmark:
• StreamingPerformanceBenchmarkTests: measures coordinator-side TTFB (time to first partial reduce) vs. classic full reduce for a large query.
• Logger-only reporting; no REST streaming is introduced.
Non-Goals / Limitations
• This change does not implement HTTP/REST streaming of partial responses.
• The SearchResponse partial/sequence metadata used internally by the streaming listener is not serialized on the wire and does not alter REST payloads.
• Confidence-based mode uses a conservative and simple bound; it is adequate for early gating but not a full ranking stability analysis.
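For readers unfamiliar with the bound behind the confidence-based mode, the following sketch shows the standard two-sided Hoeffding inequality for n independent samples bounded in a range of width R: P(|mean - mu| >= eps) <= 2 * exp(-2 * n * eps^2 / R^2), which gives eps = R * sqrt(ln(2 / delta) / (2 * n)) at confidence 1 - delta. This is a generic illustration, not the PR's HoeffdingBounds class; the method names and the gating rule in canEmit are assumptions made for the example.

```java
public class HoeffdingSketch {

    /**
     * Half-width of the two-sided Hoeffding confidence interval.
     *
     * @param n     number of samples observed so far (must be positive)
     * @param range width of the interval the samples are known to lie in
     * @param delta allowed failure probability, strictly between 0 and 1
     */
    static double epsilon(long n, double range, double delta) {
        if (n <= 0 || range < 0 || delta <= 0 || delta >= 1) {
            throw new IllegalArgumentException("need n > 0, range >= 0, 0 < delta < 1");
        }
        return range * Math.sqrt(Math.log(2.0 / delta) / (2.0 * n));
    }

    /**
     * Hypothetical gating rule: allow early emission once the interval is
     * no wider than a caller-chosen tolerance. The rule used by the PR may differ.
     */
    static boolean canEmit(long n, double range, double delta, double maxError) {
        return epsilon(n, range, delta) <= maxError;
    }

    public static void main(String[] args) {
        // The bound shrinks with 1/sqrt(n): four times the samples halve the interval.
        System.out.println(epsilon(100, 1.0, 0.05));
        System.out.println(epsilon(400, 1.0, 0.05));
        System.out.println(canEmit(100, 1.0, 0.05, 0.1));
        System.out.println(canEmit(1000, 1.0, 0.05, 0.1));
    }
}
```

Because the bound assumes independent, bounded samples and says nothing about the order of the top hits, it is suitable as an early gate but not as evidence that the ranking has stabilized, which matches the limitation stated above.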
Backward Compatibility and Risk
• Default behavior unchanged unless streaming flags are provided.
• Wire BWC ensured via version gating; JApiCmp passes.
• Aggregation partial reductions are unaffected; for TopDocs partials we call the new TopDocs-aware hook, otherwise we continue to notify via the existing method.
Operational Notes
• Streaming is disabled by default and must be explicitly requested with stream=true (REST) or by setting SearchRequest flags programmatically.
• Mode selection allows tuning for latency vs. coordination cost.
• Dynamic settings enable safe runtime tuning if necessary.
If reviewers prefer, I can split the settings and the confidence-based collector into a follow-up to further reduce the initial surface.