SOLR-17447 : Support to early terminate a search based on maxHitsAllowed per shard. #2960

sijuv · 2025-01-07T17:23:11Z

Description

Adds the capability to terminate a search early based on a provided maxHits parameter.

https://issues.apache.org/jira/browse/SOLR-17447

Solution

The "maxHitsPerShard" request parameter controls how many hits the searcher should go over per shard. Once the searcher has run over the specified number of documents, it terminates the search with an EarlyTerminatingCollectorException. This is indicated by a new response header, "terminatedEarly"; "partialResults" also indicates that the results are partial. The parameter is supported in MT (multi-threaded) mode as well.
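A minimal sketch of the behaviour described above, not the code in this PR: a collector that counts hits and aborts the search with an exception once the limit is reached, with the caller turning that exception into a "partial" flag. `MaxHitsReachedException` and `MaxHitsCollectorSketch` are hypothetical names used only for illustration; they stand in for Solr's exception and collector and do not reproduce their signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the exception that ends the search; not Solr's class.
class MaxHitsReachedException extends RuntimeException {
  MaxHitsReachedException(int collected) {
    super("stopped after " + collected + " hits");
  }
}

// Hypothetical collector: accepts doc ids until maxHits have been seen, then aborts.
class MaxHitsCollectorSketch {
  private final int maxHits;
  private final List<Integer> collected = new ArrayList<>();

  MaxHitsCollectorSketch(int maxHits) {
    this.maxHits = maxHits;
  }

  void collect(int docId) {
    collected.add(docId);
    // The hit that reaches the limit is still collected before the search stops.
    if (collected.size() >= maxHits) {
      throw new MaxHitsReachedException(collected.size());
    }
  }

  List<Integer> hits() {
    return collected;
  }

  // Feeds every matching doc to the collector; returns true if the search was cut short.
  static boolean search(int[] matchingDocs, MaxHitsCollectorSketch collector) {
    try {
      for (int doc : matchingDocs) {
        collector.collect(doc);
      }
      return false;
    } catch (MaxHitsReachedException e) {
      return true;
    }
  }
}
```

In this sketch the limit applies to one collector only, which mirrors the per-shard, best-effort nature of the parameter: the aggregate across shards can exceed it.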

Though there are other mechanisms to control runaway queries, such as CPU usage limits and time limits, this one is simpler for certain use cases, especially high-recall queries and rerank use cases.

Lucene currently supports this feature with the EarlyTerminatingCollector. There was some code in Solr as well to support the collector, but it looks like it was not completely wired up.

Tests

Ran tests against a local Solr instance in MT and single-threaded mode.

Checklist

Please review the following and check all that apply:

[X] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
[X] I have created a Jira issue and added the issue ID to my pull request title.
[ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
[X] I have developed this patch against the main branch.
[X] I have run ./gradlew check.
[ ] I have added tests for my changes.
[ ] I have added documentation for the Reference Guide

dsmiley

Overall this makes sense to me. Thanks for the nice contribution! I'd prefer if @gus-asf could take a look at some aspects since he worked on cpuAllowed recently. BTW it's clear you need to run ./gradlew tidy

I suggest renaming the param "maxHitsPerShard" to simply "maxHits" or "maxHitsTerminateEarly" and document that it's per-shard and best-effort; that more hits may ultimately be detected in aggregate. But maybe others disagree.

It'd be good to consider interference with other features. Maybe it works with cursorMark? Does it count after PostFilter (e.g. CollapseQParser) or before?

sijuv · 2025-01-09T19:15:38Z

> Overall this makes sense to me. Thanks for the nice contribution! I'd prefer if @gus-asf could take a look at some aspects since he worked on cpuAllowed recently. BTW it's clear you need to run ./gradlew tidy

Addressed.

> I suggest renaming the param "maxHitsPerShard" to simply "maxHits" or "maxHitsTerminateEarly" and document that it's per-shard and best-effort; that more hits may ultimately be detected in aggregate. But maybe others disagree.

Renamed to maxHits.

> It'd be good to consider interference with other features. Maybe it works with cursorMark? Does it count after PostFilter (e.g. CollapseQParser) or before?

Sorry, I am not aware of cursorMark. This counts while the collector runs over the posting list, so it is not a post filter. Not sure if I understood your question correctly.

Adds the capability to limit hits per shard. The "maxHitsPerShard" request parameter controls how many hits the searcher should run over per shard. Once the searcher has run over the specified number of documents, it terminates the search with an EarlyTerminatingCollectorException. This is indicated by a new response header, "terminatedEarly"; "partialResults" also indicates that the results are partial. The parameter is supported in MT mode as well. Though there are other mechanisms to control runaway queries, such as CPU usage limits and time limits, this one is simpler for certain use cases, especially high-recall queries and rerank use cases.

rename parameter to maxHits

sijuv · 2025-01-22T18:43:34Z

@dsmiley can you pls take another look when you get a chance?

dsmiley · 2025-01-23T05:31:40Z

note: you force-pushed to this PR. Please don't do that; it resets the review state making it hard for me to see changes since my last review and if I try it's still possible an earlier commit was changed and I won't know it. So never force-push to a PR that has been reviewed.

dsmiley

Recommended additional reviewers based on touching similar functionality: @atris @gus-asf @cpoerschke

solr/core/src/java/org/apache/solr/response/SolrQueryResponse.java

solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java

solr/core/src/java/org/apache/solr/search/MultiThreadedSearcher.java

solr/core/src/java/org/apache/solr/search/QueryCommand.java

dsmiley · 2025-01-23T06:22:34Z

solr/core/src/java/org/apache/solr/search/QueryResult.java

@@ -22,6 +22,7 @@ public class QueryResult {
  // Object for back compatibility so that we render true not "true" in json
  private Object partialResults;
  private Boolean segmentTerminatedEarly;


thinking out loud: Perhaps we should do away with segmentTerminatedEarly as overly specific

Agreed, I'm not sure how people are going to respond differently between the two

I think we need to distinguish the two because their causes are different. segmentTerminateEarly could occur because the searcher does not proceed on a sorted segment, since the remaining docs have a lower score. terminateEarly is purely due to the searcher running past the provided number of maxHits. As a user, if I see terminateEarly then I know I might need to increase the maxHits parameter.

Okay; so the cause is interesting for diagnostic / observability purposes but semantically, the search has "terminated early" (for whatever reason).

I agree it's good to know why so that the corrective action is clear, but terminateEarly is unlike maxHits (or shardMaxHits) so the relationship takes special knowledge to appreciate. If something is going to report a reason I'd want it to include the same phrase, i.e. "maxHitsReached", and definitely would not want to have to "just know" that terminateEarly has nothing to do with segmentTerminateEarly despite the strong similarity in naming.

I agree that "maxShardHitsReached" makes more sense than "terminatedEarly" to me, especially given that "segmentTerminatedEarly" is completely unrelated, as gus mentioned.

dsmiley · 2025-01-23T06:27:57Z

solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java

@@ -338,6 +339,11 @@ private Collector buildAndRunCollectorChain(
      if (cmd.isQueryCancellable()) {
        core.getCancellableQueryTracker().removeCancellableQuery(cmd.getQueryID());
      }
+      if (collector instanceof final EarlyTerminatingCollector earlyTerminatingCollector) {


This check is brittle since it can only possibly be true if the last collector is the EarlyTerminatingCollector. We may add others or re-order at some point. Also, can we just depend on the exception above and set qr.setTerminatedEarly(true); there?
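To illustrate the suggestion, here is a rough sketch, under assumed names, of setting the flag where the exception is caught instead of inspecting the collector's type. Everything in it (`EarlyTerminationSketch`, `Result`, `Collector`, `limit`, `wrap`) is hypothetical and only loosely modelled on the discussion; it is not the SolrIndexSearcher code.

```java
// Hypothetical stand-ins; names echo the discussion but are not Solr's API.
class EarlyTerminationSketch {
  static class EarlyTerminatingException extends RuntimeException {}

  static class Result {
    boolean terminatedEarly;
  }

  interface Collector {
    void collect(int doc);
  }

  // The flag is set in the catch block, so it does not matter where in a
  // wrapped chain the throwing collector sits.
  static Result run(int[] docs, Collector chain) {
    Result qr = new Result();
    try {
      for (int doc : docs) {
        chain.collect(doc);
      }
    } catch (EarlyTerminatingException e) {
      qr.terminatedEarly = true;
    }
    return qr;
  }

  // A limiting collector that throws once it has seen max docs.
  static Collector limit(int max) {
    int[] seen = {0};
    return doc -> {
      if (++seen[0] >= max) {
        throw new EarlyTerminatingException();
      }
    };
  }

  // An outer collector that hides the limiting one; a type check on the
  // outermost collector would not find it.
  static Collector wrap(Collector inner) {
    return inner::collect;
  }
}
```

With this shape, adding or re-ordering collectors in the chain does not affect whether the early termination is reported.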

dsmiley · 2025-01-23T14:03:24Z

A test is needed.

solr/solr-ref-guide/modules/query-guide/pages/common-query-parameters.adoc

sijuv · 2025-01-28T19:47:30Z

A test is needed.

added tests.

sijuv · 2025-01-28T19:48:10Z

@dsmiley @HoustonPutman updated, pls take a look when you get a chance.

gus-asf · 2025-01-30T15:25:04Z

> > Overall this makes sense to me. Thanks for the nice contribution! I'd prefer if @gus-asf could take a look at some aspects since he worked on cpuAllowed recently. BTW it's clear you need to run ./gradlew tidy
>
> Addressed.
>
> > I suggest renaming the param "maxHitsPerShard" to simply "maxHits" or "maxHitsTerminateEarly" and document that it's per-shard and best-effort; that more hits may ultimately be detected in aggregate. But maybe others disagree.
>
> Renamed to maxHits

maybe shardMaxHits? Brevity is nice, but I think it's helpful to have a name that is at least slightly self documenting.

gus-asf

Overall I think this is a great idea. For some use cases this is certainly appropriate. My primary concern is that the use of "terminate early" is getting pretty overloaded. There is code relating to Lucene's segmentTerminateEarly, and terminateEarly was previously used in SOLR-3240, so either you have stepped on some of that code, possibly breaking things, or it is dead code from a previously removed feature. My suggestion is that this not add a third "terminateEarly" but rather use its functional behavior as a name, "searcherMaxHits" or maybe "shardMaxHits", and carry that more easily distinguished and self-documenting name through the various code locations where we check it or name things after it.

Finally I wonder if this and spellcheck_collate_max_docs should be doing exactly the same thing, or if they should be entirely separate. Should the spellcheck feature actually be using this under the covers now that it exists (or is this really just exposing the same idea for non-spellcheck cases)?

solr/solr-ref-guide/modules/query-guide/pages/common-query-parameters.adoc

solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java

gus-asf · 2025-01-30T16:18:14Z

solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java

  private final int maxDocsToCollect;

  private int numCollected = 0;
  private int prevReaderCumulativeSize = 0;
  private int currentReaderSize = 0;
+  private final LongAdder pendingDocsToCollect;


LongAdder is a neat class I wasn't aware of, but my read of the javadocs for LongAdder is that it's for rarely read values like statistics/metrics (frequent update, infrequent read)... the way you are using it you read from it immediately after every add. Maybe just AtomicLong? Happy to hear arguments to the contrary, but everyone knows what an AtomicLong is so slightly simpler to have that (and perhaps slightly less memory usage according to the javadocs).

Yeah as @sijuv mentioned in the other thread, I believe he is only actually checking this every 100 docs, not after every add.
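For readers weighing the two classes, a small sketch of the difference in how "add, then check against a limit" looks with each. `SharedCounterSketch` and its method names are hypothetical; only `AtomicLong.addAndGet`, `LongAdder.add` and `LongAdder.sum` are real JDK calls.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper comparing the two counters for an add-then-check pattern.
class SharedCounterSketch {
  // AtomicLong: the add and the read of the new total are a single atomic step.
  static boolean reachedWithAtomic(AtomicLong total, long delta, long limit) {
    return total.addAndGet(delta) >= limit;
  }

  // LongAdder: add() and sum() are separate calls, and sum() is not an atomic
  // snapshot while other threads are updating, so the check is approximate.
  static boolean reachedWithAdder(LongAdder total, long delta, long limit) {
    total.add(delta);
    return total.sum() >= limit;
  }
}
```

For a best-effort limit the approximate read may be acceptable; the trade-off is mainly readability and how often the total is read relative to how often it is updated.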

gus-asf · 2025-01-30T16:28:11Z

solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java

+        if (numCollected % chunkSize == 0) {
+          pendingDocsToCollect.add(chunkSize);
+          final long overallCollectedDocCount = pendingDocsToCollect.intValue();
+          terminatedEarly = overallCollectedDocCount >= maxDocsToCollect;


I think the code would be more legible if you just threw in two cases rather than tracking it with a boolean that is overwritten. This initially looked like a logic error. I expected to see an |= until I stopped and thought about it for a while to convince myself that the second case can't be false after the first is true. Seems better to me if future maintainers don't have to sort that out for themselves.

The boolean is updated only every 100th time, to reduce any overhead that updating the thread-shared adder brings in.
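A sketch of the "throw in two cases" shape combined with chunked updates to a shared counter, under stated assumptions: the class `ChunkedMaxHitsSketch`, its exception, and the use of `AtomicLong` are illustrative choices, not the PR's implementation. The field names `maxDocsToCollect`, `numCollected` and `chunkSize` are borrowed from the diff above.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical collector: each instance counts locally and publishes to a
// shared total only once per chunk, to keep cross-thread traffic low.
class ChunkedMaxHitsSketch {
  static class MaxHitsReached extends RuntimeException {}

  private final int maxDocsToCollect;
  private final int chunkSize;
  private final AtomicLong sharedCount; // shared by all collectors of one search
  private int numCollected = 0;

  ChunkedMaxHitsSketch(int maxDocsToCollect, int chunkSize, AtomicLong sharedCount) {
    this.maxDocsToCollect = maxDocsToCollect;
    this.chunkSize = chunkSize;
    this.sharedCount = sharedCount;
  }

  void collect(int doc) {
    numCollected++;
    // Case 1: this collector alone has reached the limit.
    if (numCollected >= maxDocsToCollect) {
      throw new MaxHitsReached();
    }
    // Case 2: once per chunk, publish the local progress and check the
    // total across all collectors.
    if (numCollected % chunkSize == 0
        && sharedCount.addAndGet(chunkSize) >= maxDocsToCollect) {
      throw new MaxHitsReached();
    }
  }
}
```

Because the shared total is only checked at chunk boundaries, the combined number of collected docs can overshoot the limit by up to a chunk per collector, which fits the best-effort wording used elsewhere in this thread.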

solr/core/src/java/org/apache/solr/search/EarlyTerminatingCollector.java

solr/core/src/java/org/apache/solr/search/QueryCommand.java

solr/core/src/java/org/apache/solr/search/MultiThreadedSearcher.java

solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java

solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java

HoustonPutman · 2025-04-02T19:36:28Z

Ok I've renamed to maxHitsAllowed, which follows a lot of the other shard-specific limits. (like timeAllowed or cpuAllowed)

HoustonPutman · 2025-04-03T14:35:42Z

I need to add a changelog entry, but I think this is good to go, and will probably merge sometime tomorrow.

HoustonPutman · 2025-04-04T19:26:50Z

I'm planning on merging this early next week in case either of you have additional comments @dsmiley @gus-asf .

I'm going to make a separate ticket and PR to remove the terminateEarly stuff in 10x, because it will be unused after this PR.

dsmiley · 2025-04-04T20:22:12Z

Your changes since my last review are nice/simple. I'm not sure how to resolve maybe differing views on what should be consolidated or removed, but happy to defer to your judgement Houston (or Gus). At least hearing "remove ___(whatever)" is probably generally a good thing for simplicity :-)

HoustonPutman · 2025-04-04T20:41:59Z

> Your changes since my last review are nice/simple. I'm not sure how to resolve maybe differing views on what should be consolidated or removed, but happy to defer to your judgement Houston (or Gus). At least hearing "remove ___(whatever)" is probably generally a good thing for simplicity :-)

I mean, this really is just a refactoring (loose interpretation of that word) of what already existed, and giving it a better name and user-facing support. This was only really used in 1 place (Spellcheck Collation), and didn't work in every way that Solr supports searching now. So "removing" isn't really true, it's really replacing an internal way of doing something. Hopefully no one disagrees about that part 😅

As for the response difference between segmentTerminatedEarly and maxHitsTerminatedEarly in the responseHeader, I think it's probably best to keep those separate. segmentTerminatedEarly really only tells you that the numFound information is wrong, since we know that because of the sort, the best documents were found. As for maxHitsTerminatedEarly, numFound will likely be wrong (as with the previous flag), but the results are also probably not the most relevant, since we don't have the sorting guarantees that segmentTerminatedEarly gives. So I think it does give useful information back to the "caller".
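As a rough illustration of how a caller might act on the two flags, here is a sketch that assumes the responseHeader is available as a plain map. The flag names come from this comment; `ResponseHeaderSketch`, `interpret`, and the returned messages are hypothetical, and how a client actually obtains the header is not shown.

```java
import java.util.Map;

// Hypothetical helper: maps the two early-termination flags to what they
// imply for the caller, following the distinction drawn in this comment.
class ResponseHeaderSketch {
  static String interpret(Map<String, Object> responseHeader) {
    if (Boolean.TRUE.equals(responseHeader.get("maxHitsTerminatedEarly"))) {
      // No sort guarantee: the count and the relevance of the docs are both suspect.
      return "numFound likely wrong; results may not be the most relevant";
    }
    if (Boolean.TRUE.equals(responseHeader.get("segmentTerminatedEarly"))) {
      // The index sort guarantees the best docs were found; only the count is off.
      return "numFound wrong; best documents were found";
    }
    return "complete";
  }
}
```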

I'm also not against cleaning this up in the future, having a better way of exposing the ways in which the response isn't giving the exactly correct information. That would be a big enough change affecting multiple features (maxHitsAllowed, circuitBreakers, blockMax-wand, segmentTerminatedEarly, etc) that it probably belongs in its own ticket/PR.

…ed per shard (#2960) "terminateEarly", used by Spellcheck Collation, now uses maxHitsAllowed, which uses the same EarlyTerminationCollector under the hood. Co-authored-by: Siju Varghese <[email protected]> Co-authored-by: Houston Putman <[email protected]> (cherry picked from commit 900bf3d)

github-actions bot added client:solrj cat:search labels Jan 7, 2025

dsmiley reviewed Jan 7, 2025

sijuv changed the title ~~SOLR-17447 : Support for early terminate a search based on maxHits per collector.~~ SOLR-17447 : Support to early terminate a search based on maxHits per collector. Jan 9, 2025

sijuv force-pushed the main branch 2 times, most recently from e4a7c17 to f475bf7 Compare January 11, 2025 20:23

github-actions bot added the documentation Improvements or additions to documentation label Jan 11, 2025

Siju Varghese added 4 commits January 13, 2025 08:43

codestyle fixes

ee924a7

SOLR-17447 : Support for maxHitsPerShard.

96f1c3f

rename parameter to maxHits

updated documentation

74ac24c

sijuv force-pushed the main branch from f475bf7 to 74ac24c Compare January 13, 2025 16:57

dsmiley reviewed Jan 23, 2025

dsmiley requested review from atris, cpoerschke and gus-asf January 23, 2025 06:37

Merge branch 'apache:main' into main

da025e4

HoustonPutman reviewed Jan 28, 2025

solr/solr-ref-guide/modules/query-guide/pages/common-query-parameters.adoc (outdated)

cpoerschke removed their request for review January 28, 2025 17:55

address review comments.

4ad621e

github-actions bot added the tests label Jan 28, 2025

fix segment terminate early flag

87f146d

gus-asf requested changes Jan 30, 2025

sijuv added 3 commits March 6, 2025 10:33

Merge branch 'apache:main' into main

a4f7301

fix test case for using the maxHitsTerminateEarly flag instead of ter…

36faf4e

…minateEarly

Merge branch 'apache:main' into main

2cc1f29

HoustonPutman reviewed Mar 31, 2025

solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java (outdated)

HoustonPutman added 6 commits March 31, 2025 14:54

Speed up ETC for non-parallel case, and have the chunk size change ba…

42bf3f1

…sed on inputs

Do some renaming and future proof different early termination types

6a0833f

Fix bug

d90dbca

Refactor, add test, add partialResultsDetails

e291823

Fix typo

e95dcca

Rename maxHits to maxHitsAllowed. Remove single use of terminateEarly…

c770f1f

…, start counting approximateHits

HoustonPutman added 4 commits April 2, 2025 14:47

Add test for approximate total hits

fa068bc

Fix collation for no response header

1fa592f

Fix collation not having response header

b854168

Remove check

ca3abf2

Collapse queries do collection in complete(), so do that within the t…

e2e3c5f

…ry block

HoustonPutman changed the title ~~SOLR-17447 : Support to early terminate a search based on maxHits per collector.~~ SOLR-17447 : Support to early terminate a search based on maxHitsAllowed per shard. Apr 3, 2025

HoustonPutman added 3 commits April 10, 2025 09:52

Merge remote-tracking branch 'apache/main' into pr/2960

86b1768

Merge remote-tracking branch 'apache/main' into pr/2960

2ee71b1

Add a Changelog entry

05a1f88

HoustonPutman merged commit 900bf3d into apache:main Apr 10, 2025
4 checks passed

cpoerschke added a commit that referenced this pull request May 2, 2025

SOLR-17447: fix typo in solr/CHANGES.txt

81d0f6b

#2960 follow-up

asfgit pushed a commit that referenced this pull request May 2, 2025

SOLR-17447: fix typo in solr/CHANGES.txt

36eda00

#2960 follow-up (cherry picked from commit 81d0f6b)

aruggero pushed a commit to SeaseLtd/solr that referenced this pull request May 7, 2025

SOLR-17447: fix typo in solr/CHANGES.txt

6f2e798

apache#2960 follow-up

mlbiscoc pushed a commit to mlbiscoc/solr that referenced this pull request May 14, 2025

SOLR-17447: fix typo in solr/CHANGES.txt

430a197

apache#2960 follow-up
