enhance: add boost score support #50372
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: junjiejiangjjj
@junjiejiangjjj Please associate the related issue to the body of your Pull Request. (e.g. "issue: #")
[ci-v2-notice] To rerun ci-v2 checks, comment with:
If you have any questions or requests, please contact @zhikunyao.
	}
}

if len(values) == 0 {
This PR changes score_combine's null handling in processChunk: previously any null input nulled the whole row; now null inputs are skipped and the row is only nulled when every input is null. ScoreCombineExpr is shared with the pre-existing decay reranker, which combines $score with the decay factor via NewScoreCombineExpr(ModeMultiply) (rerank_builder.go:512) and where DecayExpr emits a null factor for null field values (decay_expr.go:259). As a result, decay over a nullable numeric field now returns the raw $score (decay effectively ignored) where it previously produced a null score, silently changing decay ranking. Scope the null-skip to the boost path, or add decay-on-nullable coverage to confirm the change is intended.
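To make the behavior change concrete, here is a minimal sketch of the two null rules in multiply mode. It does not use the real `ScoreCombineExpr` or Arrow types: `nullable`, `multiplyStrict`, and `multiplySkipNull` are hypothetical names introduced only for illustration, and the scalar model is an assumption about how one row is combined.

```go
package main

import "fmt"

// nullable is a hypothetical stand-in for one Arrow cell:
// a value plus a validity bit.
type nullable struct {
	val   float64
	valid bool
}

// multiplyStrict models the pre-PR rule described in the comment:
// any null input nulls the whole row.
func multiplyStrict(inputs ...nullable) nullable {
	out := nullable{val: 1, valid: true}
	for _, in := range inputs {
		if !in.valid {
			return nullable{}
		}
		out.val *= in.val
	}
	return out
}

// multiplySkipNull models the post-PR rule: null inputs are skipped,
// and the row is null only when every input is null.
func multiplySkipNull(inputs ...nullable) nullable {
	out := nullable{val: 1}
	for _, in := range inputs {
		if !in.valid {
			continue
		}
		out.val *= in.val
		out.valid = true
	}
	if !out.valid {
		return nullable{}
	}
	return out
}

func main() {
	score := nullable{val: 0.8, valid: true}
	decay := nullable{} // decay factor for a row whose field value is null

	fmt.Println("strict:   ", multiplyStrict(score, decay))
	fmt.Println("skip-null:", multiplySkipNull(score, decay))
}
```

Under the strict rule the row comes back null; under the skip rule it comes back as the raw score, which is the decay-ignored outcome the comment warns about.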
// Both null — use tie-break if available
less, useTypedLess := makeArrayRowLess(sortChunk, tbChunk, o.desc)
if useTypedLess {
	sort.Slice(indices, func(i, j int) bool {
The new typed fast path sorts with unstable sort.Slice, whereas the pre-PR code used a single sort.SliceStable, and its comparator only tie-breaks on $id, not $element_indices/$offset. In element-level search one entity can produce multiple rows that share $id and an equal boosted $score; the unstable sort no longer preserves their input order, so which element/offset survives the downstream merge can change relative to the prior stable behavior. (Go's sort.Slice is deterministic for identical input, so this is a loss of the stable-ordering guarantee rather than true run-to-run randomness.) Use sort.SliceStable, or extend the comparator with a full tie-break on $element_indices/$offset.
type boostScoreFunc func(context.Context, segments.Segment, *segcore.SearchRequest, *planpb.ScoreFunction, *arrow.Chunked) (*arrow.Chunked, error)

func newSegmentBoostScoreRunner(scoreFunc boostScoreFunc, segment segments.Segment, searchReq *segcore.SearchRequest, scorer *planpb.ScoreFunction) expr.BoostScoreRunner {
	return func(ctx context.Context, offsets *arrow.Chunked) (*arrow.Chunked, error) {
The single-segment boost path scores synchronously through a C ABI that has no cancellation token, so threading a context here can't actually interrupt the in-flight work — the dropped context is a symptom, not the root cause. A caller that cancels or hits its deadline (e.g. client disconnect) still blocks until the single-segment boost completes; the durable fix is to route single-segment boost through the async API. Low severity since the topK-bounded workload keeps the uninterruptible window short.
✅ CI Loop Results
| Stage | Result | Duration | Tests |
|---|---|---|---|
| ✅ Build | SUCCESS | 10.3min | - |
| ✅ Code-Check | SUCCESS | 8.1min | - |
| ✅ UT-GO | SUCCESS | 22.8min | 1020 passed |
| ✅ UT-Integration | SUCCESS | 24.3min | 46 passed |
| ✅ UT-CPP-Cov | SUCCESS | 36.7min | 7841 passed |
Total: 80min
Overall Coverage: 71.2%
Diff Coverage: CPP 68.3% (125 hit, 58 miss, 183 measurable lines, 779 unmeasured) | Go 53.5% (285 hit, 248 miss, 533 measurable lines, 401 unmeasured)
Total Patch Coverage: 57.3% (410/716 measurable lines, 1180 unmeasured)
Codecov Report
❌ Your patch check has failed because the patch coverage (60.42%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.
Please upload reports for the commit 8a7386d to get more accurate results.
Additional details and impacted files
@@ Coverage Diff @@
## master #50372 +/- ##
==========================================
- Coverage 78.95% 78.90% -0.06%
==========================================
Files 2239 2246 +7
Lines 396935 397635 +700
==========================================
+ Hits 313401 313740 +339
- Misses 73947 74293 +346
- Partials 9587 9602 +15
Implement boost score evaluation for the Go search reduce pipeline, including
QueryNode task integration, segcore C API bindings, and score expression
combination support.

Add boost score runner logic in core, expose segment-level boost scoring to Go,
and cover the new behavior with C++, Go, and Python client tests.

Signed-off-by: junjie.jiang <junjie.jiang@zilliz.com>