-
Notifications
You must be signed in to change notification settings - Fork 3.2k
feat: Support vllm and tei rerank #41947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Invalid PR Title Format Detected Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:
Required Title Structure:
Where Example:
Please review and update your PR to comply with these guidelines. |
@junjiejiangjjj Please associate the related issue to the body of your Pull Request. (eg. “issue: #”) |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #41947 +/- ##
==========================================
+ Coverage 80.47% 80.50% +0.02%
==========================================
Files 1537 1538 +1
Lines 217717 217972 +255
==========================================
+ Hits 175217 175469 +252
+ Misses 36194 36189 -5
- Partials 6306 6314 +8
🚀 New features to boost your workflow:
|
@junjiejiangjjj E2e jenkins job failed, comment |
@junjiejiangjjj cpp-unit-test check failed, comment |
@junjiejiangjjj E2e jenkins job failed, comment |
/run-cpu-e2e |
@junjiejiangjjj cpp-unit-test check failed, comment |
rerun cpp-unit-test |
@junjiejiangjjj E2e jenkins job failed, comment |
/run-cpu-e2e |
internal/proxy/task_search.go
Outdated
} | ||
if fields, err := t.reorganizeRequeryResults(ctx, queryResult.GetFieldsData(), []*schemapb.IDs{t.result.Results.Ids}); err != nil { | ||
return err | ||
} else { | ||
t.result.Results.FieldsData = fields[0] | ||
} | ||
} else { | ||
ctx, sp := otel.Tracer(typeutil.ProxyRole).Start(ctx, "Proxy-call-rerank-function-udf") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: dup code in the if-else
block, use an anonamous function for this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
inputType := schemapb.DataType_None | ||
for _, field := range collSchema.Fields { | ||
if field.Name == base.GetInputFieldNames()[0] { | ||
inputType = field.DataType |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we don't store data type in the base
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
} | ||
} | ||
if inputType != schemapb.DataType_VarChar { | ||
return nil, fmt.Errorf("Rerank model only support varchar, bug got [%s]", inputType.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We only support varchar based reranking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rerank model only supports text, and currently, milvus can only use varchar storage
headers := map[string]string{ | ||
"Content-Type": "application/json", | ||
} | ||
body, err := utils.RetrySend(ctx, requestBody, http.MethodPost, base.url, headers, 3, 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add an exponential sleep time for retry
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
for _, col := range cols { | ||
texts := col.data[0].([]string) | ||
ids := col.ids.([]T) | ||
for idx, id := range ids { | ||
if _, ok := uniqueData[id]; !ok { | ||
uniqueData[id] = texts[idx] | ||
} | ||
} | ||
} | ||
ids := make([]T, 0, len(uniqueData)) | ||
texts := make([]string, 0, len(uniqueData)) | ||
for id, text := range uniqueData { | ||
ids = append(ids, id) | ||
texts = append(texts, text) | ||
} | ||
scores, err := model.provider.rerank(ctx, query, texts) | ||
if err != nil { | ||
return nil, err | ||
} | ||
|
||
rerankScores := map[T]float32{} | ||
for idx, id := range ids { | ||
rerankScores[id] = scores[idx] | ||
} | ||
return newIDScores(rerankScores, searchParams), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So we only do rescore instead of reranking?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rerank model re-scores based on the query and doc content
for _, col := range cols { | ||
texts := col.data[0].([]string) | ||
ids := col.ids.([]T) | ||
for idx, id := range ids { | ||
if _, ok := uniqueData[id]; !ok { | ||
uniqueData[id] = texts[idx] | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why we will have dup here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reduce token consumption of the model
57eb010
to
b03cfe1
Compare
Signed-off-by: junjie.jiang <[email protected]>
@junjiejiangjjj go-sdk check failed, comment |
rerun go-sdk |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: junjiejiangjjj, liliu-z The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
#35856