feat: Support vllm and tei rerank #41947

Merged
merged 1 commit into milvus-io:master from the vllm-tei branch
May 28, 2025

Conversation

junjiejiangjjj
Contributor

@junjiejiangjjj junjiejiangjjj commented May 20, 2025

@sre-ci-robot sre-ci-robot added the size/XXL label (denotes a PR that changes 1000+ lines) May 20, 2025
@mergify mergify bot added the dco-passed label (DCO check passed) May 20, 2025
Contributor

mergify bot commented May 20, 2025

@junjiejiangjjj

Invalid PR Title Format Detected

Your PR submission does not adhere to our required standards. To ensure clarity and consistency, please meet the following criteria:

  1. Title Format: The PR title must begin with one of these prefixes:
  • feat: for introducing a new feature.
  • fix: for bug fixes.
  • enhance: for improvements to existing functionality.
  • test: for adding tests to existing functionality.
  • doc: for modifying documentation.
  • auto: for the pull request from bot.
  2. Description Requirement: The PR must include a non-empty description, detailing the changes and their impact.

Required Title Structure:

[Type]: [Description of the PR]

Where Type is one of feat, fix, enhance, test or doc.

Example:

enhance: improve search performance significantly 

Please review and update your PR to comply with these guidelines.

@junjiejiangjjj junjiejiangjjj changed the title from "Support vllm and tei rerank" to "feat: Support vllm and tei rerank" May 20, 2025
@mergify mergify bot added the kind/feature label (issues related to feature requests from users) and removed the do-not-merge/invalid-pr-format label May 20, 2025
Contributor

mergify bot commented May 20, 2025

@junjiejiangjjj Please associate the related issue with your Pull Request by referencing it in the PR body (e.g. “issue: #”).


codecov bot commented May 20, 2025

Codecov Report

Attention: Patch coverage is 88.23529% with 36 lines in your changes missing coverage. Please review.

Project coverage is 80.50%. Comparing base (79b51cb) to head (784d656).
Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/util/function/rerank/model_function.go | 90.45% | 14 Missing and 7 partials ⚠️ |
| pkg/util/paramtable/function_param.go | 68.00% | 7 Missing and 1 partial ⚠️ |
| internal/proxy/task_search.go | 76.19% | 3 Missing and 2 partials ⚠️ |
| internal/util/function/rerank/function_score.go | 33.33% | 2 Missing ⚠️ |
Additional details and impacted files

@@            Coverage Diff             @@
##           master   #41947      +/-   ##
==========================================
+ Coverage   80.47%   80.50%   +0.02%     
==========================================
  Files        1537     1538       +1     
  Lines      217717   217972     +255     
==========================================
+ Hits       175217   175469     +252     
+ Misses      36194    36189       -5     
- Partials     6306     6314       +8     
| Components | Coverage Δ |
|---|---|
| Client | 79.36% <ø> (ø) |
| Core | 72.98% <ø> (+0.02%) ⬆️ |
| Go | 81.98% <88.23%> (+0.02%) ⬆️ |

| Files with missing lines | Coverage Δ |
|---|---|
| internal/util/function/common.go | 91.80% <ø> (ø) |
| ...unction/models/ali/ali_dashscope_text_embedding.go | 87.75% <100.00%> (ø) |
| ...il/function/models/cohere/cohere_text_embedding.go | 87.23% <100.00%> (ø) |
| ...al/util/function/models/openai/openai_embedding.go | 85.71% <100.00%> (ø) |
| ...n/models/siliconflow/siliconflow_text_embedding.go | 87.50% <100.00%> (ø) |
| internal/util/function/models/tei/tei.go | 84.21% <100.00%> (ø) |
| ...ernal/util/function/models/utils/embedding_util.go | 82.35% <100.00%> (+1.10%) ⬆️ |
| ...unction/models/vertexai/vertexai_text_embedding.go | 63.23% <100.00%> (ø) |
| ...unction/models/voyageai/voyageai_text_embedding.go | 85.71% <100.00%> (ø) |
| internal/util/function/rerank/decay_function.go | 98.48% <100.00%> (-0.13%) ⬇️ |

... and 7 more

... and 26 files with indirect coverage changes

Contributor

mergify bot commented May 22, 2025

@junjiejiangjjj E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

Contributor

mergify bot commented May 22, 2025

@junjiejiangjjj cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

Contributor

mergify bot commented May 22, 2025

@junjiejiangjjj E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@junjiejiangjjj
Contributor Author

/run-cpu-e2e

Contributor

mergify bot commented May 22, 2025

@junjiejiangjjj cpp-unit-test check failed, comment rerun cpp-unit-test can trigger the job again.

@junjiejiangjjj
Contributor Author

rerun cpp-unit-test

Contributor

mergify bot commented May 22, 2025

@junjiejiangjjj E2e jenkins job failed, comment /run-cpu-e2e can trigger the job again.

@junjiejiangjjj
Contributor Author

/run-cpu-e2e

@mergify mergify bot added the ci-passed label May 23, 2025
	}
	if fields, err := t.reorganizeRequeryResults(ctx, queryResult.GetFieldsData(), []*schemapb.IDs{t.result.Results.Ids}); err != nil {
		return err
	} else {
		t.result.Results.FieldsData = fields[0]
	}
} else {
	ctx, sp := otel.Tracer(typeutil.ProxyRole).Start(ctx, "Proxy-call-rerank-function-udf")
Member

nit: duplicated code in the if-else block; use an anonymous function for this

Contributor Author

fixed
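
One way to read the nit above, as a rough sketch rather than the change that was actually pushed: hoist the sequence both branches repeat into an anonymous function and call it from each branch. The `result` type, `fillFields`, and the `reorganize` callback are hypothetical stand-ins, not Milvus code.

```go
package main

import (
	"errors"
	"fmt"
)

// result is a hypothetical stand-in for the search result the task fills in.
type result struct {
	fields []string
}

// fillFields lets both branches share one closure instead of repeating the
// "reorganize, check the error, assign" sequence in each branch.
func fillFields(r *result, requery bool, reorganize func(src string) ([][]string, error)) error {
	// apply holds the code that was previously duplicated in the if and else arms.
	apply := func(src string) error {
		fields, err := reorganize(src)
		if err != nil {
			return err
		}
		if len(fields) == 0 {
			return errors.New("no fields returned")
		}
		r.fields = fields[0]
		return nil
	}

	if requery {
		return apply("requery")
	}
	return apply("search")
}

func main() {
	r := &result{}
	err := fillFields(r, true, func(src string) ([][]string, error) {
		return [][]string{{src + "-field"}}, nil
	})
	fmt.Println(r.fields, err)
}
```

Only the branch-specific input differs between the two calls; error handling and assignment live in one place.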

inputType := schemapb.DataType_None
for _, field := range collSchema.Fields {
	if field.Name == base.GetInputFieldNames()[0] {
		inputType = field.DataType
Member

Why don't we store the data type in the base?

Contributor Author

fixed

	}
}
if inputType != schemapb.DataType_VarChar {
	return nil, fmt.Errorf("Rerank model only supports varchar, but got [%s]", inputType.String())
Member

Do we only support VarChar-based reranking?

Contributor Author

The rerank model only supports text, and Milvus can currently only store text as VarChar.

headers := map[string]string{
	"Content-Type": "application/json",
}
body, err := utils.RetrySend(ctx, requestBody, http.MethodPost, base.url, headers, 3, 1)
Member

Let's add an exponential sleep time for retry

Contributor Author

fixed
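
The follow-up change is not shown in this thread, so the block below only illustrates the general technique the reviewer asked for: wait longer after each failed attempt, doubling the delay up to a cap. `backoffDelay`, `retryWithBackoff`, and `demoRetry` are hypothetical names and are not the `utils.RetrySend` API.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// backoffDelay returns base doubled once per prior attempt, capped at max.
// attempt is zero-based: attempt 0 waits base, attempt 1 waits 2*base, and so on.
func backoffDelay(base, max time.Duration, attempt int) time.Duration {
	d := base
	for i := 0; i < attempt && d < max; i++ {
		d *= 2
	}
	if d > max {
		return max
	}
	return d
}

// retryWithBackoff calls fn until it succeeds, attempts run out, or ctx is done.
func retryWithBackoff(ctx context.Context, maxAttempts int, base, max time.Duration, fn func() error) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = fn(); err == nil {
			return nil
		}
		if attempt == maxAttempts-1 {
			break // no point sleeping after the final failure
		}
		timer := time.NewTimer(backoffDelay(base, max, attempt))
		select {
		case <-timer.C:
		case <-ctx.Done():
			timer.Stop()
			return ctx.Err()
		}
	}
	return fmt.Errorf("all %d attempts failed: %w", maxAttempts, err)
}

// demoRetry simulates an endpoint that fails twice and then succeeds,
// and reports how many calls were made.
func demoRetry() (int, error) {
	calls := 0
	err := retryWithBackoff(context.Background(), 3, time.Millisecond, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errors.New("temporary failure")
		}
		return nil
	})
	return calls, err
}

func main() {
	calls, err := demoRetry()
	fmt.Println("calls:", calls, "err:", err)
}
```

Production retry loops often add random jitter to the delay so that many clients do not retry in lockstep; that is left out here to keep the sketch short.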

Comment on lines +344 to +362
for _, col := range cols {
	texts := col.data[0].([]string)
	ids := col.ids.([]T)
	for idx, id := range ids {
		if _, ok := uniqueData[id]; !ok {
			uniqueData[id] = texts[idx]
		}
	}
}
ids := make([]T, 0, len(uniqueData))
texts := make([]string, 0, len(uniqueData))
for id, text := range uniqueData {
	ids = append(ids, id)
	texts = append(texts, text)
}
scores, err := model.provider.rerank(ctx, query, texts)
if err != nil {
	return nil, err
}

rerankScores := map[T]float32{}
for idx, id := range ids {
	rerankScores[id] = scores[idx]
}
return newIDScores(rerankScores, searchParams), nil
Member

So we only rescore instead of reranking?

Contributor Author

The rerank model re-scores based on the query and doc content
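
The excerpt does not show what `newIDScores` does with the score map. Assuming the new scores are what determine the final order, re-scoring and reranking come to the same thing once the candidates are sorted by the model's score. A minimal sketch of that step, with the hypothetical names `scoredID` and `rankByScore`:

```go
package main

import (
	"fmt"
	"sort"
)

// scoredID pairs a primary key with the score the rerank model returned for it.
type scoredID[T comparable] struct {
	ID    T
	Score float32
}

// rankByScore zips ids with scores, sorts by descending score, and keeps at
// most limit entries (a negative limit keeps everything).
func rankByScore[T comparable](ids []T, scores []float32, limit int) ([]scoredID[T], error) {
	if len(ids) != len(scores) {
		return nil, fmt.Errorf("got %d ids but %d scores", len(ids), len(scores))
	}
	ranked := make([]scoredID[T], len(ids))
	for i, id := range ids {
		ranked[i] = scoredID[T]{ID: id, Score: scores[i]}
	}
	// A stable sort keeps the incoming order for equal scores.
	sort.SliceStable(ranked, func(a, b int) bool {
		return ranked[a].Score > ranked[b].Score
	})
	if limit >= 0 && limit < len(ranked) {
		ranked = ranked[:limit]
	}
	return ranked, nil
}

func main() {
	ranked, err := rankByScore([]int64{1, 2, 3}, []float32{0.2, 0.9, 0.5}, 2)
	fmt.Println(ranked, err)
}
```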

Comment on lines +344 to +345
for _, col := range cols {
	texts := col.data[0].([]string)
	ids := col.ids.([]T)
	for idx, id := range ids {
		if _, ok := uniqueData[id]; !ok {
			uniqueData[id] = texts[idx]
		}
	}
Member

Why would we have duplicates here?

Contributor Author

To reduce the model's token consumption.
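
As a small illustration of that reasoning, assuming the same ID can appear in more than one result column: sending each distinct document to the model once means a repeated ID costs no extra tokens. The `column` type and `dedupeByID` are hypothetical. Unlike the quoted code, which collects into a map and then ranges over it, this variant keeps first-seen order so its output is deterministic.

```go
package main

import "fmt"

// column is a hypothetical stand-in for one result column: parallel slices
// of primary keys and the text stored for each key.
type column[T comparable] struct {
	ids   []T
	texts []string
}

// dedupeByID returns each ID once, with the first text seen for it,
// preserving the order in which IDs first appear.
func dedupeByID[T comparable](cols []column[T]) ([]T, []string) {
	seen := make(map[T]struct{})
	ids := make([]T, 0)
	texts := make([]string, 0)
	for _, col := range cols {
		for i, id := range col.ids {
			if i >= len(col.texts) {
				break // malformed column: more ids than texts
			}
			if _, ok := seen[id]; ok {
				continue // already queued for the model
			}
			seen[id] = struct{}{}
			ids = append(ids, id)
			texts = append(texts, col.texts[i])
		}
	}
	return ids, texts
}

func main() {
	ids, texts := dedupeByID([]column[int64]{
		{ids: []int64{1, 2}, texts: []string{"a", "b"}},
		{ids: []int64{2, 3}, texts: []string{"b", "c"}},
	})
	fmt.Println(ids, texts)
}
```

The scores returned for the deduplicated texts can then be mapped back to every occurrence of an ID by key.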

@mergify mergify bot removed the ci-passed label May 28, 2025
@junjiejiangjjj junjiejiangjjj force-pushed the vllm-tei branch 2 times, most recently from 57eb010 to b03cfe1 Compare May 28, 2025 05:18
Signed-off-by: junjie.jiang <[email protected]>
Contributor

mergify bot commented May 28, 2025

@junjiejiangjjj go-sdk check failed, comment rerun go-sdk can trigger the job again.

@junjiejiangjjj
Contributor Author

rerun go-sdk

@mergify mergify bot added the ci-passed label May 28, 2025
Member

@liliu-z liliu-z left a comment

/lgtm
/approve

@sre-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: junjiejiangjjj, liliu-z

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sre-ci-robot sre-ci-robot merged commit 4202c77 into milvus-io:master May 28, 2025
20 checks passed
@junjiejiangjjj junjiejiangjjj deleted the vllm-tei branch May 28, 2025 11:18
Labels
approved, area/test, ci-passed, dco-passed, kind/feature, lgtm, sig/testing, size/XXL
3 participants