-
Notifications
You must be signed in to change notification settings - Fork 2
Improve reranker API #64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe changes introduce a structured Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant RerankPipeline
participant RerankModel
participant ModelImpl
User->>RerankPipeline: rerank(query, documents)
activate RerankPipeline
RerankPipeline->>RerankModel: rerank(tokenizer, query, documents)
activate RerankModel
RerankModel->>ModelImpl: rerank_documents(tokenizer, query, documents)
activate ModelImpl
ModelImpl-->>RerankModel: Vec<RerankResult>
deactivate ModelImpl
RerankModel-->>RerankPipeline: Vec<RerankResult>
deactivate RerankModel
RerankPipeline-->>User: Vec<RerankResult>
deactivate RerankPipeline
Possibly related PRs
Poem
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Clippy (1.86.0)error: failed to get Caused by: Caused by: Caused by: Caused by: 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
README.md (1)
96-107: Fix heading format per static analysis.The content looks great, but line 97 should use a proper heading format instead of emphasis.
-#### **Reranking Pipeline** +#### Reranking Pipeline
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
README.md(2 hunks)examples/reranker.rs(1 hunks)src/models/implementations/qwen3_reranker.rs(4 hunks)src/pipelines/reranker_pipeline/mod.rs(1 hunks)src/pipelines/reranker_pipeline/reranker_model.rs(2 hunks)src/pipelines/reranker_pipeline/reranker_pipeline.rs(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
src/pipelines/reranker_pipeline/reranker_model.rs (2)
src/pipelines/reranker_pipeline/reranker_pipeline.rs (1)
batch_rerank(26-28)src/models/implementations/qwen3_reranker.rs (3)
batch_rerank(169-183)batch_rerank(215-222)new(197-199)
src/pipelines/reranker_pipeline/reranker_pipeline.rs (2)
src/pipelines/reranker_pipeline/reranker_model.rs (2)
rerank(13-18)batch_rerank(20-32)src/models/implementations/qwen3_reranker.rs (3)
rerank(201-208)batch_rerank(169-183)batch_rerank(215-222)
🪛 markdownlint-cli2 (0.17.2)
README.md
97-97: Emphasis used instead of a heading
(MD036, no-emphasis-as-heading)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Tests Development
🔇 Additional comments (18)
src/pipelines/reranker_pipeline/mod.rs (1)
7-7: LGTM! Clean export addition.The addition of
RerankResultto the public exports is necessary for external users to access the new structured result type.examples/reranker.rs (4)
24-24: LGTM! Correct async API usage.The addition of
.awaitproperly handles the asynchronous reranking call.
28-29: LGTM! Proper RerankResult field access.The update from tuple destructuring to accessing
result.indexandresult.scorecorrectly uses the new structured result type.
33-33: LGTM! Consistent async usage.Correct use of
.awaitfor thererank_top_kmethod.
35-36: LGTM! Consistent field access pattern.The iteration and field access pattern matches the previous update and correctly uses the new API.
README.md (1)
410-428: LGTM! Clear usage example.The example correctly demonstrates the async API usage and shows proper error handling with the
Resulttype.src/pipelines/reranker_pipeline/reranker_model.rs (3)
3-3: LGTM! Necessary import for the new result type.The import of
RerankResultis required for the trait method signatures.
18-18: LGTM! Improved type safety.The change from tuple
(usize, f32)toRerankResultprovides better type safety and clarity.
20-32: LGTM! Well-designed batch method with sensible default.The
batch_rerankmethod provides a clean API for processing multiple queries, and the default implementation correctly iterates over queries and collects results.src/pipelines/reranker_pipeline/reranker_pipeline.rs (3)
4-11: LGTM! Well-designed result struct.The
RerankResultstruct provides excellent type safety and clarity over the previous tuple approach. The documentation is clear and the fields are appropriately named.
31-35: Async usage is correct here.This method properly uses
.awaitwhen calling the asyncrerankmethod, which is correct ifrerankremains async.
41-42: LGTM! Correct RerankResult field access.The iteration and field access correctly use the new structured result type.
src/models/implementations/qwen3_reranker.rs (6)
82-87: LGTM! Method signature improvement enhances type safety.The renaming from
reranktorerank_documentsprovides better clarity about the method's purpose, and the return type change fromVec<(usize, f32)>toVec<RerankResult>improves type safety by replacing tuple-based results with a structured type.
92-92: LGTM! Correct adaptation to structured result type.The change from tuple construction to
RerankResultstruct construction is consistent with the new return type and maintains the same semantics with clearer field names.
96-96: LGTM! Sorting logic correctly adapted to structured type.The change from tuple field access (
b.1) to struct field access (b.score) maintains the same sorting behavior while working with the newRerankResulttype.
174-178: LGTM! Batch reranking method correctly updated for new API.The method correctly updates both the return type to use
Vec<Vec<RerankResult>>and the internal method call to use the renamedrerank_documentsmethod, maintaining consistency with the API improvements.
192-192: LGTM! Import correctly added for new structured type.The import for
RerankResultis correctly added to support the new structured return type throughout the implementation.
206-223: LGTM! Trait implementation correctly updated for new API.The trait implementation correctly:
- Updates the
rerankmethod to delegate torerank_documentsand returnVec<RerankResult>- Adds the
batch_rerankmethod that properly forwards to the model's batch reranking functionality- Maintains consistent return types with the new structured API
The implementation follows proper delegation patterns and maintains the expected interface.
| pub async fn batch_rerank(&self, queries: &[&str], documents: &[&str]) -> anyhow::Result<Vec<Vec<RerankResult>>> { | ||
| self.model.batch_rerank(&self.tokenizer, queries, documents) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same async consistency issue.
The method is marked as async but calls a synchronous trait method without .await.
Apply the same fix as the rerank method above.
🤖 Prompt for AI Agents
In src/pipelines/reranker_pipeline/reranker_pipeline.rs around lines 26 to 28,
the batch_rerank method is marked async but calls a synchronous trait method
without awaiting. To fix this, remove the async keyword from the method
signature and return the synchronous call result directly, matching the approach
used in the rerank method above for consistency.
| pub async fn get_scores(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<f32>> { | ||
| let ranked = self.model.rerank(&self.tokenizer, query, documents)?; | ||
| let mut scores = vec![0.0f32; documents.len()]; | ||
|
|
||
| for (idx, score) in ranked { | ||
| scores[idx] = score; | ||
| for r in ranked { | ||
| scores[r.index] = r.score; | ||
| } | ||
|
|
||
| Ok(scores) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another async consistency issue.
The method is marked as async but calls a synchronous trait method without .await and doesn't use any async operations.
- pub async fn get_scores(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<f32>> {
+ pub fn get_scores(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<f32>> {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub async fn get_scores(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<f32>> { | |
| let ranked = self.model.rerank(&self.tokenizer, query, documents)?; | |
| let mut scores = vec![0.0f32; documents.len()]; | |
| for (idx, score) in ranked { | |
| scores[idx] = score; | |
| for r in ranked { | |
| scores[r.index] = r.score; | |
| } | |
| Ok(scores) | |
| } | |
| pub fn get_scores(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<f32>> { | |
| let ranked = self.model.rerank(&self.tokenizer, query, documents)?; | |
| let mut scores = vec![0.0f32; documents.len()]; | |
| for r in ranked { | |
| scores[r.index] = r.score; | |
| } | |
| Ok(scores) | |
| } |
🤖 Prompt for AI Agents
In src/pipelines/reranker_pipeline/reranker_pipeline.rs around lines 38 to 45,
the get_scores method is marked async but calls a synchronous rerank method
without awaiting and performs no async operations. To fix this, remove the async
keyword from the get_scores method signature and adjust its return type to a
synchronous Result, since no asynchronous code is used inside the method.
| pub async fn rerank(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<RerankResult>> { | ||
| self.model.rerank(&self.tokenizer, query, documents) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix async consistency issue.
The method is marked as async but calls a synchronous trait method without .await. This creates unnecessary overhead and is misleading to users.
Either make the trait method async or remove the async from the pipeline method:
- pub async fn rerank(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<RerankResult>> {
+ pub fn rerank(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<RerankResult>> {If you choose to keep async, ensure the trait method is also async and add .await.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| pub async fn rerank(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<RerankResult>> { | |
| self.model.rerank(&self.tokenizer, query, documents) | |
| } | |
| pub fn rerank(&self, query: &str, documents: &[&str]) -> anyhow::Result<Vec<RerankResult>> { | |
| self.model.rerank(&self.tokenizer, query, documents) | |
| } |
🤖 Prompt for AI Agents
In src/pipelines/reranker_pipeline/reranker_pipeline.rs around lines 21 to 23,
the rerank method is marked async but calls a synchronous trait method without
awaiting, causing misleading async usage. To fix this, either remove the async
keyword from the rerank method if the trait method is synchronous, or update the
trait method to be async and add .await to the call inside rerank to maintain
async consistency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
examples/reranker.rs (1)
36-43: LGTM: Embedding generation is correct for the example.The sequential processing is appropriate for a small corpus. The mixed error handling (
expect()vs?) is acceptable for an example but could be made consistent.Consider using consistent error handling:
- .map(|doc| embed_pipe.embed(doc).expect("embedding failed")) + .map(|doc| embed_pipe.embed(doc)) + .collect::<Result<Vec<_>, _>>()?;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
Cargo.toml(1 hunks)examples/reranker.rs(2 hunks)test_reranker.rs(0 hunks)
💤 Files with no reviewable changes (1)
- test_reranker.rs
🧰 Additional context used
🧬 Code Graph Analysis (1)
examples/reranker.rs (1)
src/pipelines/reranker_pipeline/builder.rs (1)
qwen3(50-52)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Tests Development
🔇 Additional comments (7)
Cargo.toml (1)
30-30: LGTM: Dependency update supports async refactoring.The version bump and tokio feature addition are appropriate for the async pipeline refactoring described in the PR objectives.
examples/reranker.rs (6)
2-4: LGTM: Import additions support the two-stage retrieval approach.The new imports for
Qwen3Size,EmbeddingPipelineBuilder, andRerankPipelineBuilderare appropriate for the enhanced example that demonstrates both embedding and reranking pipelines.
6-11: LGTM: Cosine similarity implementation is mathematically correct.The implementation follows the standard cosine similarity formula. Note that zero vectors would cause division by zero, but this is acceptable for an example with real embeddings.
15-24: LGTM: Pipeline building demonstrates the new async API correctly.The async pipeline building with proper error handling and consistent model sizing (0.6B for both embedding and reranking) follows good practices.
45-63: LGTM: ANN recall stage implements proper similarity-based candidate selection.The cosine similarity computation, sorting (descending), and top-k selection correctly implement the first stage of the two-stage retrieval pipeline.
65-76: LGTM: Reranking stage correctly uses the new async API.The async rerank call and result handling demonstrate proper usage of the updated
RerankResultstructure. The index referencing of candidate documents is correct.
1-79: Excellent example enhancement demonstrating production-ready patterns.The refactoring from direct reranking to a two-stage approach (embedding recall + reranking) is a significant improvement that:
- Demonstrates realistic usage: ANN recall followed by precise reranking is a common production pattern
- Shows API evolution: Proper usage of the new async pipeline APIs and
RerankResultstructure- Provides educational value: Clear separation of concerns and informative output
This example effectively showcases the enhanced reranker API capabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
examples/reranker/main.rs (1)
78-81: Consider adding error handling for individual embedding failures.The current implementation uses
expect()which will panic if any single document fails to embed. Consider using proper error handling to make the example more robust.- let doc_embeddings: Vec<Vec<f32>> = documents - .iter() - .map(|doc| embed_pipe.embed(doc).expect("embedding failed")) - .collect(); + let doc_embeddings: Result<Vec<Vec<f32>>> = documents + .iter() + .map(|doc| embed_pipe.embed(doc)) + .collect(); + let doc_embeddings = doc_embeddings?;
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
examples/reranker.rs(0 hunks)examples/reranker/corpus/machine_learning_doc.txt(1 hunks)examples/reranker/corpus/math_doc.txt(1 hunks)examples/reranker/corpus/misc_doc.txt(1 hunks)examples/reranker/corpus/physics_doc.txt(1 hunks)examples/reranker/main.rs(1 hunks)src/models/implementations/qwen3_embeddings.rs(5 hunks)
💤 Files with no reviewable changes (1)
- examples/reranker.rs
✅ Files skipped from review due to trivial changes (1)
- examples/reranker/corpus/machine_learning_doc.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- src/models/implementations/qwen3_embeddings.rs
🧰 Additional context used
🪛 LanguageTool
examples/reranker/corpus/math_doc.txt
[uncategorized] ~1-~1: A comma might be missing here.
Context: ...nships and solving problems across many fields including science, engineering, economi...
(AI_EN_LECTOR_MISSING_PUNCTUATION_COMMA)
examples/reranker/corpus/misc_doc.txt
[uncategorized] ~3-~3: Possible missing comma found.
Context: ...aking requires precise measurements and timing since it relies on chemical reactions b...
(AI_HYDRA_LEO_MISSING_COMMA)
[uncategorized] ~5-~5: You might be missing the article “the” here.
Context: ...ns are driven by the unequal heating of Earth's surface by the sun. Air masses with d...
(AI_EN_LECTOR_MISSING_DETERMINER_THE)
[uncategorized] ~5-~5: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ate various weather phenomena. High and low pressure systems move across the globe, bringing...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[style] ~11-~11: This phrase is redundant. Consider writing “moments” or “times”.
Context: ...ture generations. Photography captures moments in time and allows creative expression through ...
(MOMENT_IN_TIME)
examples/reranker/corpus/physics_doc.txt
[uncategorized] ~3-~3: Possible missing comma found.
Context: ...erything from car engines to spacecraft trajectories and remain accurate for everyday-scale ...
(AI_HYDRA_LEO_MISSING_COMMA)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Tests Development
🔇 Additional comments (8)
examples/reranker/corpus/math_doc.txt (1)
1-14: Well-structured educational content for reranking corpus.The mathematics document provides comprehensive coverage of key mathematical disciplines with clear, accurate descriptions. The content is well-suited for use as corpus material in the reranking pipeline example.
examples/reranker/corpus/physics_doc.txt (1)
1-14: Comprehensive physics overview suitable for corpus use.The physics document covers fundamental concepts from classical mechanics to modern physics with accurate descriptions. The content provides good diversity for testing reranking capabilities across different physics topics.
examples/reranker/corpus/misc_doc.txt (1)
1-14: Diverse topic coverage enhances corpus variety.The miscellaneous document provides excellent topical diversity covering cooking, weather, gardening, travel, photography, and music theory. This variety will help test the reranking pipeline's ability to handle different domains effectively.
examples/reranker/main.rs (5)
9-14: Cosine similarity implementation is correct.The cosine similarity function properly computes dot product and norms, handling the mathematical formula correctly for vector similarity comparison.
21-29: Pipeline building demonstrates the new async API well.The builder pattern usage for both embedding and reranking pipelines is clean and follows the expected async initialization pattern. The CPU configuration is appropriate for the example.
40-71: Robust corpus loading with appropriate fallback.The corpus loading logic handles both file-based and fallback scenarios well. The error handling for directory reading and file operations is appropriate, and the fallback to hardcoded examples ensures the demo works even without corpus files.
130-130: Async reranking usage follows the new API correctly.The reranking call properly uses the async interface and demonstrates the expected usage pattern with the new RerankResult type.
133-141: Result display properly uses the new RerankResult structure.The display logic correctly accesses the
scoreandindexfields from theRerankResultstruct, demonstrating proper usage of the new API.
Summary
RerankResulttype and make pipeline methods asyncRerankModeltraitQwen3RerankModelimplementation for new APITesting
cargo checkcargo test --libhttps://chatgpt.com/codex/tasks/task_e_686b721f9b248330ae075fe11fbfee2c
Summary by CodeRabbit
New Features
Documentation
Refactor
Chores
hf-hubcrate to a newer version with added features.