-
Notifications
You must be signed in to change notification settings - Fork 2
Identify cleanup opportunities in codebase #74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Identify cleanup opportunities in codebase #74
Conversation
Co-authored-by: lucienthomas00 <[email protected]>
Co-authored-by: lucienthomas00 <[email protected]>
WalkthroughThis change introduces a shared builder trait, Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant PipelineBuilder
participant BasePipelineBuilder
participant ModelCache
participant DeviceResolver
participant Tokenizer
participant Pipeline
User->>PipelineBuilder: call build()
PipelineBuilder->>BasePipelineBuilder: (trait default) build()
BasePipelineBuilder->>DeviceResolver: resolve device
BasePipelineBuilder->>ModelCache: get/create model with cache key
BasePipelineBuilder->>Tokenizer: get tokenizer
BasePipelineBuilder->>Pipeline: construct pipeline
BasePipelineBuilder-->>User: return Pipeline
Possibly related PRs
Suggested labels
Poem
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Clippy (1.86.0)error: failed to get Caused by: Caused by: Caused by: Caused by: ✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/pipelines/utils/builder.rs (1)
92-94: Consider the visibility of struct fieldsThe
pub(crate)visibility limits external crates from directly accessing these fields. While this provides good encapsulation, consider if downstream users might need more flexibility in the future.If external extensibility becomes necessary, you could add public getter methods or reconsider the field visibility.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
BUILDER_REFACTORING_SUMMARY.md(1 hunks)SPRING_CLEANING_ANALYSIS.md(1 hunks)src/pipelines/embedding_pipeline/builder.rs(2 hunks)src/pipelines/fill_mask_pipeline/builder.rs(2 hunks)src/pipelines/reranker_pipeline/builder.rs(2 hunks)src/pipelines/sentiment_analysis_pipeline/builder.rs(2 hunks)src/pipelines/text_generation_pipeline/builder.rs(1 hunks)src/pipelines/utils/builder.rs(1 hunks)src/pipelines/utils/mod.rs(1 hunks)src/pipelines/zero_shot_classification_pipeline/builder.rs(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/pipelines/sentiment_analysis_pipeline/builder.rs (5)
src/pipelines/utils/mod.rs (2)
build_cache_key(83-85)device(76-79)src/pipelines/fill_mask_pipeline/builder.rs (6)
options(35-37)device_request(39-41)create_model(43-45)new(12-17)get_tokenizer(47-49)construct_pipeline(51-53)src/pipelines/utils/builder.rs (6)
options(41-41)device_request(44-44)create_model(48-48)new(98-103)get_tokenizer(52-52)construct_pipeline(56-56)src/pipelines/sentiment_analysis_pipeline/pipeline.rs (1)
device(38-40)src/models/implementations/modernbert.rs (9)
device(710-712)device(778-780)device(875-877)device(1055-1057)device(1150-1152)device(1230-1232)get_tokenizer(750-758)get_tokenizer(774-776)get_tokenizer(1008-1016)
🪛 LanguageTool
SPRING_CLEANING_ANALYSIS.md
[style] ~60-~60: The word ‘Kinda’ is informal. Consider replacing it.
Context: ...ine_tests/tool_error_handling.rs:49` - "Kinda hacky" comment 2. **Debug Prints in Ex...
(KINDA)
[typographical] ~134-~134: If specifying a range, consider using an en dash instead of a hyphen.
Context: ...ls - Token processing functions with 4-5 levels of nesting 3. **Large Match Exp...
(HYPHEN_TO_EN)
[uncategorized] ~213-~213: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...docs --- ## Estimated Effort - High Priority Items: 2-3 weeks of focused developme...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[typographical] ~213-~213: If specifying a range, consider using an en dash instead of a hyphen.
Context: ...ed Effort** - High Priority Items: 2-3 weeks of focused development - **Medium...
(HYPHEN_TO_EN)
[uncategorized] ~214-~214: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...*: 2-3 weeks of focused development - Medium Priority Items: 1-2 weeks additional - **Low P...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[typographical] ~214-~214: If specifying a range, consider using an en dash instead of a hyphen.
Context: ...evelopment - Medium Priority Items: 1-2 weeks additional - **Low Priority Items...
(HYPHEN_TO_EN)
[uncategorized] ~215-~215: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...ority Items**: 1-2 weeks additional - Low Priority Items: 1 week additional Total: ...
(EN_COMPOUND_ADJECTIVE_INTERNAL)
[typographical] ~217-~217: If specifying a range, consider using an en dash instead of a hyphen.
Context: ... Items**: 1 week additional Total: 4-6 weeks for comprehensive cleanup while m...
(HYPHEN_TO_EN)
BUILDER_REFACTORING_SUMMARY.md
[typographical] ~28-~28: If specifying a range, consider using an en dash instead of a hyphen.
Context: ... - Removed duplicated build() method (12-15 lines each) - Implemented `BasePipeline...
(HYPHEN_TO_EN)
🪛 markdownlint-cli2 (0.17.2)
SPRING_CLEANING_ANALYSIS.md
11-11: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
34-34: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
56-56: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
78-78: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
102-102: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
125-125: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
147-147: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
170-170: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
192-192: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
198-198: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
204-204: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
BUILDER_REFACTORING_SUMMARY.md
11-11: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
34-34: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
56-56: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
78-78: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
102-102: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
125-125: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
147-147: Trailing punctuation in heading
Punctuation: ':'
(MD026, no-trailing-punctuation)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Run Tests Development
🔇 Additional comments (11)
src/pipelines/utils/mod.rs (1)
4-6: LGTM! Appropriate module structure for shared builder utilities.The re-export of builder utilities from the utils module provides a clean API for accessing the shared builder infrastructure.
src/pipelines/text_generation_pipeline/builder.rs (1)
10-15: Good documentation of architectural decision.The comment clearly explains why this builder is excluded from the shared pattern, which helps future maintainers understand the design rationale.
src/pipelines/sentiment_analysis_pipeline/builder.rs (1)
26-54: Clean implementation of the shared builder trait.The trait implementation correctly delegates to model-specific methods and maintains the same functionality as the previous explicit
build()method while reducing code duplication.src/pipelines/zero_shot_classification_pipeline/builder.rs (1)
26-54: Consistent implementation of the shared builder pattern.The trait implementation follows the same structure as other refactored builders, ensuring consistency across the codebase.
src/pipelines/reranker_pipeline/builder.rs (1)
27-58: Correct implementation with Arc wrapping for thread safety.The trait implementation properly handles the RerankPipeline's requirement for an Arc-wrapped model, maintaining the pipeline's thread-safe design while conforming to the shared builder pattern.
src/pipelines/fill_mask_pipeline/builder.rs (1)
26-54: Well-structured trait implementation!The implementation correctly adopts the shared builder pattern, properly defining all required associated types and methods. The trait bounds are appropriate for concurrent usage and caching.
src/pipelines/embedding_pipeline/builder.rs (1)
52-57: Correct pipeline-specific implementation!The Arc wrapper around the model is appropriately used here, likely due to EmbeddingPipeline's specific requirements for model sharing or cloning. This demonstrates good flexibility in the shared pattern.
SPRING_CLEANING_ANALYSIS.md (1)
193-194: Analysis directly addresses the current refactoring!Great to see that "Extract Builder Pattern Common Code" is correctly identified as a high-priority item. This PR successfully implements this recommendation by introducing the
BasePipelineBuildertrait.src/pipelines/utils/builder.rs (1)
27-84: Excellent trait design with clear separation of concerns!The
BasePipelineBuildertrait effectively captures the common pipeline building pattern with well-defined extension points. The defaultbuild()implementation successfully consolidates the duplicated logic while maintaining flexibility through the associated methods.BUILDER_REFACTORING_SUMMARY.md (2)
79-83: Accurate quantification of code reduction!The ~60% reduction claim is well-supported by the actual changes. This significant reduction in duplication will greatly improve maintainability and consistency across pipeline builders.
112-112: Documentation Verified for Text Generation Builder ExclusionThe added doc comment in
src/pipelines/text_generation_pipeline/builder.rsclearly explains why this builder doesn’t use the sharedBasePipelineBuildertrait—namely its more complex build pattern (multiple config options like temperature/top_p), async model creation, and distinct caching logic. No further changes are needed here.
Refactor pipeline builders to use a shared trait, eliminating duplicate build logic and centralizing common build patterns.
Summary by CodeRabbit
Refactor
Documentation