refacto(rerank): rerank endpoint clean archi refactoring #905
Open
leoguillaume wants to merge 7 commits into
Overview
The previous model forwarding path concentrated several responsibilities in infrastructure classes. The gateway selected providers, forwarded HTTP requests, adapted provider payloads, computed usage, updated metrics and relied on request context side effects. This made the forwarding flow hard to test in isolation and forced business rules to be spread across infrastructure objects.
The refactoring moves orchestration back to the use case layer. The use case now describes the complete business flow: load the user and router, validate access and router type, select a provider, build the endpoint adapter, compute prompt tokens, apply rate limits, forward the request, format the response, compute usage and log metrics. Infrastructure implementations remain replaceable details behind domain contracts.
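The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this branch: only the `CreateRerankUseCase` and `build_adapter` names come from the PR description, while `WordCountAdapter`, `AccessDenied`, the dictionary-based user and router stores, the "first provider" selection and the callable signatures are all assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable


class AccessDenied(Exception):
    """Hypothetical error raised when a user may not call a router."""


class WordCountAdapter:
    """Toy endpoint adapter (assumption): whitespace token counting."""

    def prompt_tokens(self, query, documents):
        return sum(len(text.split()) for text in [query, *documents])

    def format_request(self, query, documents):
        return {"query": query, "documents": documents}

    def format_response(self, raw):
        return sorted(raw["scores"], key=lambda item: item["score"], reverse=True)

    def usage(self, prompt_tokens):
        # Rerank produces no completion tokens in this sketch.
        return {"prompt_tokens": prompt_tokens, "total_tokens": prompt_tokens}


@dataclass
class CreateRerankUseCase:
    users: dict                  # user_id -> {"allowed_routers": [...]}
    routers: dict                # name -> {"type": ..., "providers": [...]}
    build_adapter: Callable      # (source_endpoint, provider_type) -> adapter
    check_rate_limit: Callable   # (user_id, prompt_tokens) -> None, raises on excess
    forward: Callable            # (provider, payload) -> raw provider response
    log_metrics: Callable        # (usage) -> None

    def execute(self, user_id, router_name, query, documents):
        # 1. Load the user and router, then validate access and router type.
        user = self.users[user_id]
        router = self.routers[router_name]
        if router_name not in user["allowed_routers"]:
            raise AccessDenied(router_name)
        if router["type"] != "rerank":
            raise ValueError("router does not serve rerank requests")
        # 2. Select a provider (placeholder strategy) and build its adapter.
        provider = router["providers"][0]
        adapter = self.build_adapter("rerank", provider["type"])
        # 3. Compute prompt tokens and apply rate limits before forwarding.
        prompt_tokens = adapter.prompt_tokens(query, documents)
        self.check_rate_limit(user_id, prompt_tokens)
        # 4. Forward, format the response, compute usage, log metrics.
        raw = self.forward(provider, adapter.format_request(query, documents))
        results = adapter.format_response(raw)
        usage = adapter.usage(prompt_tokens)
        self.log_metrics(usage)
        return {"results": results, "usage": usage}


# Usage with in-memory fakes: no HTTP, database or request context needed.
logged = []
use_case = CreateRerankUseCase(
    users={1: {"allowed_routers": ["reranker"]}},
    routers={"reranker": {"type": "rerank", "providers": [{"type": "fake"}]}},
    build_adapter=lambda endpoint, provider_type: WordCountAdapter(),
    check_rate_limit=lambda user_id, tokens: None,
    forward=lambda provider, payload: {
        "scores": [
            {"index": i, "score": float(len(doc))}
            for i, doc in enumerate(payload["documents"])
        ]
    },
    log_metrics=logged.append,
)
response = use_case.execute(1, "reranker", "clean architecture", ["a b", "a b c d"])
```

Because every step is an explicit call on an injected dependency, each one can be replaced by a fake in a unit test, which is the isolation the previous gateway made difficult.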
This branch applies the pattern first to the rerank endpoint. The same structure is intended to be reused by the other model forwarding endpoints.
The new architecture follows the principles of clean architecture: the model forwarding use case contains the business logic and directly orchestrates the domain abstractions. Dependencies remain simple to call and understand. Objects do not call each other implicitly; instead, they expose focused contracts that the use case composes explicitly.
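As an illustration of what "focused contracts" could look like, here is a small sketch using `typing.Protocol`. The contract names (`ProviderClient`, `MetricsLogger`), their method signatures and the fake implementations are hypothetical and are not taken from this branch; the point is only that neither object knows about the other and the caller does the composition.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ProviderClient(Protocol):
    """Hypothetical contract: send a formatted payload, return the raw response."""

    def forward(self, url: str, payload: dict) -> dict: ...


@runtime_checkable
class MetricsLogger(Protocol):
    """Hypothetical contract: record usage, nothing else."""

    def log(self, usage: dict) -> None: ...


class FakeProviderClient:
    """In-memory stand-in for an HTTP client; records what it was asked to send."""

    def __init__(self, canned_response: dict):
        self.canned_response = canned_response
        self.sent = []

    def forward(self, url, payload):
        self.sent.append((url, payload))
        return self.canned_response


class InMemoryMetrics:
    def __init__(self):
        self.entries = []

    def log(self, usage):
        self.entries.append(usage)


def forward_and_log(client: ProviderClient, metrics: MetricsLogger, url, payload):
    """The caller composes both contracts; the client never touches metrics."""
    raw = client.forward(url, payload)
    metrics.log({"requests": 1})
    return raw


client = FakeProviderClient({"scores": []})
metrics = InMemoryMetrics()
raw = forward_and_log(client, metrics, "http://provider.invalid/rerank", {"query": "q"})
```

Swapping `FakeProviderClient` for a real HTTP implementation would not require changing `forward_and_log`, since it depends only on the contract.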
Major changes
- `CreateRerankUseCase` owns the forwarding sequence and calls each dependency explicitly. This keeps the business flow readable in one place and avoids hidden calls between provider, router, metrics and HTTP objects.
- `HttpProviderClient` only sends an already formatted request to the selected provider and returns the raw provider response or a model error. It no longer owns endpoint selection, usage computation, metrics or rate limiting.
- `build_adapter` selects the right adapter from the source endpoint and provider type, while common usage computation stays in the base adapter.
Breaking changes:
Check lists
Review checklist
Before requesting a review, please take a moment to confirm that the following aspects have been considered and addressed. This section helps ensure the PR is ready for review, safe to merge, and deployable. If any items are left unchecked, please add a brief explanation for context.
If `api/sql/models.py` has been modified, please confirm that the following steps have been completed:
Deployment checklist
For each of the following items, please confirm if the PR concerns the deployment of the changes:
If new or updated environment variables are required, please list them here, otherwise delete this part. If other special deployment steps are required, please describe them here, otherwise delete this part.
Additional Notes
Please provide any additional information or context that may be relevant to this PR, otherwise delete this part. This could be any specific areas you would like the reviewers to focus on during their review of this PR (complex logic, risky changes, performance-sensitive code, etc.)