security: validate looper breadth_schedule and modelRefs limits#1457
security: validate looper breadth_schedule and modelRefs limits#1457yossiovadia wants to merge 2 commits intovllm-project:mainfrom
Conversation
✅ Deploy Preview for vllm-semantic-router ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
👥 vLLM Semantic Team NotificationThe following members have been identified for the changed files in this PR and have been automatically assigned: 📁
|
There was a problem hiding this comment.
Pull request overview
Adds config-parse-time safeguards to prevent ReMoM/looper configurations from accidentally creating excessive backend fanout (cost amplification).
Changes:
- Add a warning when a decision declares more than 10
modelRefs. - Validate ReMoM
breadth_schedulevalues are positive and cap total backend calls to 64 (including final synthesis round). - Add unit tests covering accept/reject paths for ReMoM
breadth_schedule.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/semantic-router/pkg/config/validator.go | Adds config-time warning for high modelRefs counts and enforces ReMoM breadth_schedule positivity + total-call limit. |
| src/semantic-router/pkg/config/validator_test.go | Adds tests for reasonable/excessive/zero ReMoM breadth_schedule values. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
| totalCalls := 1 // final synthesis round | ||
| for _, breadth := range algorithm.ReMoM.BreadthSchedule { | ||
| if breadth <= 0 { | ||
| return fmt.Errorf("decision '%s': remom.breadth_schedule values must be positive, got %d", decisionName, breadth) | ||
| } | ||
| totalCalls += breadth | ||
| } | ||
| const maxTotalCalls = 64 | ||
| if totalCalls > maxTotalCalls { | ||
| return fmt.Errorf("decision '%s': remom.breadth_schedule would trigger %d backend calls per request (max %d). "+ | ||
| "Reduce breadth_schedule values to limit cost", | ||
| decisionName, totalCalls, maxTotalCalls) | ||
| } |
There was a problem hiding this comment.
The accumulation into totalCalls can overflow int if a config supplies extremely large breadth_schedule values, potentially wrapping to a small/negative number and bypassing the max-call limit. Consider switching to a wider unsigned type for counting (e.g., uint64) and/or performing overflow-safe addition (e.g., check totalCalls > maxTotalCalls-breadth / early-return once the cap is exceeded) so the limit cannot be evaded via integer overflow.
| // Looper algorithms (confidence, ratings) execute up to len(modelRefs) backend calls. | ||
| const maxModelRefs = 10 | ||
| if len(decision.ModelRefs) > maxModelRefs { | ||
| logging.Warnf("Decision '%s' has %d modelRefs (max recommended: %d). "+ | ||
| "Each looper request may trigger up to %d backend calls.", |
There was a problem hiding this comment.
This warning is emitted for any decision with >10 modelRefs, but the message specifically says "Each looper request" (and the comment mentions only looper algorithms). If the decision's algorithm is not a looper, this log line becomes misleading/noisy. Either (a) gate the warning to known looper algorithm types, or (b) reword the warning to be algorithm-agnostic (and optionally include algorithm.type in the message).
| // Looper algorithms (confidence, ratings) execute up to len(modelRefs) backend calls. | |
| const maxModelRefs = 10 | |
| if len(decision.ModelRefs) > maxModelRefs { | |
| logging.Warnf("Decision '%s' has %d modelRefs (max recommended: %d). "+ | |
| "Each looper request may trigger up to %d backend calls.", | |
| // Some algorithms may execute up to len(modelRefs) backend calls per decision request. | |
| const maxModelRefs = 10 | |
| if len(decision.ModelRefs) > maxModelRefs { | |
| logging.Warnf("Decision '%s' has %d modelRefs (max recommended: %d). "+ | |
| "A single request using this decision may trigger up to %d backend calls, depending on the algorithm.", |
| } | ||
| Expect(validateConfigStructure(cfg)).To(Succeed()) | ||
| }) | ||
|
|
There was a problem hiding this comment.
The new max-call logic is sensitive to off-by-one behavior (sum(schedule) + 1 final round). It would be valuable to add a boundary test that accepts exactly the limit (totalCalls == 64) to ensure > vs >= remains correct (e.g., a schedule whose sum is 63).
| It("accepts remom at max backend calls", func() { | |
| // sum(BreadthSchedule) == 63, plus 1 final round == 64 total calls (max allowed) | |
| cfg := &RouterConfig{ | |
| IntelligentRouting: IntelligentRouting{ | |
| Decisions: []Decision{{ | |
| Name: "remom-max", | |
| ModelRefs: []ModelRef{{ | |
| Model: "model-a", | |
| ModelReasoningControl: ModelReasoningControl{UseReasoning: boolPtr(false)}, | |
| }}, | |
| Algorithm: &AlgorithmConfig{ | |
| Type: "remom", | |
| ReMoM: &ReMoMAlgorithmConfig{BreadthSchedule: []int{32, 16, 8, 4, 3}}, | |
| }, | |
| }}, | |
| }, | |
| } | |
| Expect(validateConfigStructure(cfg)).To(Succeed()) | |
| }) |
…-project#1456) ReMoM algorithm accepted arbitrary breadth_schedule values with no upper bound. A config like breadth_schedule: [100] would fire 101 backend calls for one user request. Similarly, decisions could have unlimited modelRefs, enabling cost amplification via confidence/ratings. Fix: add config validation for looper cost bounds. - Validate breadth_schedule total does not exceed 64 backend calls - Validate individual breadth_schedule values are positive (no zeros) - Warn when modelRefs count exceeds 10 per decision - Validation runs at config parse time (startup, not runtime) Adds 3 Go unit tests: - accepts reasonable breadth_schedule (sum under 64) - rejects excessive breadth_schedule (sum over 64) - rejects zero breadth value Fixes vllm-project#1456 Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
- Extract validateReMoMBreadthSchedule to its own function (keeps validateDecisionAlgorithmConfig within baseline ratchet) - Condense modelRefs count warning (keeps validateConfigStructure within baseline ratchet) - Move ReMoM tests to separate Describe block - Add validator_test.go to legacy_hotspots (pre-existing 286-line Describe block was already over the 80-line limit on main) Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
586121f to
728069f
Compare

Summary
Fixes #1456 — ReMoM algorithm accepted arbitrary
breadth_schedulevalues with no upper bound. A config likebreadth_schedule: [32, 16, 8, 4, 4]= 65 backend calls per single user request, causing potential cost explosion.Fix
Add config-time validation for looper cost bounds:
breadth_scheduletotal limit: rejects configs where total backend calls exceed 64 (sum of schedule + 1 final round)breadth_schedulemodelRefswarning: logs a warning when a decision has more than 10 modelRefs (each looper request may trigger up to N calls)Validation runs at config parse time (startup), so invalid configs are caught before any requests are processed.
Changes
pkg/config/validator.gopkg/config/validator_test.go2 files, 89 insertions.
Test plan
make build-routerpassesgolangci-lint— 0 issues on changed filestest_accepts_remom_with_reasonable_breadth_schedule— PASStest_rejects_remom_with_excessive_breadth_schedule— PASStest_rejects_remom_with_zero_breadth_value— PASS