
security: validate looper breadth_schedule and modelRefs limits#1457

Open
yossiovadia wants to merge 2 commits into vllm-project:main from yossiovadia:fix/looper-config-validation

Conversation

@yossiovadia
Collaborator

Summary

Fixes #1456 — the ReMoM algorithm accepted arbitrary breadth_schedule values with no upper bound. A config like breadth_schedule: [32, 16, 8, 4, 4] triggers 65 backend calls (the schedule's sum of 64, plus one final synthesis round) for a single user request, a potential cost explosion.

Fix

Add config-time validation for looper cost bounds:

  • breadth_schedule total limit: rejects configs where total backend calls exceed 64 (sum of schedule + 1 final round)
  • Positive values: rejects zero or negative values in breadth_schedule
  • modelRefs warning: logs a warning when a decision has more than 10 modelRefs, since each looper request may trigger up to one backend call per modelRef

Validation runs at config parse time (startup), so invalid configs are caught before any requests are processed.
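As a minimal sketch of the cost arithmetic the validator enforces (the helper name here is illustrative, not taken from the PR), the total backend-call count is the sum of the schedule plus one final synthesis round:

```go
package main

import "fmt"

// totalBackendCalls sums a ReMoM breadth schedule and adds the final
// synthesis round. Hypothetical helper; the PR inlines this logic.
func totalBackendCalls(breadthSchedule []int) int {
	total := 1 // final synthesis round
	for _, breadth := range breadthSchedule {
		total += breadth
	}
	return total
}

func main() {
	// The schedule from the issue: 32+16+8+4+4 = 64, plus 1 = 65,
	// which exceeds the cap of 64 and is rejected at config parse time.
	fmt.Println(totalBackendCalls([]int{32, 16, 8, 4, 4}))
}
```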

Changes

| File | Change |
| --- | --- |
| pkg/config/validator.go | Add ReMoM breadth_schedule validation + modelRefs count warning |
| pkg/config/validator_test.go | 3 unit tests: accepts reasonable, rejects excessive, rejects zero |

2 files, 89 insertions.

Test plan

  • make build-router passes
  • golangci-lint — 0 issues on changed files
  • test_accepts_remom_with_reasonable_breadth_schedule — PASS
  • test_rejects_remom_with_excessive_breadth_schedule — PASS
  • test_rejects_remom_with_zero_breadth_value — PASS
  • All 283 config tests pass

@netlify

netlify bot commented Mar 6, 2026

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 728069f
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/69bb026cbfb2420008f779ef
😎 Deploy Preview https://deploy-preview-1457--vllm-semantic-router.netlify.app

@github-actions
Contributor

github-actions bot commented Mar 6, 2026

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/config/validator.go
  • src/semantic-router/pkg/config/validator_test.go

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/agent/structure-rules.yaml


🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

Contributor

Copilot AI left a comment


Pull request overview

Adds config-parse-time safeguards to prevent ReMoM/looper configurations from accidentally creating excessive backend fanout (cost amplification).

Changes:

  • Add a warning when a decision declares more than 10 modelRefs.
  • Validate ReMoM breadth_schedule values are positive and cap total backend calls to 64 (including final synthesis round).
  • Add unit tests covering accept/reject paths for ReMoM breadth_schedule.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/semantic-router/pkg/config/validator.go | Adds config-time warning for high modelRefs counts and enforces ReMoM breadth_schedule positivity + total-call limit. |
| src/semantic-router/pkg/config/validator_test.go | Adds tests for reasonable/excessive/zero ReMoM breadth_schedule values. |


Comment on lines +371 to +383

```go
totalCalls := 1 // final synthesis round
for _, breadth := range algorithm.ReMoM.BreadthSchedule {
	if breadth <= 0 {
		return fmt.Errorf("decision '%s': remom.breadth_schedule values must be positive, got %d", decisionName, breadth)
	}
	totalCalls += breadth
}
const maxTotalCalls = 64
if totalCalls > maxTotalCalls {
	return fmt.Errorf("decision '%s': remom.breadth_schedule would trigger %d backend calls per request (max %d). "+
		"Reduce breadth_schedule values to limit cost",
		decisionName, totalCalls, maxTotalCalls)
}
```

Copilot AI Mar 8, 2026


The accumulation into totalCalls can overflow int if a config supplies extremely large breadth_schedule values, potentially wrapping to a small/negative number and bypassing the max-call limit. Consider switching to a wider unsigned type for counting (e.g., uint64) and/or performing overflow-safe addition (e.g., check totalCalls > maxTotalCalls-breadth / early-return once the cap is exceeded) so the limit cannot be evaded via integer overflow.
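One way to address this, along the lines the reviewer suggests, is to reject each value before adding it, so the running total can never exceed the cap and therefore can never wrap. This is a sketch with an illustrative function name, not the PR's actual code:

```go
package main

import "fmt"

const maxTotalCalls = 64

// validateBreadthSchedule is an overflow-safe variant: it rejects
// non-positive values and fails fast once the running total would pass
// the cap, so even values near MaxInt cannot wrap the counter.
// Hypothetical helper name.
func validateBreadthSchedule(schedule []int) error {
	totalCalls := 1 // final synthesis round
	for _, breadth := range schedule {
		if breadth <= 0 {
			return fmt.Errorf("breadth_schedule values must be positive, got %d", breadth)
		}
		// Compare before adding: maxTotalCalls-totalCalls is >= 0 here,
		// so this check cannot itself overflow.
		if breadth > maxTotalCalls-totalCalls {
			return fmt.Errorf("breadth_schedule would exceed %d backend calls per request", maxTotalCalls)
		}
		totalCalls += breadth
	}
	return nil
}

func main() {
	fmt.Println(validateBreadthSchedule([]int{32, 16, 8, 4, 3})) // sum 63 + 1 = 64: allowed
	fmt.Println(validateBreadthSchedule([]int{1 << 62, 1 << 62}) != nil)
}
```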

Comment on lines +125 to +129

```go
// Looper algorithms (confidence, ratings) execute up to len(modelRefs) backend calls.
const maxModelRefs = 10
if len(decision.ModelRefs) > maxModelRefs {
	logging.Warnf("Decision '%s' has %d modelRefs (max recommended: %d). "+
		"Each looper request may trigger up to %d backend calls.",
```

Copilot AI Mar 8, 2026


This warning is emitted for any decision with >10 modelRefs, but the message specifically says "Each looper request" (and the comment mentions only looper algorithms). If the decision's algorithm is not a looper, this log line becomes misleading/noisy. Either (a) gate the warning to known looper algorithm types, or (b) reword the warning to be algorithm-agnostic (and optionally include algorithm.type in the message).

Suggested change

```diff
-// Looper algorithms (confidence, ratings) execute up to len(modelRefs) backend calls.
-const maxModelRefs = 10
-if len(decision.ModelRefs) > maxModelRefs {
-	logging.Warnf("Decision '%s' has %d modelRefs (max recommended: %d). "+
-		"Each looper request may trigger up to %d backend calls.",
+// Some algorithms may execute up to len(modelRefs) backend calls per decision request.
+const maxModelRefs = 10
+if len(decision.ModelRefs) > maxModelRefs {
+	logging.Warnf("Decision '%s' has %d modelRefs (max recommended: %d). "+
+		"A single request using this decision may trigger up to %d backend calls, depending on the algorithm.",
```

Comment on lines (end of test block):

```go
	}
	Expect(validateConfigStructure(cfg)).To(Succeed())
})
```


Copilot AI Mar 8, 2026


The new max-call logic is sensitive to off-by-one behavior (sum(schedule) + 1 final round). It would be valuable to add a boundary test that accepts exactly the limit (totalCalls == 64) to ensure > vs >= remains correct (e.g., a schedule whose sum is 63).

Suggested change

```go
It("accepts remom at max backend calls", func() {
	// sum(BreadthSchedule) == 63, plus 1 final round == 64 total calls (max allowed)
	cfg := &RouterConfig{
		IntelligentRouting: IntelligentRouting{
			Decisions: []Decision{{
				Name: "remom-max",
				ModelRefs: []ModelRef{{
					Model:                 "model-a",
					ModelReasoningControl: ModelReasoningControl{UseReasoning: boolPtr(false)},
				}},
				Algorithm: &AlgorithmConfig{
					Type:  "remom",
					ReMoM: &ReMoMAlgorithmConfig{BreadthSchedule: []int{32, 16, 8, 4, 3}},
				},
			}},
		},
	}
	Expect(validateConfigStructure(cfg)).To(Succeed())
})
```

Contributor

Copilot AI left a comment


Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

…-project#1456)

ReMoM algorithm accepted arbitrary breadth_schedule values with no
upper bound. A config like breadth_schedule: [100] would fire 101
backend calls for one user request. Similarly, decisions could have
unlimited modelRefs, enabling cost amplification via confidence/ratings.

Fix: add config validation for looper cost bounds.

- Validate breadth_schedule total does not exceed 64 backend calls
- Validate individual breadth_schedule values are positive (no zeros)
- Warn when modelRefs count exceeds 10 per decision
- Validation runs at config parse time (startup, not runtime)

Adds 3 Go unit tests:
- accepts reasonable breadth_schedule (sum under 64)
- rejects excessive breadth_schedule (sum over 64)
- rejects zero breadth value

Fixes vllm-project#1456

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
- Extract validateReMoMBreadthSchedule to its own function (keeps
  validateDecisionAlgorithmConfig within baseline ratchet)
- Condense modelRefs count warning (keeps validateConfigStructure
  within baseline ratchet)
- Move ReMoM tests to separate Describe block
- Add validator_test.go to legacy_hotspots (pre-existing 286-line
  Describe block was already over the 80-line limit on main)

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
@yossiovadia yossiovadia force-pushed the fix/looper-config-validation branch from 586121f to 728069f Compare March 18, 2026 19:52


Development

Successfully merging this pull request may close these issues.

security: no limits on looper breadth_schedule or modelRefs count — cost explosion risk

6 participants