
security: add max_evaluation_chars limit to prevent giant prompt DoS#1455

Open
yossiovadia wants to merge 2 commits into vllm-project:main from yossiovadia:fix/signal-eval-input-limits

Conversation

@yossiovadia
Collaborator

Summary

Fixes #1454 — signal evaluation latency grows super-linearly with prompt size (10K chars = 21s, 25K+ = timeout). A single client can make the router unresponsive.

Fix

Add max_evaluation_chars config with 8192-char default that truncates evaluation text before any signal processing:

# In router config (optional — default 8192)
max_evaluation_chars: 8192  # set to -1 to disable
  • Truncation at character level before compression/embedding/classification
  • 8192 chars ≈ 2K tokens, well within embedding model capacity
  • Does NOT truncate the request body — only text used for routing decisions
  • Complements prompt_compression (quality-aware NLP) with a hard safety bound

Changes

  • pkg/config/config.go: add MaxEvaluationChars field with documentation
  • pkg/extproc/req_filter_classification.go: truncate evaluationText before signal processing
2 files, 22 insertions.

Test plan

  • make build-router passes
  • golangci-lint — 0 issues on changed files
  • Default 8192 limit applied when config is omitted
  • Prompts under limit pass through unchanged
  • Prompts over limit truncated with warning log
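
The default-when-omitted behavior in the test plan can be sketched like this. The pointer-based "field omitted" detection and the names `routerConfig`/`effectiveLimit` are assumptions for illustration, not the PR's actual config code.

```go
package main

import "fmt"

// defaultMaxEvaluationChars mirrors the 8192-char default from the PR description.
const defaultMaxEvaluationChars = 8192

// routerConfig is a hypothetical stand-in for the router config struct;
// a nil pointer models "max_evaluation_chars omitted from the YAML".
type routerConfig struct {
	MaxEvaluationChars *int
}

// effectiveLimit applies the default when the field is omitted and passes
// explicit values (including -1, which disables the limit) through.
func effectiveLimit(c routerConfig) int {
	if c.MaxEvaluationChars == nil {
		return defaultMaxEvaluationChars
	}
	return *c.MaxEvaluationChars
}

func main() {
	fmt.Println(effectiveLimit(routerConfig{})) // default applied when omitted
	disabled := -1
	fmt.Println(effectiveLimit(routerConfig{MaxEvaluationChars: &disabled}))
}
```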

@netlify

netlify bot commented Mar 6, 2026

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit 3fdacb8
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/69bb0acde5525c00080fddc4
😎 Deploy Preview https://deploy-preview-1455--vllm-semantic-router.netlify.app

@yossiovadia yossiovadia force-pushed the fix/signal-eval-input-limits branch from 111c271 to 9e8c459 Compare March 6, 2026 20:43
@github-actions
Contributor

github-actions bot commented Mar 6, 2026

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/pkg/config/config.go
  • src/semantic-router/pkg/config/config_test.go
  • src/semantic-router/pkg/extproc/req_filter_classification.go

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/agent/structure-rules.yaml


🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@rootfs
Collaborator

rootfs commented Mar 9, 2026

@yossiovadia truncating a long prompt introduces attention loss; please check the prompt compression feature #1437, which reduces the seq len

@rootfs rootfs added the hold label Mar 9, 2026
@yossiovadia
Collaborator Author

Prompt compression helps with classification quality but doesn't protect against DoS. In our testing (#1454), a 25K char prompt caused the router to become completely unresponsive — health endpoint stopped responding. This happened because:

  1. Compression is disabled by default — most deployments have no protection
  2. Even when enabled, compression itself is O(n²) on the input sentences — a giant prompt overwhelms the compression step before it even reaches signal evaluation

max_evaluation_chars is a hard safety bound that truncates before any processing (compression or classification). It's defense-in-depth — complements compression, doesn't replace it.

@yossiovadia yossiovadia force-pushed the fix/signal-eval-input-limits branch from 9e8c459 to 0d148c8 Compare March 11, 2026 21:57
…llm-project#1454)

Signal evaluation latency grows super-linearly with prompt size (10K
chars = 21s, 25K+ = timeout). Without a hard limit, a single client
can make the router unresponsive by sending large prompts.

Fix: add max_evaluation_chars config with 8192-char default.

- Truncate evaluationText before any signal processing (compression,
  embedding, classification)
- Default 8192 chars (~2K tokens) — within embedding model capacity
- Configurable: increase/decrease via config, or disable with -1
- Does NOT truncate the actual request body — only the text used for
  routing signal evaluation
- Complements existing prompt_compression (quality-aware NLP-based
  reduction) with a hard safety bound (simple char truncation)

Fixes vllm-project#1454

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
@yossiovadia yossiovadia force-pushed the fix/signal-eval-input-limits branch from 0d148c8 to bbd431e Compare March 18, 2026 19:49
…g 281-line function)

Signed-off-by: Yossi Ovadia <yovadia@redhat.com>
@yossiovadia yossiovadia force-pushed the fix/signal-eval-input-limits branch from a95fb8a to 3fdacb8 Compare March 18, 2026 20:27


Development

Successfully merging this pull request may close these issues.

security: no input size limit for signal evaluation — giant prompt DoS

5 participants