parseSaturationConfig() called Validate() without first calling ApplyDefaults(). As a result, V2 configs with analyzerName: saturation failed validation: scaleUpThreshold and scaleDownBoundary are omitempty fields that are zero when unset, and Validate() rejects zero values. The engine then skipped every model with "Saturation scaling config not loaded yet for namespace" and made no scaling decisions.
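The ordering problem described above can be illustrated with a minimal sketch. The `SaturationConfig` type, its field set, the default values (0.85 and 0.70), and the `parseConfigSketch` helper are all hypothetical stand-ins chosen for illustration; they are not taken from the actual codebase. Only the idea of calling `ApplyDefaults()` before `Validate()` comes from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// SaturationConfig is a hypothetical stand-in for the real config type.
type SaturationConfig struct {
	AnalyzerName      string
	ScaleUpThreshold  float64 // zero when omitted from the config
	ScaleDownBoundary float64 // zero when omitted from the config
}

// ApplyDefaults fills in unset (zero) fields.
// The default values here are illustrative assumptions.
func (c *SaturationConfig) ApplyDefaults() {
	if c.ScaleUpThreshold == 0 {
		c.ScaleUpThreshold = 0.85
	}
	if c.ScaleDownBoundary == 0 {
		c.ScaleDownBoundary = 0.70
	}
}

// Validate rejects zero (or negative) thresholds.
func (c *SaturationConfig) Validate() error {
	if c.ScaleUpThreshold <= 0 || c.ScaleDownBoundary <= 0 {
		return errors.New("thresholds must be positive")
	}
	return nil
}

// parseConfigSketch applies defaults first, then validates.
// Reversing the two calls reproduces the failure mode described above.
func parseConfigSketch(c SaturationConfig) (SaturationConfig, error) {
	c.ApplyDefaults()
	if err := c.Validate(); err != nil {
		return SaturationConfig{}, err
	}
	return c, nil
}

func main() {
	cfg, err := parseConfigSketch(SaturationConfig{AnalyzerName: "saturation"})
	fmt.Println(cfg.ScaleUpThreshold, cfg.ScaleDownBoundary, err)
}
```

With this ordering, a config that sets only the analyzer name passes validation, whereas validating the same struct before defaults are applied returns an error.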
When the model server does not emit the vllm:cache_config_info metric (e.g., llm-d-inference-sim), TotalKvCapacityTokens is 0 and the V2 analyzer skips the replica entirely, resulting in totalDemand=0 and no scale-up decisions. This change adds computeReplicaCapacityFallback, which takes the deployment-derived capacity from the capacity store and estimates demand from the KvCacheUsage percentage. This allows V2 to produce scaling decisions with any vLLM-compatible server, not just those emitting cache_config_info.
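A rough sketch of the fallback idea follows. It is not the actual computeReplicaCapacityFallback implementation: the `ReplicaMetrics` struct, the `estimateReplicaSketch` function, and its signature are hypothetical, and the sketch assumes KvCacheUsage is reported as a fraction in [0, 1] and that demand can be approximated as usage times capacity.

```go
package main

import (
	"fmt"
	"math"
)

// ReplicaMetrics is a hypothetical stand-in for per-replica metrics.
type ReplicaMetrics struct {
	TotalKvCapacityTokens int64   // 0 when cache_config_info is not emitted
	KvCacheUsage          float64 // assumed to be a fraction in [0, 1]
}

// estimateReplicaSketch returns the capacity and estimated demand in tokens.
// If the replica reports no capacity, it falls back to storeCapacity, which
// stands in for the deployment-derived value from the capacity store.
// ok is false when neither source provides a positive capacity.
func estimateReplicaSketch(m ReplicaMetrics, storeCapacity int64) (capacity, demand int64, ok bool) {
	capacity = m.TotalKvCapacityTokens
	if capacity <= 0 {
		capacity = storeCapacity
	}
	if capacity <= 0 {
		return 0, 0, false // nothing to work with; the caller may skip the replica
	}
	// Clamp usage to [0, 1] before scaling it by the capacity.
	usage := math.Min(math.Max(m.KvCacheUsage, 0), 1)
	demand = int64(math.Round(usage * float64(capacity)))
	return capacity, demand, true
}

func main() {
	capacity, demand, ok := estimateReplicaSketch(ReplicaMetrics{KvCacheUsage: 0.5}, 8000)
	fmt.Println(capacity, demand, ok)
}
```

Under these assumptions, a replica that reports no capacity but 50% cache usage still contributes a nonzero demand, so totalDemand no longer collapses to zero.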