Commit 072ec8b
Finalizing v2 saturation engine (#836)
* feat: add priority, roles, multi-analyzer scoring, and GreedyByScore optimizer
Implement five interconnected enhancements to the V2 saturation engine:
1. Per-model priority config with "{modelID}#{namespace}" lookup and
default fallback via resolveSaturationConfig()
2. P/D role types (prefill/decode/both) on VariantReplicaState,
VariantDecision, and VariantCapacity, injected from deployment
labels (llm-d.ai/role)
3. Per-role capacity aggregation in the V2 analyzer via
aggregateByRole(), with scheduler queue demand split by role:
prefill gets inputTokens, decode gets inputTokens+outputTokens
4. Multi-analyzer scoring infrastructure with AnalyzerScoreConfig
and composite score = priority * sum(requiredCapacity_i * score_i)
5. Rename GreedyBySaturationOptimizer to GreedyByScoreOptimizer with
score-based priority ordering, per-role work units for P/D
disaggregated models, and proportional P/D balancing
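The config lookup in item 1 and the composite score in item 4 can be sketched as follows. This is a minimal illustration, not the code from this commit: the type shapes, the literal "default" key, and the built-in fallback priority of 1.0 are assumptions. Only the "{modelID}#{namespace}" key format and the score formula come from the message above.

```go
package main

import "fmt"

// SaturationConfig is a hypothetical, trimmed-down per-model config.
type SaturationConfig struct {
	Priority float64
}

// AnalyzerScore pairs one analyzer's required capacity with its score.
// The field names are illustrative, not taken from the repository.
type AnalyzerScore struct {
	RequiredCapacity float64
	Score            float64
}

// resolveConfig looks up "{modelID}#{namespace}" first and falls back to a
// "default" entry. The "default" key and the final 1.0 are assumptions.
func resolveConfig(configs map[string]SaturationConfig, modelID, namespace string) SaturationConfig {
	if cfg, ok := configs[modelID+"#"+namespace]; ok {
		return cfg
	}
	if cfg, ok := configs["default"]; ok {
		return cfg
	}
	return SaturationConfig{Priority: 1.0}
}

// compositeScore computes priority * sum(requiredCapacity_i * score_i).
func compositeScore(priority float64, analyzers []AnalyzerScore) float64 {
	sum := 0.0
	for _, a := range analyzers {
		sum += a.RequiredCapacity * a.Score
	}
	return priority * sum
}

func main() {
	configs := map[string]SaturationConfig{
		"default":        {Priority: 1.0},
		"llama-70b#prod": {Priority: 2.0},
	}
	cfg := resolveConfig(configs, "llama-70b", "prod")
	scores := []AnalyzerScore{
		{RequiredCapacity: 10, Score: 0.5},
		{RequiredCapacity: 4, Score: 0.25},
	}
	fmt.Println(compositeScore(cfg.Priority, scores))
}
```

Because priority multiplies the whole sum, a higher-priority model outranks a lower-priority one with the same weighted demand when the optimizer orders work by score.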
* refactor: demand-proportional P/D distribution in GreedyByScoreOptimizer
Replace post-hoc proportionalPDBalance() with demand-proportional
allocation integrated into the fair-share algorithm itself.
Previously, disaggregated models created separate per-role work units
that competed independently in fair-share, then a post-processing step
patched up the P:D ratio based on initial replica counts.
Now each model enters fair-share as a single entity. When allocating
replicas, allocateByRole() distributes them between prefill/decode
roles proportional to their per-role RequiredCapacity. If one role
can't use GPUs (e.g. accelerator exhausted), subsequent iterations
let the other role absorb its share naturally.
Removed: proportionalPDBalance, initTargetsForRole, mergeWorkTargets
Added: allocateByRole, allocateToVariants, roleDemands on modelWork
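A rough sketch of the demand-proportional split described above: a model's replica grant is divided between roles in proportion to each role's required capacity. The function name splitByDemand, the map-based signature, and the largest-remainder rounding are assumptions made for illustration. The real allocateByRole may round and break ties differently.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// splitByDemand divides n replicas across roles in proportion to each role's
// required capacity. Rounding uses the largest-remainder method, which is an
// assumption of this sketch. Negative demands are treated as zero.
func splitByDemand(n int, demand map[string]float64) map[string]int {
	roles := make([]string, 0, len(demand))
	total := 0.0
	for role, d := range demand {
		roles = append(roles, role)
		total += math.Max(d, 0)
	}
	sort.Strings(roles) // fixed order keeps tie-breaking deterministic

	out := make(map[string]int, len(roles))
	for _, role := range roles {
		out[role] = 0
	}
	if n <= 0 || total <= 0 {
		return out
	}

	type remainder struct {
		role string
		frac float64
	}
	rems := make([]remainder, 0, len(roles))
	assigned := 0
	for _, role := range roles {
		exact := float64(n) * math.Max(demand[role], 0) / total
		whole := int(math.Floor(exact))
		out[role] = whole
		assigned += whole
		rems = append(rems, remainder{role: role, frac: exact - float64(whole)})
	}

	// Hand leftover replicas to the roles with the largest fractional parts.
	sort.SliceStable(rems, func(i, j int) bool { return rems[i].frac > rems[j].frac })
	for i := 0; assigned < n; i++ {
		out[rems[i%len(rems)].role]++
		assigned++
	}
	return out
}

func main() {
	demand := map[string]float64{"prefill": 100, "decode": 300}
	fmt.Println(splitByDemand(4, demand))
}
```

With a 1:3 demand ratio and four replicas, prefill receives one replica and decode three, so the P:D ratio follows current demand instead of the initial replica counts.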
* fix: prevent role absorption in demand-proportional P/D allocation
When one role cannot fully allocate (e.g., accelerator exhausted), consume
its unallocated share from remaining so it does not overflow to other roles
in subsequent fair-share iterations.
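The fix can be illustrated with one allocation round in which each role is charged its full share, even if only part of that share could be placed. The names allocateRound, share, and placeable are hypothetical, and the fixed role order is a simplification; the point is only the accounting of the remaining budget.

```go
package main

import "fmt"

// allocateRound places each role's share, capped by what the role can
// actually use (for example, free accelerators), and returns the placed
// counts plus the budget left for later fair-share iterations.
func allocateRound(remaining int, share, placeable map[string]int) (map[string]int, int) {
	placed := make(map[string]int, len(share))
	for _, role := range []string{"prefill", "decode"} {
		want := share[role]
		if want > remaining {
			want = remaining
		}
		got := want
		if limit := placeable[role]; got > limit {
			got = limit
		}
		placed[role] = got
		// Charge the full share, not just what was placed. The unplaced
		// part is consumed here so another role cannot absorb it later.
		remaining -= want
	}
	return placed, remaining
}

func main() {
	share := map[string]int{"prefill": 2, "decode": 6}
	placeable := map[string]int{"prefill": 2, "decode": 1} // decode accelerator nearly exhausted
	placed, left := allocateRound(8, share, placeable)
	fmt.Println(placed, left)
}
```

Had the loop subtracted got instead of want, five replicas would stay in the budget, and the next iteration could hand them to prefill and skew the P:D ratio.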
* refactor: use Analyzers list for V2 selection with per-analyzer threshold overrides
Replace analyzerName-based V2 detection with IsV2() that checks for
Analyzers list (new-style) or analyzerName (backward compat). Add
per-analyzer ScaleUpThreshold/ScaleDownBoundary overrides on
AnalyzerScoreConfig with global fallback via EffectiveScaleUpThreshold/
EffectiveScaleDownBoundary methods.
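One way the V2 detection and threshold fallback could look is sketched below. The struct layouts, the pointer-typed overrides, the method signatures that take the global value as an argument, and the legacyV2AnalyzerName value are all assumptions. Only the identifiers IsV2, AnalyzerScoreConfig, EffectiveScaleUpThreshold, and EffectiveScaleDownBoundary come from the message above.

```go
package main

import "fmt"

// legacyV2AnalyzerName is a hypothetical placeholder for the old
// analyzerName value that selected V2 before the Analyzers list existed.
const legacyV2AnalyzerName = "saturation-v2"

// AnalyzerScoreConfig holds optional per-analyzer overrides.
// A nil pointer means "use the global value" (an assumption of this sketch).
type AnalyzerScoreConfig struct {
	Name              string
	ScaleUpThreshold  *float64
	ScaleDownBoundary *float64
}

// SaturationScalingConfig is a trimmed-down stand-in for the real type.
type SaturationScalingConfig struct {
	AnalyzerName      string
	Analyzers         []AnalyzerScoreConfig
	ScaleUpThreshold  float64
	ScaleDownBoundary float64
}

// IsV2 reports true for the new-style Analyzers list or the legacy name.
func (c SaturationScalingConfig) IsV2() bool {
	return len(c.Analyzers) > 0 || c.AnalyzerName == legacyV2AnalyzerName
}

// EffectiveScaleUpThreshold returns the override if set, else the global value.
func (a AnalyzerScoreConfig) EffectiveScaleUpThreshold(global float64) float64 {
	if a.ScaleUpThreshold != nil {
		return *a.ScaleUpThreshold
	}
	return global
}

// EffectiveScaleDownBoundary returns the override if set, else the global value.
func (a AnalyzerScoreConfig) EffectiveScaleDownBoundary(global float64) float64 {
	if a.ScaleDownBoundary != nil {
		return *a.ScaleDownBoundary
	}
	return global
}

func main() {
	up := 0.9
	cfg := SaturationScalingConfig{
		ScaleUpThreshold:  0.8,
		ScaleDownBoundary: 0.6,
		Analyzers:         []AnalyzerScoreConfig{{Name: "saturation", ScaleUpThreshold: &up}},
	}
	a := cfg.Analyzers[0]
	fmt.Println(cfg.IsV2(),
		a.EffectiveScaleUpThreshold(cfg.ScaleUpThreshold),
		a.EffectiveScaleDownBoundary(cfg.ScaleDownBoundary))
}
```

Pointer fields let an explicit override of 0 be distinguished from "not set"; a value-typed field with a zero check would be simpler but could not express that difference.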
* refactor: convert interfaces tests to Ginkgo
Convert saturation_scaling_test.go from standard testing to Ginkgo
Describe/It/Expect style with DescribeTable for validation cases.
Add suite_test.go for Ginkgo test runner bootstrap.
* refactor: unify enforcer to single EnforcePolicyOnDecisions entry point
Remove V1-specific EnforcePolicy (map-based) and its helpers (applyScaleToZero,
ensureMinimumReplicas). The V1 path now converts targets to decisions first,
then uses EnforcePolicyOnDecisions — the same path as V2. This eliminates the
dependency on VariantSaturationAnalysis for cost data in the enforcer.
* refactor: move SaturationScalingConfig types from interfaces to config package
Move SaturationScalingConfig, AnalyzerScoreConfig, and related constants
(DefaultPriority, DefaultScaleUpThreshold, DefaultScaleDownBoundary) from
internal/interfaces/ to internal/config/ where they belong alongside the
rest of the configuration types. Update all imports across 15 files.
1 parent 7f838c4 commit 072ec8b
25 files changed
Lines changed: 1899 additions & 1192 deletions
File tree
- internal
- config
- controller
- engines
- analyzers/saturation_v2
- pipeline
- saturation
- interfaces
- saturation
Lines changed: 101 additions & 4 deletions