Commit 10a4ce1
Fix: Multi-VA E2E test failure due to missing accelerator resolution for new VAs (llm-d#922)
* fix: resolve accelerator bootstrapping deadlock for new VAs in multi-VA E2E test
New VAs without prior metrics get stuck in applySaturationDecisions() because
acceleratorName cannot be resolved from empty VA status or currentAllocations.
This prevents HPA metric emission, creating a 3+ minute blind period where
the HPA cannot scale the deployment.
Add fallback accelerator resolution from deployment nodeSelector/nodeAffinity
and VA label. Pre-load guidellm image into Kind cluster to eliminate runtime
pull delays. Increase load job timeout from 5 to 8 minutes to account for
tokenizer download on first run.
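
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the `variant` struct is a simplified stand-in for the VA and its deployment, and both key constants are hypothetical, since the commit message does not say which nodeSelector or label keys the controller reads.

```go
package main

import "fmt"

// Hypothetical key names, chosen only for illustration. The commit
// message does not name the keys the real controller inspects.
const (
	acceleratorNodeSelectorKey = "example.com/gpu-product"
	acceleratorLabelKey        = "example.com/accelerator-name"
)

// variant is a simplified stand-in for a VA plus its target deployment.
type variant struct {
	StatusAccelerator  string            // accelerator recorded in VA status, if any
	CurrentAllocations []string          // accelerator names from existing allocations
	NodeSelector       map[string]string // deployment pod template nodeSelector
	NodeAffinityValues []string          // values from a matching nodeAffinity term
	Labels             map[string]string // labels on the VA object
}

// resolveAccelerator returns the first non-empty accelerator name it finds,
// checking VA status, current allocations, nodeSelector, nodeAffinity, and
// finally the VA label. The boolean is false when nothing resolves.
func resolveAccelerator(v variant) (string, bool) {
	if v.StatusAccelerator != "" {
		return v.StatusAccelerator, true
	}
	for _, name := range v.CurrentAllocations {
		if name != "" {
			return name, true
		}
	}
	if name := v.NodeSelector[acceleratorNodeSelectorKey]; name != "" {
		return name, true
	}
	for _, name := range v.NodeAffinityValues {
		if name != "" {
			return name, true
		}
	}
	if name := v.Labels[acceleratorLabelKey]; name != "" {
		return name, true
	}
	return "", false
}

func main() {
	// A brand-new VA: empty status and allocations, but the deployment
	// pins pods to a node type through its nodeSelector.
	fresh := variant{
		NodeSelector: map[string]string{acceleratorNodeSelectorKey: "A100"},
	}
	name, ok := resolveAccelerator(fresh)
	fmt.Println(name, ok)
}
```

With a chain like this, a new VA that has no metrics yet can still resolve an accelerator name, which is the condition the commit message identifies as blocking HPA metric emission.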
* fix: use burst load for multi-VA test to trigger simulator KV cache tracking
The simulator only tracks KV cache for /v1/completions requests. guidellm
defaults to /v1/chat/completions, which bypasses KV cache tracking entirely.
This causes avgSpareKv to remain high (0.8) despite active load, preventing
the saturation engine from triggering scale-up.
Switch to burst load (curl) targeting /v1/completions directly, matching the
pattern used by the working smoke scale-up test. Use 2400 prompts with 400
output tokens to sustain load across multiple engine cycles.
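
To make the endpoint distinction concrete, here is a small Go sketch of a request aimed at `/v1/completions` rather than `/v1/chat/completions`. The test itself uses curl; this is only an equivalent illustration. The base URL, model name, and prompt are placeholders, and the payload uses the `model`, `prompt`, and `max_tokens` fields of the OpenAI-style completions API.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// completionRequest mirrors the basic fields of an OpenAI-style
// /v1/completions payload.
type completionRequest struct {
	Model     string `json:"model"`
	Prompt    string `json:"prompt"`
	MaxTokens int    `json:"max_tokens"`
}

// newCompletionRequest builds a POST to /v1/completions. Requests sent to
// /v1/chat/completions would, per the commit message, bypass the
// simulator's KV cache tracking.
func newCompletionRequest(baseURL, model, prompt string, maxTokens int) (*http.Request, error) {
	body, err := json.Marshal(completionRequest{
		Model:     model,
		Prompt:    prompt,
		MaxTokens: maxTokens,
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// Placeholder URL and model name; 400 output tokens matches the
	// value quoted in the commit message.
	req, err := newCompletionRequest("http://localhost:8000", "example-model", "hello", 400)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```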
* fix: wait for scale-up instead of job completion in multi-VA E2E test
The burst load jobs send 2400 requests at ~42s each, which takes ~84
minutes to complete — far exceeding the 10-minute test timeout. On Kind
the lower network latency masks this, but on OpenShift the jobs always
time out.
Match the proven smoke test pattern: verify load jobs are running, wait
for the saturation engine to detect load and scale up VA-A, then check
the cost preference assertion.

1 parent c967af2 commit 10a4ce1
3 files changed
Lines changed: 100 additions & 13 deletions
File tree
- deploy/kind-emulator
- internal/engines/saturation
- test/e2e