Commit 7b9b77a
🐛 Fix e2e tests by adding DECODE_REPLICAS override

PR llm-d/llm-d#619 changed decode.replicas from 2 to 8, which requires
16 GPUs (8 pods × 2 GPUs each). This breaks the e2e tests, which do not
have enough GPU resources available.

Add DECODE_REPLICAS environment variable support to install.sh,
similar to the existing VLLM_MAX_NUM_SEQS pattern. Set it to 1
in the OpenShift e2e workflow so that tests start with minimal
resources and the HPA scales up as needed.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Andrew Anderson <andy@clubanderson.com>

1 parent e40d310 commit 7b9b77a
2 files changed
Lines changed: 14 additions & 0 deletions
[Diff content not captured. First file: 6 lines added (new lines 478-479, 490, 536-537, 543). Second file: 8 lines added (new lines 97-98, 763-768).]
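The commit message describes an environment-variable override for the decode replica count and the GPU arithmetic behind it (8 pods × 2 GPUs = 16). The following is a minimal sketch of that pattern, not the actual change to install.sh: the function names `resolve_decode_replicas` and `required_gpus` and the constant `GPUS_PER_DECODE_POD` are hypothetical, and only the `DECODE_REPLICAS` variable name and the numbers come from the commit message.

```shell
# Hypothetical sketch of an env-var override; names other than
# DECODE_REPLICAS are illustrative and not taken from install.sh.

# From the commit message: each decode pod uses 2 GPUs.
GPUS_PER_DECODE_POD=2

# Print the replica count to use: DECODE_REPLICAS if it is set and
# non-empty, otherwise the default passed as the first argument.
resolve_decode_replicas() {
  default_replicas="$1"
  value="${DECODE_REPLICAS:-$default_replicas}"
  case "$value" in
    # Reject empty values, leading zeros (including 0), and non-digits.
    ''|0*|*[!0-9]*)
      echo "DECODE_REPLICAS must be a positive integer, got '$value'" >&2
      return 1
      ;;
  esac
  echo "$value"
}

# Print the number of GPUs needed for the given replica count.
required_gpus() {
  echo $(( $1 * GPUS_PER_DECODE_POD ))
}
```

With the default of 8 from the commit message, `required_gpus "$(resolve_decode_replicas 8)"` prints 16 when `DECODE_REPLICAS` is unset and 2 when `DECODE_REPLICAS=1`. How the resolved value is then applied to `decode.replicas` depends on install.sh and is not shown here.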