Commit bf8c8ad
refactor(scheduler): delay caching instead of undoing (#525)
Problem:
The scheduler speculatively cached blocks during allocate_slots, then
had to undo the caching (via undo_uncomputed_block_caching) in three
places: spec_decode_cap trimming, prefill over-allocation for chunked
prefill, and prefill preempting running decodes. This was error-prone
and coupled the scheduler to KVCacheManager internals (block_pool,
num_cached_block).
Solution:
Pass delay_cache_blocks=True to all allocate_slots calls so no blocks
are cached during allocation. A single finalization loop after all
scheduling decisions calls cache_blocks and schedule_sub_block_indexing
for each actually-scheduled request. This eliminates
undo_uncomputed_block_caching.

1 parent 226bc60 commit bf8c8ad
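
The delayed-caching flow described in the Solution can be illustrated with a minimal sketch. It is not the code from this commit: `ToyKVCacheManager`, `schedule`, `BLOCK_SIZE`, and every signature below are hypothetical stand-ins, and only the names `allocate_slots`, `delay_cache_blocks`, and `cache_blocks` come from the commit message. The `schedule_sub_block_indexing` call is omitted because the message does not describe its behavior. The toy also does not free slots for trimmed requests, which a real scheduler would have to handle.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 4  # hypothetical block size, in tokens


@dataclass
class ToyKVCacheManager:
    """Toy stand-in for a KV cache manager (assumed API, not the real class)."""

    allocated: dict = field(default_factory=dict)  # req_id -> tokens with slots
    cached: dict = field(default_factory=dict)     # req_id -> full blocks cached

    def allocate_slots(self, req_id, num_tokens, delay_cache_blocks=False):
        self.allocated[req_id] = self.allocated.get(req_id, 0) + num_tokens
        if not delay_cache_blocks:
            # Eager path: caching happens now, so a later trim needs an undo.
            self.cache_blocks(req_id, self.allocated[req_id])

    def cache_blocks(self, req_id, num_tokens):
        # Only completely filled blocks are recorded as cached.
        self.cached[req_id] = num_tokens // BLOCK_SIZE


def schedule(manager, requests, token_budget):
    """Schedule `requests` (req_id -> requested tokens) under a token budget."""
    scheduled = {}
    for req_id, want in requests.items():
        # Allocate without caching; the scheduling decision is not final yet.
        manager.allocate_slots(req_id, want, delay_cache_blocks=True)
        scheduled[req_id] = want

    # Later decisions may trim or drop requests (a stand-in for the trimming
    # and preemption cases). Nothing was cached, so nothing has to be undone.
    over = sum(scheduled.values()) - token_budget
    for req_id in reversed(list(scheduled)):
        if over <= 0:
            break
        cut = min(over, scheduled[req_id])
        scheduled[req_id] -= cut
        over -= cut
        if scheduled[req_id] == 0:
            del scheduled[req_id]

    # Single finalization loop: cache only what was actually scheduled.
    for req_id, num_tokens in scheduled.items():
        manager.cache_blocks(req_id, num_tokens)
    return scheduled


manager = ToyKVCacheManager()
result = schedule(manager, {"a": 8, "b": 8, "c": 4}, token_budget=10)
print(result, manager.cached)
```

In this toy run, request "c" is dropped and "b" is trimmed before finalization, so neither leaves stale cache entries behind. With eager caching, both would have required an explicit rollback.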
5 files changed, +107 -79 lines changed:
- docs
- tests/torch_compile/unit/v1/core
- vllm_rbln/v1/core