Commit 9d049db
fix: use sampling during warmup and disable backed_size_oblivious after model compilation (#551)
# Description
This PR makes two changes related to the compilation of operations used during sampling.
- The [`batched_count_greater_than` function](https://github.com/vllm-project/vllm/blob/b8b302cde434df8c9289a2b465406b47ebab1c2d/vllm/v1/sample/ops/logprobs.py#L11) requires compilation and is used to compute logprobs, so warmup now uses sampling to trigger that compilation.
- This function also fails to compile in PyTorch 2.8.0 and 2.9.0 when `backed_size_oblivious` is enabled, so this PR disables `backed_size_oblivious` after model compilation.
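
For readers unfamiliar with the operation, here is a rough pure-Python sketch of what a batched "count greater than" computes when deriving token ranks from logprobs. It is not the compiled tensor implementation linked above: the name `count_greater_than_sketch` is hypothetical, and the inclusive `>=` comparison is an assumption made so that the highest-scoring token gets rank 1.

```python
def count_greater_than_sketch(rows, values):
    """For each row, count the entries that are >= that row's reference value.

    Hypothetical stand-in for illustration only; the real function works on
    tensors and is compiled.
    """
    return [sum(1 for x in row if x >= v) for row, v in zip(rows, values)]


# Two requests, three candidate tokens each (log-probabilities).
logprobs = [
    [-0.1, -2.3, -3.0],
    [-1.2, -0.4, -2.5],
]
# Log-probability of the token actually sampled for each request.
chosen = [-2.3, -0.4]

ranks = count_greater_than_sketch(logprobs, chosen)
print(ranks)  # [2, 1]
```

Under this assumption the count doubles as the sampled token's rank within its row, which is why the operation sits on the logprobs path and needs to be compiled before serving requests.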
## Related Issues
Fixes #550
---------
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>

1 parent f081f4f commit 9d049db
1 file changed, +27 -6 lines changed (diff contents not captured; only hunk positions remain)

| Diff line change | Original file line number | Diff line number |
|---|---|---|
| 11 lines added | | 106-116 |
| 6 lines removed | 435-440 | |
| 15 lines added | | 499-513 |
| 1 line added | | 674 |