Commit dfcbb32
fix: disable EAGLE3 speculative decoding for gpt-oss-120b
Streaming responses were consistently dropping the last 1-2 tokens
due to a vLLM v0.12.0 EAGLE3 bug. Non-streaming was unaffected.
1 parent 2e31674 commit dfcbb32
1 file changed
Lines changed: 0 additions & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 74 | 74 | |
| 75 | 75 | |
| 76 | 76 | |
| 77 | | - |
| 78 | 77 | |
| 79 | 78 | |
| 80 | 79 | |
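The symptom in the commit message (streamed output ending 1-2 tokens short of the non-streamed output) can be checked with a small comparison helper. The following is a minimal sketch, not part of this commit: `missing_tail` and the sample strings are hypothetical, and it assumes both responses came from the same prompt with deterministic sampling so that the two outputs are comparable.

```python
def missing_tail(streamed_chunks, full_text):
    """Return the part of full_text that the stream failed to deliver.

    Returns "" when the stream is complete, and None when the streamed
    text is not a prefix of full_text (the outputs diverged, so the
    comparison is not meaningful).
    """
    streamed = "".join(streamed_chunks)
    if not full_text.startswith(streamed):
        return None
    return full_text[len(streamed):]


# Hypothetical data standing in for real server responses.
full = "The capital of France is Paris."
chunks = ["The capital", " of France", " is"]
print(repr(missing_tail(chunks, full)))
```

A non-empty result on every streaming request, paired with an empty result once speculative decoding is disabled, would match the behavior this commit describes.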