Commit 3f7b266
fix: reduce gpt-oss-120b GPU memory utilization to 0.90
Lower --gpu-memory-utilization from 0.95 to 0.90 to address CUDA OOM
crashes in vllm-gpt-oss containers under load.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent d92a32e commit 3f7b266
1 file changed
Lines changed: 1 addition & 1 deletion
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| 70 | 70 | |
| 71 | 71 | |
| 72 | 72 | |
| 73 | | - |
| | 73 | + |
| 74 | 74 | |
| 75 | 75 | |
| 76 | 76 | |
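A rough sketch of the arithmetic behind this change. `--gpu-memory-utilization` is the fraction of each GPU's memory that vLLM is allowed to budget for itself, so lowering it leaves more memory outside that budget. The 80 GiB card size and the helper name `vram_budget_gib` are assumptions for illustration only; the commit does not state the hardware.

```python
def vram_budget_gib(total_gib: float, utilization: float) -> tuple[float, float]:
    """Return (GiB vLLM may claim, GiB left outside its budget) for one GPU."""
    if not 0.0 < utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    claimed = total_gib * utilization
    return claimed, total_gib - claimed


# Assumption: an 80 GiB card. The commit does not name the GPU model.
TOTAL_GIB = 80.0

for util in (0.95, 0.90):
    claimed, headroom = vram_budget_gib(TOTAL_GIB, util)
    print(
        f"--gpu-memory-utilization {util:.2f}: "
        f"claimed {claimed:.1f} GiB, headroom {headroom:.1f} GiB"
    )
```

Under that assumption, going from 0.95 to 0.90 roughly doubles the per-GPU headroom (about 4 GiB to about 8 GiB), at the cost of a smaller budget for the KV cache.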