Commit 58d6716
Add unique random seed to worker (#340)
With the random data generator, the prefix cache hit rate is expected to
be close to 0. During a p/d benchmark run, however, the vLLM debug log
showed a prefix cache hit rate that was consistently above 90%:
```
(APIServer pid=1) DEBUG 02-02 23:32:28 [v1/metrics/loggers.py:248] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 92.9%, External prefix cache hit rate: 0.0%
(APIServer pid=1) DEBUG 02-02 23:32:38 [v1/metrics/loggers.py:248] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 92.9%, External prefix cache hit rate: 0.0%
(APIServer pid=1) DEBUG 02-02 23:32:48 [v1/metrics/loggers.py:248] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 92.9%, External prefix cache hit rate: 0.0%
(APIServer pid=1) DEBUG 02-02 23:32:58 [v1/metrics/loggers.py:248] Engine 000: Avg prompt throughput: 0.0 tokens/s, Avg generation throughput: 0.0 tokens/s, Running: 0 reqs, Waiting: 0 reqs, GPU KV cache usage: 0.0%, Prefix cache hit rate: 92.9%, External prefix cache hit rate: 0.0%
```
A prefix hit rate graph for the benchmark run:
<img width="1056" height="702" alt="image"
src="https://github.com/user-attachments/assets/428872d1-5baa-4872-a1b6-9f5e9308a11a"
/>
The fix is to create a unique random seed per worker and to call
`np.random.seed` with it before starting the loop.
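
The idea can be illustrated with a minimal sketch. This is not the code from this commit: `worker_seed`, `generate_prompts`, `base_seed`, and `worker_id` are hypothetical names, and deriving the seed as `base_seed + worker_id` is an assumption made only for illustration. It shows that workers sharing one seed emit identical "random" prompts (which a prefix cache will then hit), while per-worker seeds produce distinct prompts.

```python
import numpy as np


def worker_seed(base_seed: int, worker_id: int) -> int:
    """Derive a per-worker seed (assumed scheme: base seed plus worker id)."""
    # np.random.seed accepts integers in [0, 2**32 - 1], so wrap around.
    return (base_seed + worker_id) % (2**32)


def generate_prompts(seed: int, num_prompts: int, prompt_len: int,
                     vocab_size: int = 32000) -> list[list[int]]:
    """Generate random token-id prompts using NumPy's global RNG."""
    # Seed the global RNG once, before the generation loop starts.
    np.random.seed(seed)
    prompts = []
    for _ in range(num_prompts):
        tokens = np.random.randint(0, vocab_size, size=prompt_len)
        prompts.append(tokens.tolist())
    return prompts


# Two workers that share a seed produce exactly the same prompts.
shared_a = generate_prompts(seed=42, num_prompts=4, prompt_len=16)
shared_b = generate_prompts(seed=42, num_prompts=4, prompt_len=16)

# Two workers with per-worker seeds produce different prompts.
unique_a = generate_prompts(worker_seed(42, 0), num_prompts=4, prompt_len=16)
unique_b = generate_prompts(worker_seed(42, 1), num_prompts=4, prompt_len=16)
```

In this sketch `shared_a == shared_b` holds, so every prompt after the first worker's copy would be a candidate prefix cache hit, whereas `unique_a` and `unique_b` differ.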
Validated the fix locally with multiple benchmark runs.

1 parent e6ba4c7
1 file changed, +5 -0 lines changed (lines 198-202 added)