Commit 3723b4b

add vLLMCacheHitRate printf
Signed-off-by: Nick Mitchell <[email protected]>
1 parent a6e3ff2 commit 3723b4b

1 file changed: +1 -0 lines changed

vllm/v1/core/kv_cache_manager.py

Lines changed: 1 addition & 0 deletions
@@ -203,6 +203,7 @@ def get_computed_blocks(self, request: Request) -> tuple[KVCacheBlocks, int]:
                 request.block_hashes, max_cache_hit_length
             )
         )
+        print(f"vLLMCacheHitRate {100*(num_new_computed_tokens/(len(request.block_hashes)*self.block_size)):.2f}% computed={num_new_computed_tokens} requested={len(request.block_hashes)*self.block_size}", flush=True)

         if self.log_stats:
             assert self.prefix_cache_stats is not None
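
Note on the added line: the denominator is `len(request.block_hashes) * self.block_size`, so the rate is measured against fully hashed blocks rather than the full prompt length. If `request.block_hashes` can be empty (for example, a prompt shorter than one block), the division would raise `ZeroDivisionError`. The following is a minimal sketch of a guarded variant, not part of this commit. The helper name `format_cache_hit_line` is hypothetical, and reporting 0.00% for an empty request is an assumption.

```python
def format_cache_hit_line(
    num_new_computed_tokens: int,
    num_block_hashes: int,
    block_size: int,
) -> str:
    """Build the vLLMCacheHitRate line without dividing by zero.

    Hypothetical helper: it mirrors the format of the print added in
    this commit, but reports 0.00% when no full block was requested.
    """
    # Tokens covered by fully hashed blocks (the commit's denominator).
    requested = num_block_hashes * block_size
    # Assumption: an empty request counts as a 0% hit rate.
    rate = 100.0 * num_new_computed_tokens / requested if requested > 0 else 0.0
    return (
        f"vLLMCacheHitRate {rate:.2f}% "
        f"computed={num_new_computed_tokens} requested={requested}"
    )


# Standalone usage with made-up numbers: 32 of 64 tokens served from cache.
print(format_cache_hit_line(32, 4, 16), flush=True)
```

At the call site in `get_computed_blocks`, the arguments would be `num_new_computed_tokens`, `len(request.block_hashes)`, and `self.block_size`, which are the same values the committed print uses.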
