
Add vllm:prefix_cache_hits and vllm:prefix_cache_queries counters #358

Open
InfraWhisperer wants to merge 2 commits into llm-d:main from InfraWhisperer:prefix-cache-metrics

Conversation


@InfraWhisperer InfraWhisperer commented Feb 23, 2026

Summary

  • Add vllm:prefix_cache_hits and vllm:prefix_cache_queries Prometheus counters matching vLLM v1 token-level semantics
  • queries increments by total prompt tokens per request, hits by cached tokens — enables rate(hits) / rate(queries) for cache effectiveness measurement
  • Follows existing channel + async updater goroutine pattern (kvCacheUsageChan → kvCacheUsageUpdater)
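The channel + async updater goroutine pattern described above can be sketched in isolation. This is a minimal stdlib-only Go sketch, not the PR's actual code: the real implementation increments Prometheus counters, and the type and field names below (prefixCacheEvent, metrics) are illustrative stand-ins for the pattern that kvCacheUsageChan / kvCacheUsageUpdater already use.

```go
package main

import (
	"fmt"
	"sync"
)

// prefixCacheEvent carries one request's token counts to the updater goroutine.
type prefixCacheEvent struct {
	queries int // total prompt tokens in the request
	hits    int // prompt tokens found already cached
}

// metrics stands in for the simulator's Prometheus counters; the real code
// would increment prometheus Counter values instead of plain ints.
type metrics struct {
	mu            sync.Mutex
	prefixQueries int
	prefixHits    int
}

// updater drains the channel and applies increments asynchronously, so the
// request path never blocks on metrics bookkeeping.
func (m *metrics) updater(ch <-chan prefixCacheEvent, done chan<- struct{}) {
	for ev := range ch {
		m.mu.Lock()
		m.prefixQueries += ev.queries
		m.prefixHits += ev.hits
		m.mu.Unlock()
	}
	close(done)
}

func main() {
	m := &metrics{}
	ch := make(chan prefixCacheEvent, 16)
	done := make(chan struct{})
	go m.updater(ch, done)

	// Two requests sharing a 100-token prefix: the first misses,
	// the second reuses the cached prefix tokens.
	ch <- prefixCacheEvent{queries: 120, hits: 0}
	ch <- prefixCacheEvent{queries: 130, hits: 100}
	close(ch)
	<-done

	fmt.Printf("hits=%d queries=%d\n", m.prefixHits, m.prefixQueries) // hits=100 queries=250
}
```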

Closes #356

Test plan

  • Verify go build ./... passes
  • Run simulator with --enable-kvcache and confirm both counters appear on /metrics
  • Send repeated prompts with shared prefixes, verify prefix_cache_hits increments on subsequent requests
  • Confirm counters stay at zero when --enable-kvcache is not set
  • Validate rate(vllm:prefix_cache_hits[5m]) / rate(vllm:prefix_cache_queries[5m]) produces expected hit rate in Prometheus
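For the manual /metrics check in the steps above, the token-level hit rate can also be eyeballed directly from a single scrape. A sketch, assuming the simulator listens on localhost:8000 (port and sample values are illustrative; in practice the first line would come from `curl -s localhost:8000/metrics`, and Prometheus's rate() performs the same division over time windows):

```shell
# Simulated scrape output; replace with: curl -s localhost:8000/metrics | grep vllm:prefix_cache
metrics='vllm:prefix_cache_queries 250
vllm:prefix_cache_hits 100'

# Divide hits by queries for the cumulative token-level hit rate.
echo "$metrics" | awk '/queries/{q=$2} /hits/{h=$2} END{printf "hit_rate=%.2f\n", h/q}'
```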

@github-actions

Unsigned commits detected! Please sign your commits.

For instructions on how to set up GPG/SSH signing and verify your commits, please see GitHub Documentation.

@InfraWhisperer InfraWhisperer force-pushed the prefix-cache-metrics branch 2 times, most recently from 06409c6 to c2cb138 on February 23, 2026 at 15:46
Expose token-level prefix cache metrics matching vLLM v1 semantics.
Both counters increment per-request: queries by total prompt tokens,
hits by tokens found already cached. Enables computing cache hit rate
via rate(hits) / rate(queries) for scorer strategy benchmarking.

Closes llm-d#356

Signed-off-by: InfraWhisperer <raghav.potluri21@gmail.com>
Collaborator

@mayabar mayabar left a comment


@InfraWhisperer thanks for your PR
Looks good, some general comments:

  • please add support for fake metrics: check the new values' validity on initialization (e.g. that they are non-negative), and initialize the Prometheus values from the fake values given in the configuration, ...
  • tests for the new metrics are missing: the calculation needs to be tested in both scenarios, real and fake metrics

Add PrefixCacheHits and PrefixCacheQueries fields to the fake metrics
config struct with validation (non-negative, must be specified together,
hits <= queries). Initialize Prometheus counters from fake values in
setInitialPrometheusMetrics. Add integration tests covering real prefix
cache metrics (sequential requests with shared prefixes), fake prefix
cache metrics (values appear on /metrics), and fake value immutability
(real requests don't mutate fake counters). Add config validation tests
for all error paths including the partial specification guard.
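The validation rules listed in the commit message (non-negative, must be specified together, hits <= queries) can be sketched as follows. This is a hedged illustration, not the PR's actual code: the struct and field names (fakePrefixCacheConfig, Hits, Queries) and the error wording are assumptions for the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// fakePrefixCacheConfig is an illustrative stand-in for the fake-metrics
// config fields added in this PR. Pointers distinguish "unset" from zero.
type fakePrefixCacheConfig struct {
	Hits    *int64
	Queries *int64
}

// validate enforces the rules described in the commit message: both fields
// set together, both non-negative, and hits never exceeding queries.
func (c *fakePrefixCacheConfig) validate() error {
	if c.Hits == nil && c.Queries == nil {
		return nil // fake prefix-cache metrics not configured
	}
	if c.Hits == nil || c.Queries == nil {
		return errors.New("prefix cache hits and queries must be specified together")
	}
	if *c.Hits < 0 || *c.Queries < 0 {
		return errors.New("prefix cache values must be non-negative")
	}
	if *c.Hits > *c.Queries {
		return errors.New("prefix cache hits cannot exceed queries")
	}
	return nil
}

func main() {
	h, q := int64(80), int64(100)
	fmt.Println((&fakePrefixCacheConfig{Hits: &h, Queries: &q}).validate()) // <nil>

	bad := int64(200)
	fmt.Println((&fakePrefixCacheConfig{Hits: &bad, Queries: &q}).validate())
}
```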

Signed-off-by: Raghav Potluri <raghav.potluri21@gmail.com>
@InfraWhisperer
Author

@InfraWhisperer thanks for your PR. Looks good, some general comments:

  • please add support for fake metrics: check the new values' validity on initialization (e.g. that they are non-negative), and initialize the Prometheus values from the fake values given in the configuration, ...
  • tests for the new metrics are missing: the calculation needs to be tested in both scenarios, real and fake metrics

Hi @mayabar I have added tests as per your comments.



Development

Successfully merging this pull request may close these issues.

Add vllm:prefix_cache_hits and vllm:prefix_cache_queries Prometheus counters
