Upgrade github.com/llm-d/llm-d-kv-cache-manager import to v0.3.0#344

Merged
github-actions[bot] merged 1 commit into llm-d:main from vMaroon:kvc-v0.3 on Sep 10, 2025
Conversation

@vMaroon
Member

@vMaroon vMaroon commented Sep 10, 2025

Summary

This PR upgrades the github.com/llm-d/llm-d-kv-cache-manager import to the v0.3.0 release.

Note that this does not yet utilize the new preprocessing package for chat-completions support; that will follow in v0.4.0.
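The change itself amounts to bumping the module version in go.mod. A minimal sketch of the relevant fragment (the consuming module path below is hypothetical, and the previous version is assumed to be v0.2.1 based on the benchmark baseline):

```go
// go.mod (fragment) -- hypothetical consuming module, for illustration only
module github.com/llm-d/llm-d-inference-scheduler

go 1.22

require (
	// upgraded from the assumed prior v0.2.1 to the v0.3.0 release
	github.com/llm-d/llm-d-kv-cache-manager v0.3.0
)
```

In practice this would be produced by running `go get github.com/llm-d/llm-d-kv-cache-manager@v0.3.0` followed by `go mod tidy`.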

Benchmarking & Tests

  • Overall, v0.3.0 includes significant test-coverage boosts
  • Benchmarking data below is from the 37-capacity setup (open the linked context for details)

Summary across QPS

| Experiment | Output toks/s | Requests/s | Success Rate | TTFT p90 (s) | TTFT mean (s) | ITL mean (s) | ITL p50/p90 (s) |
|---|---|---|---|---|---|---|---|
| precise v0.3.0-rc1 | 5533.7 | 6.920 | 100.00% | 0.222 | 0.169 | 0.019 | 0.0000/0.094 |
| precise v0.2.1 | 5650.0 | 6.914 | 100.00% | 0.275 | 0.193 | 0.020 | 0.0000/0.085 |

EPP Queue and KV Cache Metrics Summary

| Experiment | Wait Queue (mean/p90/max) | KV Cache % (mean/p90/max) | Pods | Data Points |
|---|---|---|---|---|
| precise v0.2.1 | 0.2/0/11 | 49.8/70.5/76.9 | 4 | 20824 |
| precise v0.3.0-rc1 | 0.0/0/1 | 49.0/67.0/83.9 | 4 | 52920 |

Cool graphs

[Figures: ttft_tpot_throughput_tripanel, ttft_p90_vs_qps]

Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>
@nirrozenbaum
Collaborator

/lgtm
/approve

@github-actions github-actions bot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 10, 2025
@github-actions github-actions bot merged commit b78eefc into llm-d:main on Sep 10, 2025
5 checks passed
nirrozenbaum pushed a commit that referenced this pull request Sep 17, 2025
Signed-off-by: Maroon Ayoub <maroon.ayoub@ibm.com>