
cache config llm-d#5

Draft
DolevAdas wants to merge 2 commits into llm-d-incubation:main from DolevAdas:cache-config

Conversation

@DolevAdas

Modify cache memory settings in existing llm-d deployments without a full redeployment. Adjust GPU memory utilization, KV cache capacity, shared memory, block size, and context length to optimize performance for different workload patterns.
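The settings listed above correspond to cache-related engine flags on the vLLM model server that llm-d deploys. As a minimal sketch of how such a change could be rendered into server arguments, the snippet below maps a small config object onto vLLM's CLI flags (`--gpu-memory-utilization`, `--block-size`, `--max-model-len`, `--swap-space` are real vLLM flags; the `CacheConfig` dataclass and its defaults are assumptions for illustration, not part of llm-d):

```python
# Hypothetical sketch: render cache-related settings into vLLM CLI flags
# for an llm-d model-server container. CacheConfig is illustrative only.
from dataclasses import dataclass


@dataclass
class CacheConfig:
    gpu_memory_utilization: float = 0.90  # fraction of GPU memory for weights + KV cache
    block_size: int = 16                  # tokens per KV-cache block
    max_model_len: int = 8192             # maximum context length
    swap_space_gb: int = 4                # CPU swap space for KV cache, in GiB


def to_vllm_args(cfg: CacheConfig) -> list[str]:
    """Translate a CacheConfig into vLLM serve CLI flags."""
    return [
        f"--gpu-memory-utilization={cfg.gpu_memory_utilization}",
        f"--block-size={cfg.block_size}",
        f"--max-model-len={cfg.max_model_len}",
        f"--swap-space={cfg.swap_space_gb}",
    ]


# Example: lower memory utilization and shrink the context window
print(to_vllm_args(CacheConfig(gpu_memory_utilization=0.85, max_model_len=4096)))
```

Applying a change like this in place (e.g. by patching the container args and letting the pods roll) is what avoids a full redeployment; the shared-memory setting would instead be adjusted on the pod's `/dev/shm` volume rather than via an engine flag.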

Signed-off-by: Dolev Adas <dolev.adas@ibm.com>
@DolevAdas changed the title from "cache-config-llm-d skill" to "cache config llm-d" on Apr 5, 2026
Signed-off-by: Dolev Adas <dolev.adas@ibm.com>


