
docs: Add setup documentation for examples/kv-cache-index #38

Merged
vMaroon merged 2 commits into llm-d:main from buraksekili:docs/example-kv-cache-index
Jun 26, 2025

Conversation

@buraksekili
Contributor

This PR re-adds the setup docs for examples/kv-cache-index/main.go, with minor updates:

  • explains the environment variables needed by examples/kv-cache-index/main.go
  • parses those environment variables in the code
  • updates context cancellation handling, just in case

@vMaroon
Member

vMaroon commented Jun 6, 2025

Thanks @buraksekili. I think this can be promoted to a deployment doc instead of a phase-1 example, since "phase-1" could be confusing at this stage.

You can port and optionally improve the helm-guide, and perhaps extend it with the deployment of the inference-scheduler if you feel that it serves your needs. What do you think?

@buraksekili
Contributor Author

Sure, sounds good to me! Just to confirm, does that mean I can also use llm-d-inference-sim here for demonstrating the kv-cache-manager examples?

@vMaroon
Member

vMaroon commented Jun 6, 2025

The simulator does not simulate KV-cache events yet. But in all cases, the helm-chart covers the vLLM deployments with LMCache and the Redis instance.

Then the kv-cache-index example can be used, but the model name must be aligned with that configured in the helm-chart.
It would be more interesting to have a simple router that utilizes the scorer example instead, but that can follow up (should be tracked in an issue). Does this make sense to you?

@buraksekili
Contributor Author

Thank you, @vMaroon, for your help here! I've just updated the docs and the example manager according to the charts.

Your suggestion about extending the current charts with the scheduler sounds good to me. However, due to limited time, I won’t be able to look into it until later next week. If no one picks it up by then, I’ll be happy to take a look.

Member

@vMaroon vMaroon left a comment


Thank you for this contribution. This will certainly help other users!
Added some comments.

@buraksekili
Contributor Author

Thanks @vMaroon, I've updated the PR based on your suggestions. Could you please have a look when you have availability?

@buraksekili buraksekili force-pushed the docs/example-kv-cache-index branch from 4f0d841 to 3e2f6be Compare June 16, 2025 08:34
Member

@vMaroon vMaroon left a comment


Thank you for the update, apologies for the delay in reviewing. Minor changes left then ready to go!

@buraksekili buraksekili force-pushed the docs/example-kv-cache-index branch from 908ede4 to a22a39f Compare June 22, 2025 13:52
@buraksekili
Contributor Author

Thank you @vMaroon for the help! I have updated the code accordingly. Please have a look.

Member

@vMaroon vMaroon left a comment


Thanks, apologies for the long process, final set of suggestions then merging.

@buraksekili buraksekili force-pushed the docs/example-kv-cache-index branch from 2e3d54a to 770ada5 Compare June 25, 2025 06:50
…to parse redis-related environment variables

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

move docs to deployment subfolder for brevity

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

use the lower-case version of the vllm model label in the vllm deployment metadata, to prevent Kubernetes issues with models that contain uppercase letters in their names

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

implement suggested changes according to the review

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

update release name for vllm helm deployment, to align it with the purpose of the deployment

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

exit in case of errors in example

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

fix linter

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

Merge

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

add docs about kv-cache-index setup, and allow the example code base to parse redis-related environment variables

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

Apply suggestions from code review

Co-authored-by: Maroon Ayoub <Maroonay@gmail.com>
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

fix duplicate package

Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>
@buraksekili buraksekili force-pushed the docs/example-kv-cache-index branch from 770ada5 to 33d1260 Compare June 25, 2025 06:53
Member

@vMaroon vMaroon left a comment


Thank you for your contribution!

LGTM

@vMaroon vMaroon merged commit 2d3b68d into llm-d:main Jun 26, 2025
1 check passed

2 participants