docs: Add a setup documentation about examples/kv-cache-index #38
vMaroon merged 2 commits into llm-d:main
Conversation
Thanks @buraksekili. I think this can be promoted to a deployment doc instead of a phase-1 example, since "phase-1" could be confusing at this stage. You can port and optionally improve the helm-guide, and perhaps extend it with the deployment of the inference-scheduler if you feel that it serves your needs. What do you think?
Sure, sounds good to me! Just to confirm, does that mean I can also use llm-d-inference-sim here for demonstrating the kv-cache-manager examples?
The simulator does not simulate KV-cache events yet. In any case, the helm-chart covers the vLLM deployments with LMCache and the Redis instance. The kv-cache-index example can then be used, but its model name must match the one configured in the helm-chart.
Thank you, @vMaroon, for your help here! I've just updated the docs and the example manager according to the charts. Your suggestion about extending the current charts with the scheduler sounds good to me. However, due to limited time, I won’t be able to look into it until later next week. If no one picks it up by then, I’ll be happy to take a look.
vMaroon
left a comment
Thank you for this contribution. This will certainly help other users!
Added some comments.
Thanks @vMaroon, I've updated the PR based on your suggestions. Could you please have a look at the PR when you have availability?
4f0d841 to 3e2f6be
vMaroon
left a comment
Thank you for the update, and apologies for the delay in reviewing. Only minor changes are left, then this is ready to go!
908ede4 to a22a39f
Thank you @vMaroon for the help! I have updated the code accordingly. Please have a look.
vMaroon
left a comment
Thanks, and apologies for the long process. This is the final set of suggestions, then merging.
2e3d54a to 770ada5
…to parse redis related environment variables
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

move docs to deployment subfolder for brevity
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

use lower-case version of the vllm model label in the vllm deployment metadata, to prevent Kubernetes issues with models that contain uppercase letters in their names
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

implement suggested changes according to the review
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

update release name for vllm helm deployment, to make it align with the purpose of the deployment
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

exit in case of errors in example
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

fix linter
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

Merge
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

add docs about kv-cache-index setup, and allow the example code base to parse redis related environment variables
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

Apply suggestions from code review
Co-authored-by: Maroon Ayoub <Maroonay@gmail.com>
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>

fix duplicate package
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>
770ada5 to 33d1260
Signed-off-by: Burak Sekili <32663655+buraksekili@users.noreply.github.com>
vMaroon
left a comment
Thank you for your contribution!
LGTM
This PR re-adds the setup docs related to examples/kv-cache-index/main.go, with minor updates: