Skip to content

feat:add support to invalidate KV cache via AllBlocksCleared event#437

Open
yash9263 wants to merge 14 commits intollm-d:mainfrom
yash9263:feat-invalidate-kv-cache
Open

feat:add support to invalidate KV cache via AllBlocksCleared event#437
yash9263 wants to merge 14 commits intollm-d:mainfrom
yash9263:feat-invalidate-kv-cache

Conversation

@yash9263
Copy link
Copy Markdown

@yash9263 yash9263 commented Mar 18, 2026

Issue: #396

This PR adds support to Invalidate KV cache via the AllBlocksCleared event.

  • Updated event processing in vllm_adapter.go and pool.go to parse and handle the AllBlocksCleared event.
  • Added Clear method to the Index interface and implemented it for all the index types (CostAwareMemoryIndex, InMemoryIndex, RedisIndex).
  • Introduced podToRequestKeys reverse index: podIdentifier -> []engineKeyToRequestKeyMapping for all the indexes.
    • Clear: Looks up only the request keys for the target pod via podToRequestKeys
    • Iterates those keys, and removes matching entries, prunes empty hashes + their engine-key mappings
    • Updates the reverse index: drops the pod entirely if DeviceTier is empty, or trims to still-alive request keys if a specific tier was targeted

Testing

  • pkg/kvcache/kvblock/index_test.go

Examples:

  • Updated the kv_events/offline and valkey_example to simulate the Clear event.

@github-actions
Copy link
Copy Markdown

Unsigned commits detected! Please sign your commits.

For instructions on how to set up GPG/SSH signing and verify your commits, please see GitHub Documentation.

@yash9263 yash9263 force-pushed the feat-invalidate-kv-cache branch from 61fa5d4 to f1691b7 Compare March 19, 2026 07:07
@yash9263 yash9263 force-pushed the feat-invalidate-kv-cache branch from f1691b7 to ef9c002 Compare March 19, 2026 07:15
@vMaroon
Copy link
Copy Markdown
Member

vMaroon commented Mar 19, 2026

Hi @yash9263 - thank you for the contribution. One detail that may have not been clear from the issue: an event coming from a certain endpoint should only clear the latter's entries.

E.g., if the index state maps pod A having blocks [x, y, z], and pod B having blocks [x, y], then an AllBlocksCleared event from pod A should remove the mappings of [x, y] -> pod A while keeping pod B, and completely remove z since now it maps to no one.

@yash9263
Copy link
Copy Markdown
Author

E.g., if the index state maps pod A having blocks [x, y, z], and pod B having blocks [x, y], then an AllBlocksCleared event from pod A should remove the mappings of [x, y] -> pod A while keeping pod B, and completely remove z since now it maps to no one.

Hi @vMaroon, apologies for the oversight and not clarifying this earlier.

Given the indexes are structured engineKey -> requestKey -> podIdentifier, to remove all the blocks associated with a pod, it will require traversing each block to remove the pod mappings and remove the engineKey if the requestKey is empty or removed. For the Redis index, it may require multiple round-trips.

Should we also consider deviceTier while evicting blocks for the given podIdentifier here as well? Since it was part of the original event struct:

  • only remove blocks for the specified device tier
  • Remove blocks across gpu and cpu, if no device tier is provided?

Would it be viable to introduce a reverse index pod -> engineKey to make evicting all blocks for a pod more efficient?
Or I can proceed with the O(n) scan and evict approach.

@yash9263 yash9263 marked this pull request as draft March 20, 2026 19:46
@yash9263 yash9263 force-pushed the feat-invalidate-kv-cache branch from 5215b68 to be894f6 Compare March 24, 2026 19:49
@yash9263 yash9263 marked this pull request as ready for review March 24, 2026 19:55
@yash9263
Copy link
Copy Markdown
Author

@vMaroon, ready for review.

@github-actions github-actions bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Mar 31, 2026
Signed-off-by: yashwant <yashwant8530@gmail.com>
@yash9263 yash9263 requested a review from gyliu513 April 1, 2026 20:10
@yash9263 yash9263 requested a review from gyliu513 April 2, 2026 12:38
@gyliu513
Copy link
Copy Markdown
Contributor

gyliu513 commented Apr 2, 2026

@yash9263 I think you need rebase your PR to latest code to resolve conflict?

Copy link
Copy Markdown
Contributor

@gyliu513 gyliu513 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good progress, thanks @yash9263 !

@yash9263 yash9263 requested a review from gyliu513 April 3, 2026 16:19
@gyliu513
Copy link
Copy Markdown
Contributor

gyliu513 commented Apr 3, 2026

@vMaroon @sagearc @yankay I think this is in good shape, it is great if you guys can check if this can be merged, thanks!

@yash9263 yash9263 requested a review from gyliu513 April 4, 2026 17:57
Copy link
Copy Markdown
Contributor

@gyliu513 gyliu513 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Thanks @yash9263 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants