
feat(kvevents): parse HMA KV event metadata#612

Open
sagearc wants to merge 2 commits into llm-d:main from sagearc:hma-kv-event-metadata


@sagearc sagearc commented May 26, 2026

Summary

Parse and carry vLLM HMA KV event metadata in llm-d-kv-cache.

This adds support for the vLLM fields introduced for hybrid KV cache groups in vLLM PR #40984:

class BlockStored(KVCacheEvent):
    ...
    group_idx: int | None = None
    # Store events carry cache-spec metadata so consumers can classify and
    # filter groups as they are learned. Remove events only need group_idx+hash.
    kv_cache_spec_kind: str | None = None
    kv_cache_spec_sliding_window: int | None = None

class BlockRemoved(KVCacheEvent):
    ...
    group_idx: int | None = None

No routing, indexing, filtering, or scoring behavior changes are included in this PR.

Motivation

vLLM emits one KV event per cache group for HMA models. BlockStored carries semantic group metadata while BlockRemoved carries only the structural group identity.

llm-d-kv-cache needs to preserve this metadata before group-aware indexing or HMA-aware scoring can be implemented in follow-up work. This PR is intended as the small base change for that work.

Notes

BlockRemoved intentionally does not carry kv_cache_spec_kind in current vLLM. The consumer is expected to learn group semantics from BlockStored and use block_hash with group_idx to identify removed group entries.

This PR only preserves the metadata. It does not yet interpret it.
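As an illustration of the consumer behavior the BlockRemoved note describes (not code from this PR), the sketch below learns group semantics from store events and removes entries using only group_idx plus block hash. All names here (GroupTracker, GroupInfo, OnStored, OnRemoved) are hypothetical, and representing block hashes as uint64 is an assumption made for brevity.

```go
package main

// groupKey identifies one cached block inside one KV cache group.
// Assumption: block hashes fit in a uint64.
type groupKey struct {
	groupIdx  uint32
	blockHash uint64
}

// GroupInfo is the semantic metadata that only store events carry.
type GroupInfo struct {
	SpecKind      string
	SlidingWindow *uint32 // nil when the store event did not set it
}

// GroupTracker is a hypothetical consumer-side index.
type GroupTracker struct {
	groups map[uint32]GroupInfo  // semantics learned per group_idx
	blocks map[groupKey]struct{} // live (group_idx, block_hash) entries
}

func NewGroupTracker() *GroupTracker {
	return &GroupTracker{
		groups: make(map[uint32]GroupInfo),
		blocks: make(map[groupKey]struct{}),
	}
}

// OnStored records the group's semantics, if the event carried any,
// and registers every stored block under that group.
func (t *GroupTracker) OnStored(groupIdx uint32, info *GroupInfo, hashes []uint64) {
	if info != nil {
		t.groups[groupIdx] = *info
	}
	for _, h := range hashes {
		t.blocks[groupKey{groupIdx: groupIdx, blockHash: h}] = struct{}{}
	}
}

// OnRemoved needs no spec metadata: group_idx plus hash is enough to
// find the entry. It returns how many entries were actually removed.
func (t *GroupTracker) OnRemoved(groupIdx uint32, hashes []uint64) int {
	removed := 0
	for _, h := range hashes {
		k := groupKey{groupIdx: groupIdx, blockHash: h}
		if _, ok := t.blocks[k]; ok {
			delete(t.blocks, k)
			removed++
		}
	}
	return removed
}

// Info returns the semantics previously learned for a group, if any.
func (t *GroupTracker) Info(groupIdx uint32) (GroupInfo, bool) {
	info, ok := t.groups[groupIdx]
	return info, ok
}

// Has reports whether a block is currently tracked for a group.
func (t *GroupTracker) Has(groupIdx uint32, hash uint64) bool {
	_, ok := t.blocks[groupKey{groupIdx: groupIdx, blockHash: h(hash)}]
	return ok
}

// h is an identity helper kept only to make the key construction explicit.
func h(v uint64) uint64 { return v }
```

Keying entries by the (group_idx, block_hash) pair means the same hash stored under two groups is tracked twice, so removing it from one group leaves the other entry intact.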

Related:

Copilot AI review requested due to automatic review settings May 26, 2026 17:48
@github-actions github-actions Bot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) May 26, 2026
@github-actions github-actions Bot requested review from hyeongyun0916 and yankay May 26, 2026 17:48

Copilot AI left a comment


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds support for vLLM HMA (KV cache group metadata) in KV-cache event parsing so downstream consumers can attribute events to cache groups and understand cache semantics.

Changes:

  • Extend BlockStoredEvent/BlockRemovedEvent with group metadata (group index + cache spec kind + sliding window size) and persist BlockSize.
  • Update vLLM adapter to decode new trailing fields (HMA metadata) with backward/forward compatibility.
  • Add/extend tests covering valid/invalid HMA metadata decoding and group index handling.
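To make the compatibility point in the list above concrete, here is a minimal sketch of decoding optional trailing fields from a positionally encoded event. It is not the adapter's actual code: decodeHMATrailing, HMAMetadata, and the start index are hypothetical, the toUint32 shown is only a guess at what such a helper might do, and the set of numeric types a decoder yields is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// HMAMetadata holds the optional trailing fields.
// A nil pointer means the field was absent or explicitly null.
type HMAMetadata struct {
	GroupIdx      *uint32
	SpecKind      *string
	SlidingWindow *uint32
}

// toUint32 converts a decoded numeric value to uint32 with range checks.
// Assumption: the decoder yields one of the integer types handled here.
func toUint32(v any) (uint32, error) {
	switch n := v.(type) {
	case int:
		return toUint32(int64(n))
	case int64:
		if n < 0 || n > math.MaxUint32 {
			return 0, fmt.Errorf("value %d out of uint32 range", n)
		}
		return uint32(n), nil
	case uint32:
		return n, nil
	case uint64:
		if n > math.MaxUint32 {
			return 0, fmt.Errorf("value %d out of uint32 range", n)
		}
		return uint32(n), nil
	default:
		return 0, fmt.Errorf("unexpected numeric type %T", v)
	}
}

// decodeHMATrailing reads up to three fields beginning at index start
// (start must be non-negative).
// Backward compatibility: a shorter array from an older producer simply
// leaves the missing fields nil. Forward compatibility: elements beyond
// the three known fields are ignored.
func decodeHMATrailing(fields []any, start int) (HMAMetadata, error) {
	var md HMAMetadata
	if len(fields) > start && fields[start] != nil {
		g, err := toUint32(fields[start])
		if err != nil {
			return md, fmt.Errorf("group_idx: %w", err)
		}
		md.GroupIdx = &g
	}
	if len(fields) > start+1 && fields[start+1] != nil {
		s, ok := fields[start+1].(string)
		if !ok {
			return md, fmt.Errorf("kv_cache_spec_kind: unexpected type %T", fields[start+1])
		}
		md.SpecKind = &s
	}
	if len(fields) > start+2 && fields[start+2] != nil {
		w, err := toUint32(fields[start+2])
		if err != nil {
			return md, fmt.Errorf("kv_cache_spec_sliding_window: %w", err)
		}
		md.SlidingWindow = &w
	}
	return md, nil
}
```

Treating "absent" and "null" identically keeps old and new producers on one code path, while a present but malformed value still surfaces as an error rather than being silently dropped.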

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| pkg/kvevents/events.go | Adds KV cache spec kind type/constants and extends event structs with block size + group metadata fields. |
| pkg/kvevents/engineadapter/vllm_adapter.go | Decodes/stores block_size and new HMA trailing fields; adds toUint32 helper. |
| pkg/kvevents/engineadapter/vllm_adapter_test.go | Adds tests for HMA metadata parsing, group_idx parsing, and invalid metadata errors. |
| pkg/kvevents/engineadapter/sglang_adapter.go | Populates BlockSize for SGLang block-stored events for parity. |


Comment thread pkg/kvevents/engineadapter/vllm_adapter.go Outdated
Comment thread pkg/kvevents/engineadapter/vllm_adapter.go
Comment thread pkg/kvevents/engineadapter/vllm_adapter.go
@sagearc sagearc changed the title parse hma kv event metadata feat(kvevents): parse HMA kv event metadata May 26, 2026
@sagearc sagearc changed the title feat(kvevents): parse HMA kv event metadata feat(kvevents): parse HMA KV event metadata May 26, 2026
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
@sagearc sagearc force-pushed the hma-kv-event-metadata branch from 09c490c to bd0172a Compare May 26, 2026 18:04
Signed-off-by: Sage Ahrac <sagiahrak@gmail.com>
