KV-events abstraction by NaomiEisen · Pull Request #356 · llm-d/llm-d-kv-cache

NaomiEisen · 2026-02-25T00:10:26Z

Overview

This PR introduces abstraction layers for KV-cache events. The refactoring separates transport protocols, serialization, and engine-specific event structure into distinct layers.

See design docs for full review.

Key Changes

New Abstraction Layers

Transport Layer (pkg/kvevents/transport/): Abstracts communication protocols.
Decoder Layer (pkg/kvevents/decoder/): Abstracts serialization formats.
Engine Adapter Layer (pkg/kvevents/engineadapter/): Converts engine-specific events to generic events.

Event Processing Refactor

Moved event processing logic into event structures: Each event type (BlockStoredEvent, BlockRemovedEvent, AllBlocksClearedEvent) now implements its own Process() method.
Removed double marshal/unmarshal: Events are decoded once by the adapter and passed as structured data to the pool.
Added ExtraKeys field to support vLLM's new event format (currently unused).
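
As an illustration of events carrying their own processing logic, the sketch below gives each event type a Process() method behind a shared interface. The Process signature, the Index type, the ExtraKeys element type, and the clearing semantics are all assumptions made for the example; the PR's actual definitions may differ.

```go
package main

import "fmt"

// Index is a hypothetical stand-in for the block index the pool updates:
// block hash -> set of pod IDs holding that block.
type Index map[uint64]map[string]bool

// Event is the assumed common interface; the real signature may differ.
type Event interface {
	Process(idx Index, podID string)
}

type BlockStoredEvent struct {
	BlockHashes []uint64
	ExtraKeys   []any // carried along but unused; element type is an assumption
}

type BlockRemovedEvent struct{ BlockHashes []uint64 }

type AllBlocksClearedEvent struct{}

// Process records that podID now holds every block in the event.
func (e *BlockStoredEvent) Process(idx Index, podID string) {
	for _, h := range e.BlockHashes {
		if idx[h] == nil {
			idx[h] = map[string]bool{}
		}
		idx[h][podID] = true
	}
}

// Process drops podID from each listed block and prunes empty entries.
func (e *BlockRemovedEvent) Process(idx Index, podID string) {
	for _, h := range e.BlockHashes {
		delete(idx[h], podID)
		if len(idx[h]) == 0 {
			delete(idx, h)
		}
	}
}

// Process drops podID from every block (assumed semantics for this sketch).
func (e *AllBlocksClearedEvent) Process(idx Index, podID string) {
	for h, pods := range idx {
		delete(pods, podID)
		if len(pods) == 0 {
			delete(idx, h)
		}
	}
}

func main() {
	idx := Index{}
	events := []Event{
		&BlockStoredEvent{BlockHashes: []uint64{1, 2}},
		&BlockRemovedEvent{BlockHashes: []uint64{1}},
	}
	// The consumer needs no type switch: each event applies itself.
	for _, ev := range events {
		ev.Process(idx, "pod-a")
	}
	fmt.Println(idx)
}
```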

Testing

Tested on:

Unit tests: pkg/kvevents/engineadapter/vllm_adapter_test.go
tests/integration/kv_events_test.go
pkg/kvevents/subscriber_manager_test.go
examples/kv_events/online

In progress: Performance tests (benchmarking with llm-d stack)

github-actions · 2026-02-25T00:10:35Z

Unsigned commits detected! Please sign your commits.

For instructions on how to set up GPG/SSH signing and verify your commits, please see GitHub Documentation.

NaomiEisen · 2026-02-25T00:19:07Z

examples/helper/events.go

-		Medium:          &medium,
-		LoraName:        nil,
+
+	// Create event in vLLM msgpack array format: [tag, hashes, parent, tokens, blockSize, loraID, medium, loraName]


Previously, test events were created as specific event structs and then converted to a tagged-union format via ToTaggedUnion(). This tagged union matched the exact format vLLM sends to llm-d. The tagged-union structure was necessary because of the double marshal/unmarshal: a first pass to extract the event type tag, and a second for the actual event data. Since events are now decoded only once, I removed ToTaggedUnion() entirely.
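
A rough sketch of the single-pass idea, using the positional layout quoted in the diff comment above: the tag and the fields are read from the same already-decoded array. The msgpack step is omitted, so the element types here are plain Go values rather than whatever a real decoder would yield. The "BlockStored" tag string and the helper names are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Field positions, following the array layout quoted in the diff comment:
// [tag, hashes, parent, tokens, blockSize, loraID, medium, loraName].
const (
	idxTag = iota
	idxHashes
	idxParent
	idxTokens
	idxBlockSize
	idxLoraID
	idxMedium
	idxLoraName
	numFields
)

// blockStored is a hypothetical structured form of the decoded event.
type blockStored struct {
	Hashes    []uint64
	BlockSize int
	Medium    string
}

// newBlockStoredArray builds a test event directly in positional form.
// The "BlockStored" tag string is an assumption.
func newBlockStoredArray(hashes []uint64, tokens []uint32, blockSize int, medium string) []any {
	fields := make([]any, numFields)
	fields[idxTag] = "BlockStored"
	fields[idxHashes] = hashes
	fields[idxTokens] = tokens
	fields[idxBlockSize] = blockSize
	fields[idxMedium] = medium
	// parent, loraID and loraName stay nil in this sketch.
	return fields
}

// parseBlockStored reads the tag and the payload from the same array,
// so nothing has to be decoded a second time.
func parseBlockStored(fields []any) (*blockStored, error) {
	if len(fields) < numFields {
		return nil, fmt.Errorf("expected %d fields, got %d", numFields, len(fields))
	}
	if tag, _ := fields[idxTag].(string); tag != "BlockStored" {
		return nil, fmt.Errorf("unexpected tag %v", fields[idxTag])
	}
	hashes, ok := fields[idxHashes].([]uint64)
	if !ok {
		return nil, errors.New("hashes: want []uint64")
	}
	size, ok := fields[idxBlockSize].(int)
	if !ok {
		return nil, errors.New("blockSize: want int")
	}
	medium, _ := fields[idxMedium].(string) // optional field
	return &blockStored{Hashes: hashes, BlockSize: size, Medium: medium}, nil
}

func main() {
	ev, err := parseBlockStored(newBlockStoredArray([]uint64{11, 22}, []uint32{1, 2, 3}, 16, "gpu"))
	fmt.Println(ev, err)
}
```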

NaomiEisen · 2026-02-25T00:24:03Z

examples/kv_events/vllm/vllm_kv_cache_demo.py

        kv_events_config=kv_events_config,
        block_size=16,
-        prefix_caching_hash_algo="sha256_cbor",
+        prefix_caching_hash_algo="sha256_cbor_64bit",


Had this error when running the test:
INFO 02-24 02:10:17 [__init__.py:235] Automatically detected platform cuda.
usage: vllm serve [model_tag] [options]
vllm serve: error: argument --prefix-caching-hash-algo: invalid choice: 'sha256_cbor' (choose from builtin, sha256, sha256_cbor_64bit)

NaomiEisen · 2026-02-25T00:26:08Z

pkg/kvevents/engineadapter/vllm_adapter.go

+// getHashAsUint64 converts vLLM hash formats (uint64 or []byte) to uint64.
+// This handles both legacy uint64 hashes and new []byte hashes by taking
+// the last 8 bytes and interpreting them as a big-endian integer.
+func (v *VLLMAdapter) getHashAsUint64(raw any) (uint64, error) {


Maybe this should be a general utility function rather than a vLLM-specific method.
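
For reference, a standalone version of the conversion described in the doc comment could look roughly like this. The function name HashAsUint64 and the zero-padding of inputs shorter than 8 bytes are assumptions; the source only states that the last 8 bytes are read as a big-endian integer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// HashAsUint64 is a hypothetical engine-agnostic helper. It accepts either a
// uint64 hash or a []byte hash and returns a uint64.
func HashAsUint64(raw any) (uint64, error) {
	switch v := raw.(type) {
	case uint64:
		return v, nil
	case []byte:
		if len(v) == 0 {
			return 0, fmt.Errorf("empty hash bytes")
		}
		if len(v) >= 8 {
			// Interpret the last 8 bytes as a big-endian integer.
			return binary.BigEndian.Uint64(v[len(v)-8:]), nil
		}
		// Assumption: shorter inputs are left-padded with zeros.
		var buf [8]byte
		copy(buf[8-len(v):], v)
		return binary.BigEndian.Uint64(buf[:]), nil
	default:
		return 0, fmt.Errorf("unsupported hash type %T", raw)
	}
}

func main() {
	fmt.Println(HashAsUint64(uint64(42)))
	fmt.Println(HashAsUint64([]byte{0xFF, 0, 0, 0, 0, 0, 0, 0, 1}))
	fmt.Println(HashAsUint64("not a hash"))
}
```

Because the helper takes no receiver and touches no adapter state, it could live in a shared package and be reused by adapters for other engines.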

NaomiEisen · 2026-02-25T00:27:46Z

pkg/kvevents/engineadapter/vllm_adapter.go

+// parseVLLMTopic extracts pod ID and model name from vLLM topic format.
+// Expected format: "pod_id@model_name"
+// TODO: Find a way to avoid it
+func parseVLLMTopic(topic string) (podID, modelName string) {


I kept the same logic as before
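
A minimal sketch of this kind of topic parsing, for illustration only. It is not the PR's implementation: the extra ok return value and the handling of a leading "kv@" prefix (as seen in the Helm chart's topic string) are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTopic splits a topic of the form "pod_id@model_name".
// Assumption: an optional "kv@" prefix is stripped first, which means a pod
// literally named "kv" would be misread in the two-part form.
func parseTopic(topic string) (podID, modelName string, ok bool) {
	rest := strings.TrimPrefix(topic, "kv@")
	// Cut at the first "@" so that any later "@" stays in the model name.
	podID, modelName, ok = strings.Cut(rest, "@")
	if !ok || podID == "" || modelName == "" {
		return "", "", false
	}
	return podID, modelName, true
}

func main() {
	fmt.Println(parseTopic("kv@10.0.0.1@example/model"))
	fmt.Println(parseTopic("pod-a@example/model"))
	fmt.Println(parseTopic("missing-separator"))
}
```

Returning an explicit ok flag lets the caller drop malformed topics instead of proceeding with empty identifiers.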

NaomiEisen · 2026-02-25T00:30:45Z

pkg/kvevents/engineadapter/vllm_adapter.go

+	return &events.AllBlocksClearedEvent{}, nil
+}
+
+// TODO: not sure if it best to keep or remove these


I'm not sure whether it's better to abstract the inner structures from the subscriber (so it only uses the adapter) or to make it use those methods directly from the transport

NaomiEisen · 2026-02-25T00:39:52Z

examples/kv_events/pod_reconciler/pod_reconciler.go

 	}

 	// Check if pod matches our label selector
 	if !r.Config.PodLabelSelector.Matches(labels.Set(pod.Labels)) {


We might need to introduce the inference engine as one of the pod's identifiers.

NaomiEisen · 2026-02-25T00:43:57Z

vllm-setup-helm/templates/deployment.yaml

              {{- if .Values.kvCacheManager.enabled }}
              --kv-events-config "{\"enable_kv_cache_events\":{{ .Values.kvCacheManager.enabled }},\"publisher\":\"zmq\",\"endpoint\":\"{{ include "chart.kvCacheManagerServiceUrl" . }}\",\"topic\":\"kv@${POD_IP}@{{ .Values.vllm.model.name }}\"}" \
-              --prefix-caching-hash-algo sha256_cbor \
+              --prefix-caching-hash-algo sha256_cbor_64bit \


Had this error:
INFO 02-24 02:10:17 [__init__.py:235] Automatically detected platform cuda.
usage: vllm serve [model_tag] [options]
vllm serve: error: argument --prefix-caching-hash-algo: invalid choice: 'sha256_cbor' (choose from builtin, sha256, sha256_cbor_64bit)

NaomiEisen · 2026-02-25T00:48:37Z

pkg/kvevents/subscriber.go

@@ -0,0 +1,145 @@
+// Copyright 2025 The llm-d Authors.


This file is very similar to the previous zmq_subscriber.go. I'm not sure why it's not just showing as 'renamed' + the changed lines. If it's difficult to compare, I can try to fix it

NaomiEisen added 6 commits February 23, 2026 17:08

Refactor kvevents: Introduce decoder, transport and engine adapter ab…

f5ae878

…stractions for multi-engine support. Modify Pool and Subscribers to use new layers.

Add ExtraKey field to vLLM adapter

10d4d53

Update prefix-caching-hash-algo from sha256_cbor to sha256_cbor_64bit

1289fc4

Remove TODO

404c89c

update vLLM adapter tests to include ExtraKeys field

c55e86c

remove note

63100f5

NaomiEisen requested review from dannyharnik, kfirtoledo and vMaroon as code owners February 25, 2026 00:10

vMaroon requested review from hyeongyun0916, liu-cong, sagearc and yankay February 25, 2026 00:10

NaomiEisen marked this pull request as draft February 25, 2026 00:50

NaomiEisen wants to merge 6 commits into llm-d:main from NaomiEisen:kvevents-abstraction
