-
Notifications
You must be signed in to change notification settings - Fork 638
feat: Emit KV events from vllm KVBM #3586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: krishung5 <[email protected]>
Signed-off-by: krishung5 <[email protected]>
333eb49
to
3d650c2
Compare
return; | ||
} | ||
|
||
// todo: get namespace and component name from the component/config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this TODO for this PR or future follow up?
WalkthroughAdds optional KV event publishing to the block manager: introduces KvEventPublisher wiring from Python bindings and vLLM leader into runtime config; implements DynamoEventManager to publish store/remove events; updates resource initialization to select EventManager from explicit config, runtime publisher, or Null; removes legacy NATS-based dynamo module; adds tests. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor App
participant Leader as vLLM Leader
participant Bld as BlockManagerBuilder
participant Cfg as KvManagerRuntimeConfig
participant Res as Resources
participant EM as EventManager
participant Pub as KvEventPublisher
App->>Leader: initialize()
Leader->>Leader: read env (namespace, component)
Leader->>Leader: get worker_id from DRt lease
Leader->>Pub: try create KvEventPublisher
alt publisher created
Leader->>Bld: kv_event_publisher(Some(Pub))
else failure
Leader->>Bld: kv_event_publisher(None)
end
Bld->>Cfg: build (attach optional publisher)
Cfg->>Res: pass runtime config
alt explicit event_manager in config
Res->>EM: use provided EventManager
else runtime has publisher
Res->>EM: create DynamoEventManager(Pub)
else
Res->>EM: use NullEventManager
end
sequenceDiagram
autonumber
participant Client
participant BM as BlockManager / Registry
participant EM as DynamoEventManager
participant Pub as KvEventPublisher
Note over Client,BM: Store flow
Client->>BM: register_blocks([...])
BM-->>Client: RegistrationHandle(s)
Client-->>BM: drop RegistrationHandle(s)
BM->>EM: publish(handles)
EM->>EM: build KvCacheEvent(Stored)
EM->>Pub: publish(event)
Note over Client,BM: Remove flow
Client-->>BM: drop BlockState
BM->>EM: block_release(handle)
EM->>EM: build KvCacheEvent(Removed)
EM->>Pub: publish(event)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
Pre-merge checks❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (7)
lib/llm/src/block_manager/state.rs (1)
249-254
: Fix mislabeled pool name ("disk" vs "device")Label is unused but confusing in logs. Prefer accuracy.
Apply:
- create_block_pool::<_, _, Metadata>(factory, &resources, "disk")?; + create_block_pool::<_, _, Metadata>(factory, &resources, "device")?;lib/llm/src/block_manager/block/registry.rs (1)
302-303
: Typo in test comment"we shoudl" → "we should".
- // we shoudl get two events when this is dropped, since we never took ownership of the RegistrationHandle + // we should get two events when this is dropped, since we never took ownership of the RegistrationHandlelib/bindings/python/rust/llm/block_manager.rs (1)
295-301
: Builder setter API ergonomicsSetter taking Option<Arc<...>> works, but most call sites will already have an Option. Consider also offering an overload that takes Arc and wraps it, to avoid Some(...) at call sites when the value is present.
impl BlockManagerBuilder { + pub fn kv_event_publisher_arc( + mut self, + publisher: Arc<dynamo_llm::kv_router::publisher::KvEventPublisher>, + ) -> Self { + self.kv_event_publisher = Some(publisher); + self + }lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (1)
127-135
: Consider sourcing namespace/component from config over env varsEnv fallbacks are fine, but if the runtime has canonical config for namespace/component, prefer that and allow env to override. Reduces drift between services.
lib/llm/src/block_manager/pool/managed.rs (1)
88-102
: Fallback to NullEventManager is fine; align logs with upstream selectionThis locally falls back to NullEventManager. Since state/resources.rs may already inject a real DynamoEventManager, consider a trace here noting whether an explicit event manager was provided to avoid confusing logs.
lib/llm/src/block_manager/events.rs (2)
214-253
: Publish(Store) builds event correctly; minor nits
- OK to derive parent_hash from first handle.
- Minor: store_data.clone() is unused after constructing event; cloning is unnecessary.
- let event = KvCacheEvent { - event_id: self.next_event_id(), - data: KvCacheEventData::Stored(store_data.clone()), - }; + let event = KvCacheEvent { + event_id: self.next_event_id(), + data: KvCacheEventData::Stored(store_data), + };
255-277
: Publish(Remove) path OK; atomic ordering can be relaxed
- Event shape and error logging are fine.
- next_event_id uses SeqCst; Relaxed suffices for uniqueness and reduces contention.
- self.event_id_counter - .fetch_add(1, std::sync::atomic::Ordering::SeqCst) + self.event_id_counter + .fetch_add(1, std::sync::atomic::Ordering::Relaxed)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
lib/bindings/python/rust/llm/block_manager.rs
(3 hunks)lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs
(1 hunks)lib/llm/src/block_manager/block/registry.rs
(1 hunks)lib/llm/src/block_manager/config.rs
(1 hunks)lib/llm/src/block_manager/events.rs
(1 hunks)lib/llm/src/block_manager/events/dynamo.rs
(0 hunks)lib/llm/src/block_manager/pool.rs
(1 hunks)lib/llm/src/block_manager/pool/managed.rs
(3 hunks)lib/llm/src/block_manager/state.rs
(1 hunks)lib/llm/src/block_manager/state/resources.rs
(1 hunks)lib/llm/src/kv_router/publisher.rs
(2 hunks)lib/llm/tests/block_manager.rs
(1 hunks)
💤 Files with no reviewable changes (1)
- lib/llm/src/block_manager/events/dynamo.rs
🧰 Additional context used
🧬 Code graph analysis (8)
lib/llm/src/block_manager/config.rs (3)
lib/llm/src/block_manager/events.rs (1)
publisher
(303-305)lib/llm/src/block_manager/pool/managed/state.rs (1)
publisher
(345-347)lib/bindings/python/src/dynamo/_core.pyi (1)
KvEventPublisher
(773-806)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (3)
lib/bindings/python/rust/llm/block_manager.rs (1)
kv_event_publisher
(295-301)lib/bindings/python/rust/lib.rs (1)
component
(706-712)lib/bindings/python/src/dynamo/_core.pyi (1)
KvEventPublisher
(773-806)
lib/llm/src/block_manager/state/resources.rs (2)
lib/bindings/python/rust/llm/block_manager.rs (3)
kv_event_publisher
(295-301)new
(89-202)new
(264-269)lib/llm/src/block_manager/events.rs (5)
new
(53-57)new
(90-95)new
(129-131)new
(177-182)new
(295-301)
lib/llm/src/block_manager/pool/managed.rs (3)
lib/llm/src/block_manager/state/resources.rs (1)
new
(8-86)lib/llm/src/block_manager/block/registry.rs (2)
new
(56-64)new
(81-123)lib/llm/src/block_manager/events.rs (5)
new
(53-57)new
(90-95)new
(129-131)new
(177-182)new
(295-301)
lib/bindings/python/rust/llm/block_manager.rs (2)
lib/bindings/python/src/dynamo/_core.pyi (1)
KvEventPublisher
(773-806)lib/llm/src/block_manager/config.rs (4)
builder
(42-44)builder
(84-86)builder
(131-133)builder
(213-215)
lib/llm/tests/block_manager.rs (2)
lib/llm/src/block_manager/block/registry.rs (5)
create_sequence
(270-288)new
(56-64)new
(81-123)drop
(68-70)drop
(258-260)lib/llm/src/block_manager/events.rs (4)
new_test
(187-198)drop
(69-73)drop
(114-116)publisher
(303-305)
lib/llm/src/block_manager/events.rs (3)
lib/bindings/python/src/dynamo/_core.pyi (1)
KvEventPublisher
(773-806)lib/llm/src/kv_router/publisher.rs (11)
mpsc
(125-125)mpsc
(1133-1133)mpsc
(1157-1157)publish
(169-171)publish
(772-785)publish
(1088-1099)new
(117-167)new
(714-747)new
(763-770)new
(920-922)new
(1075-1083)lib/llm/src/block_manager/state/resources.rs (1)
new
(8-86)
lib/llm/src/kv_router/publisher.rs (2)
lib/runtime/src/transports/nats.rs (1)
fmt
(365-374)lib/llm/src/block_manager/block.rs (3)
fmt
(629-642)fmt
(761-774)fmt
(1004-1013)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: trtllm (arm64)
- GitHub Check: trtllm (amd64)
- GitHub Check: vllm (arm64)
- GitHub Check: vllm (amd64)
- GitHub Check: sglang
- GitHub Check: Build and Test - dynamo
- GitHub Check: tests (launch/dynamo-run)
- GitHub Check: tests (lib/bindings/python)
- GitHub Check: tests (.)
- GitHub Check: tests (lib/runtime/examples)
- GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (11)
lib/llm/src/block_manager/state.rs (1)
17-17
: Import narrowing to EventManager looks rightConsistent with centralizing Null/Dynamo selection elsewhere. No concerns.
lib/llm/src/block_manager/block/registry.rs (1)
233-244
: Scope reduction to crate-private is appropriateKeeps API surface tight while enabling internal use. Good change.
lib/llm/src/block_manager/pool.rs (1)
17-17
: Import cleanup LGTMAligns with event manager selection centralized in resources.
lib/llm/src/kv_router/publisher.rs (1)
54-63
: Debug implementations are safe and helpful
- KvEventSource::Debug redacts JoinHandle internals.
- KvEventPublisher derives Debug for better diagnostics.
Also applies to: 103-103
lib/llm/src/block_manager/state/resources.rs (1)
21-33
: EventManager resolution order is soundPrefers explicit manager, then Dynamo via kv_event_publisher, then Null. Clear logs. Good defaulting.
lib/llm/tests/block_manager.rs (1)
876-1056
: Solid end-to-end tests for Store/Remove pathsExercises single and multi-block publish and release via new_test. Timeouts are reasonable; assertions precise.
lib/bindings/python/rust/llm/block_manager.rs (2)
260-261
: Field added: OK, fits builder defaultsOptional Arc on the builder is reasonable and defaults to None via derive(Default). No issues.
316-327
: kv_event_publisher builder signature verified
KvManagerRuntimeConfigBuilder::kv_event_publisher acceptsOption<Arc<dynamo_llm::kv_router::publisher::KvEventPublisher>>
, matching the passed publisher.lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (1)
127-175
: Robust optional publisher wiring with clear logs — LGTM
- Reads namespace/component from env with sane fallbacks.
- Uses lease id for publisher worker_id; errors degrade gracefully.
Double-check KvEventPublisher::new signature (component, worker_id, page_size, Option<_>) matches the actual API (see script in the other comment).
lib/llm/src/block_manager/pool/managed.rs (1)
599-613
: Test helper mirrors production fallback — OKUsing NullEventManager in tests when none provided keeps tests isolated. Looks good.
lib/llm/src/block_manager/events.rs (1)
144-199
: DynamoEventManager skeleton looks solid
- Encapsulates publisher behind enum; provides real and test constructors.
- next_event_id monotonic; Debug impl fine.
#[builder(default)] | ||
pub kv_event_publisher: Option<Arc<crate::kv_router::publisher::KvEventPublisher>>, | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Add builder ergonomics for kv_event_publisher
Field addition looks good. For ease-of-use, allow passing Arc directly.
- #[builder(default)]
- pub kv_event_publisher: Option<Arc<crate::kv_router::publisher::KvEventPublisher>>,
+ /// Optional KV event publisher enabling real KV cache event emission.
+ #[builder(default, setter(strip_option))]
+ pub kv_event_publisher: Option<Arc<crate::kv_router::publisher::KvEventPublisher>>,
Run to confirm builder usage paths are updated (Python/leader wiring etc.):
🏁 Script executed:
#!/bin/bash
rg -n -C3 'kv_event_publisher\s*\('
Length of output: 3308
Add strip_option setter for kv_event_publisher and update callers
- In
config.rs
, change to#[builder(default, setter(strip_option))]
onkv_event_publisher
. - In
lib/bindings/python/rust/llm/block_manager.rs
andblock_manager/vllm/connector/leader.rs
, removeSome(…)
and pass the rawArc<KvEventPublisher>
to.kv_event_publisher(publisher)
(handleOption
beforehand).
/// Event manager for publishing block registration and removal events. | ||
/// | ||
/// **Defaults to `DynamoEventManager`** which logs KV cache events (STORE/REMOVE). | ||
/// | ||
/// To disable event logging, explicitly provide `NullEventManager::new()`. | ||
#[builder(default = "None", setter(strip_option))] | ||
event_manager: Option<Arc<dyn EventManager>>, | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docstring contradicts behavior (defaults to NullEventManager, not Dynamo)
Comments say default is DynamoEventManager, but code defaults to None here and is resolved to NullEventManager later in build(). Upstream selection of Dynamo happens in state/resources.rs when a kv_event_publisher exists. Please fix the doc to reflect this.
- /// **Defaults to `DynamoEventManager`** which logs KV cache events (STORE/REMOVE).
- ///
- /// To disable event logging, explicitly provide `NullEventManager::new()`.
+ /// If not provided here, the BlockManager will choose:
+ /// - `DynamoEventManager` when a `kv_event_publisher` is present in the runtime config, or
+ /// - `NullEventManager` (no event publishing) otherwise.
+ /// To force-disable logging, explicitly provide `NullEventManager::new()`.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
/// Event manager for publishing block registration and removal events. | |
/// | |
/// **Defaults to `DynamoEventManager`** which logs KV cache events (STORE/REMOVE). | |
/// | |
/// To disable event logging, explicitly provide `NullEventManager::new()`. | |
#[builder(default = "None", setter(strip_option))] | |
event_manager: Option<Arc<dyn EventManager>>, | |
/// Event manager for publishing block registration and removal events. | |
/// | |
/// If not provided here, the BlockManager will choose: | |
/// - `DynamoEventManager` when a `kv_event_publisher` is present in the runtime config, or | |
/// - `NullEventManager` (no event publishing) otherwise. | |
/// To force-disable logging, explicitly provide `NullEventManager::new()`. | |
#[builder(default = "None", setter(strip_option))] | |
event_manager: Option<Arc<dyn EventManager>>, |
🤖 Prompt for AI Agents
In lib/llm/src/block_manager/pool/managed.rs around lines 60 to 67, the
docstring wrongly states the default is DynamoEventManager; update it to state
that the builder field defaults to None (which results in a NullEventManager by
default) and that DynamoEventManager is selected upstream (in
state/resources.rs) when a kv_event_publisher exists; keep the note about
disabling event logging by explicitly providing NullEventManager::new() and
adjust wording to match the actual builder/default behavior.
WalkthroughAdds optional KV event publishing across the block manager stack. Introduces DynamoEventManager and integrates an optional KvEventPublisher into runtime config, builder, and vLLM leader. Selects EventManager at runtime based on provided manager or publisher. Replaces/removes legacy NATS-based event manager. Adds Debug for publisher types and new end-to-end tests. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Leader as KvConnectorLeader
participant Resolver as ComponentResolver
participant Pub as KvEventPublisher (optional)
participant Bldr as BlockManagerBuilder
participant RT as KvManagerRuntimeConfig
participant Pool as ManagedBlockPool
Leader->>Resolver: resolve(namespace, component)
alt Resolve OK
Resolver-->>Leader: Component
Leader->>Leader: derive worker_id, page_size
Leader->>Pub: create publisher (optional)
Note right of Leader: On error, skip publisher
else Resolve Fail
Leader->>Leader: publisher = None
end
Leader->>Bldr: kv_event_publisher(publisher)
Bldr->>RT: build() with optional kv_event_publisher
RT-->>Pool: runtime config
Pool->>Pool: select EventManager<br/>- Provided<br/>- else DynamoEventManager if RT.publisher<br/>- else NullEventManager
Pool-->>Leader: BlockManager ready
sequenceDiagram
autonumber
participant Client
participant BM as BlockManager
participant EM as DynamoEventManager
participant Pub as KvEventPublisher
participant Indexer as KV Indexer
Client->>BM: register blocks
BM->>EM: publish(handles)
EM->>Pub: publish Stored(KvCacheStoreData)
Pub-->>Indexer: event
Client--x BM: drop RegistrationHandle
BM->>EM: block_release(handle)
EM->>Pub: publish Removed(KvCacheRemoveData)
Pub-->>Indexer: event
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
Pre-merge checks❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
lib/llm/src/kv_router/publisher.rs (1)
456-475
: Bounds check to prevent slice panic on malformed eventsSlicing token_ids without verifying length can panic if payloads are inconsistent. Add a guard before slicing.
- let tokens = &token_ids[token_offset..(token_offset + *num_tokens_it as usize)]; + let end = token_offset + (*num_tokens_it as usize); + if end > token_ids.len() { + if warning_count.fetch_add(1, Ordering::Relaxed) < 3 { + tracing::warn!( + expected=%*num_tokens_it, + available=%(token_ids.len().saturating_sub(token_offset)), + "Insufficient token_ids for block; dropping remaining blocks" + ); + } + break; + } + let tokens = &token_ids[token_offset..end];
🧹 Nitpick comments (7)
lib/llm/src/kv_router/publisher.rs (3)
125-126
: Unbounded mpsc can OOM under backpressureZMQ/source → processor uses unbounded channel. If publisher (NATS) stalls, memory can grow unbounded. Prefer bounded mpsc with a backpressure strategy (drop oldest, coalesce batches, or spawn bounded worker pool).
Example change at creation site:
- let (tx, rx) = mpsc::unbounded_channel::<KvCacheEvent>(); + let (tx, rx) = mpsc::channel::<KvCacheEvent>(1024);And update publish signature to map SendError to TrySendError as needed.
Also applies to: 199-201
153-159
: Connect failure is silent to callers; consider surfacing readiness/errorsIf NATS connect fails, the task returns, rx is dropped, and future publish() calls error with SendError—hard to trace. Consider:
- Await connect before returning from new(), or
- Provide a readiness oneshot/flag, or
- Add retry/backoff on connect.
Minimal improvement:
- Emit a warn/error metric and store an internal “failed” state to log once per publish() error.
499-506
: Signed-to-u64 cast may admit negative valuesCasting negative i64 to u64 will wrap. If negatives are unexpected, reject or log-and-skip to avoid surprising hashes.
lib/llm/src/block_manager/config.rs (1)
37-39
: New kv_event_publisher field: OKOptional Arc is fine and backward-compatible. Consider documenting lifetimes/ownership and whether multiple managers may share one publisher. If coupling is a concern, you could hide the concrete type behind a trait in the future.
lib/llm/src/block_manager/state/resources.rs (1)
21-32
: Event manager selection logic looks solidExplicit > Dynamo (when publisher present) > Null fallback is clear. Consider downgrading these to debug to avoid noisy info logs during startup.
lib/llm/src/block_manager/pool/managed.rs (1)
96-102
: Explicit NullEventManager fallback is correct; consider logging choiceOptional: add a debug/info log stating which manager is used to aid troubleshooting.
- let event_manager = if let Some(event_manager) = event_manager_opt { + let event_manager = if let Some(event_manager) = event_manager_opt { event_manager } else { - crate::block_manager::events::NullEventManager::new() + tracing::info!("ManagedBlockPool: using NullEventManager (no event publishing)"); + crate::block_manager::events::NullEventManager::new() };lib/llm/src/block_manager/events.rs (1)
200-204
: SeqCst is unnecessary; Relaxed is sufficient for a monotonic counterNo cross-thread ordering needed beyond uniqueness.
- self.event_id_counter - .fetch_add(1, std::sync::atomic::Ordering::SeqCst) + self.event_id_counter + .fetch_add(1, std::sync::atomic::Ordering::Relaxed)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
lib/bindings/python/rust/llm/block_manager.rs
(3 hunks)lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs
(1 hunks)lib/llm/src/block_manager/block/registry.rs
(1 hunks)lib/llm/src/block_manager/config.rs
(1 hunks)lib/llm/src/block_manager/events.rs
(1 hunks)lib/llm/src/block_manager/events/dynamo.rs
(0 hunks)lib/llm/src/block_manager/pool.rs
(1 hunks)lib/llm/src/block_manager/pool/managed.rs
(3 hunks)lib/llm/src/block_manager/state.rs
(1 hunks)lib/llm/src/block_manager/state/resources.rs
(1 hunks)lib/llm/src/kv_router/publisher.rs
(2 hunks)lib/llm/tests/block_manager.rs
(1 hunks)
💤 Files with no reviewable changes (1)
- lib/llm/src/block_manager/events/dynamo.rs
🧰 Additional context used
🧬 Code graph analysis (8)
lib/llm/src/block_manager/config.rs (3)
lib/llm/src/block_manager/events.rs (1)
publisher
(303-305)lib/llm/src/block_manager/pool/managed/state.rs (1)
publisher
(345-347)lib/bindings/python/src/dynamo/_core.pyi (1)
KvEventPublisher
(773-806)
lib/llm/src/block_manager/state/resources.rs (2)
lib/bindings/python/rust/llm/block_manager.rs (3)
kv_event_publisher
(295-301)new
(89-202)new
(264-269)lib/llm/src/block_manager/events.rs (5)
new
(53-57)new
(90-95)new
(129-131)new
(177-182)new
(295-301)
lib/bindings/python/rust/llm/block_manager.rs (2)
lib/bindings/python/src/dynamo/_core.pyi (1)
KvEventPublisher
(773-806)lib/llm/src/block_manager/config.rs (4)
builder
(42-44)builder
(84-86)builder
(131-133)builder
(213-215)
lib/llm/src/block_manager/pool/managed.rs (2)
lib/llm/src/block_manager/state/resources.rs (1)
new
(8-86)lib/llm/src/block_manager/events.rs (5)
new
(53-57)new
(90-95)new
(129-131)new
(177-182)new
(295-301)
lib/llm/src/kv_router/publisher.rs (2)
lib/runtime/src/transports/nats.rs (1)
fmt
(365-374)lib/llm/src/block_manager/block.rs (3)
fmt
(629-642)fmt
(761-774)fmt
(1004-1013)
lib/llm/tests/block_manager.rs (3)
lib/runtime/src/worker.rs (2)
tokio
(147-147)from_current
(198-205)lib/llm/src/block_manager/block/registry.rs (5)
create_sequence
(270-288)new
(56-64)new
(81-123)drop
(68-70)drop
(258-260)lib/llm/src/block_manager/events.rs (4)
new_test
(187-198)drop
(69-73)drop
(114-116)publisher
(303-305)
lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (3)
lib/bindings/python/rust/llm/block_manager.rs (1)
kv_event_publisher
(295-301)lib/bindings/python/rust/lib.rs (1)
component
(706-712)lib/bindings/python/src/dynamo/_core.pyi (1)
KvEventPublisher
(773-806)
lib/llm/src/block_manager/events.rs (3)
lib/llm/src/kv_router/publisher.rs (11)
mpsc
(125-125)mpsc
(1133-1133)mpsc
(1157-1157)publish
(169-171)publish
(772-785)publish
(1088-1099)new
(117-167)new
(714-747)new
(763-770)new
(920-922)new
(1075-1083)lib/llm/src/kv_router/indexer.rs (8)
mpsc
(797-797)mpsc
(798-798)mpsc
(799-799)mpsc
(800-800)mpsc
(1048-1048)mpsc
(1050-1050)mpsc
(1051-1051)new
(193-195)lib/llm/src/block_manager/state/resources.rs (1)
new
(8-86)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: trtllm (arm64)
- GitHub Check: trtllm (amd64)
- GitHub Check: vllm (arm64)
- GitHub Check: vllm (amd64)
- GitHub Check: sglang
- GitHub Check: Build and Test - dynamo
- GitHub Check: clippy (launch/dynamo-run)
- GitHub Check: clippy (.)
- GitHub Check: tests (launch/dynamo-run)
- GitHub Check: tests (lib/bindings/python)
- GitHub Check: clippy (lib/bindings/python)
- GitHub Check: tests (.)
- GitHub Check: tests (lib/runtime/examples)
- GitHub Check: Mirror Repository to GitLab
🔇 Additional comments (17)
lib/llm/src/kv_router/publisher.rs (2)
54-63
: Good: custom Debug with redactionCustom Debug avoids leaking JoinHandle internals. Matches existing style in transports.
103-114
: Deriving Debug for KvEventPublisher is fineAll fields implement Debug; no sensitive data printed.
lib/llm/src/block_manager/block/registry.rs (1)
233-245
: Visibility tweak looks rightpub(crate) narrows exposure while enabling crate-internal constructors. No functional change.
lib/llm/src/block_manager/pool.rs (1)
17-17
: Import update is consistent with new event plumbingSwitching to EventManager keeps surface generic and aligns with refactor.
lib/llm/src/block_manager/state.rs (1)
17-17
: Import aligns with new event manager selectionNo behavioral change here.
lib/llm/tests/block_manager.rs (1)
877-1056
: Tests cover key event emission pathsGood coverage for store/remove and multi-block batching via test-mode DynamoEventManager.
lib/llm/src/block_manager/pool/managed.rs (3)
88-94
: Rename aligns with Option; LGTM
599-606
: Args dissolve update is consistent with optional event_manager
607-613
: Test builder fallback mirrors production path; LGTMlib/bindings/python/rust/llm/block_manager.rs (3)
260-261
: New field for KvEventPublisher wiring looks goodPublic surface addition is reasonable and scoped.
295-301
: Setter API is fine; accepts Option for explicit NoneNo issues.
316-327
: Approve runtime config publisher propagation
VerifiedKvManagerRuntimeConfig
has akv_event_publisher
field and the builder exposes its setter (via derive macro); conditional injection correctly preserves behavior whenNone
.lib/bindings/python/rust/llm/block_manager/vllm/connector/leader.rs (1)
127-176
: Robust optional publisher creation; good error handlingEnvironment defaults + lease-derived worker_id and graceful fallback are sensible.
lib/llm/src/block_manager/events.rs (4)
144-150
: Introduction of DynamoEventManager is clearStruct design is straightforward and object-safe.
175-183
: Constructor with real publisher: LGTM
186-199
: Test constructor is useful for E2E verificationGood testability hook.
255-277
: Remove event emission: LGTMSingle-block RAII removal is appropriate.
impl EventPublisher for DynamoEventManager { | ||
fn publish(&self, handles: Vec<Arc<RegistrationHandle>>) { | ||
use crate::kv_router::protocols::{ | ||
ExternalSequenceBlockHash, KvCacheEvent, KvCacheEventData, KvCacheStoreData, | ||
KvCacheStoredBlockData, LocalBlockHash, | ||
}; | ||
|
||
if handles.is_empty() { | ||
return; | ||
} | ||
|
||
let parent_hash = handles | ||
.first() | ||
.and_then(|h| h.parent_sequence_hash()) | ||
.map(|hash| ExternalSequenceBlockHash(hash)); | ||
|
||
let blocks: Vec<KvCacheStoredBlockData> = handles | ||
.iter() | ||
.map(|handle| KvCacheStoredBlockData { | ||
block_hash: ExternalSequenceBlockHash(handle.sequence_hash()), | ||
tokens_hash: LocalBlockHash(handle.block_hash()), | ||
}) | ||
.collect(); | ||
|
||
let store_data = KvCacheStoreData { | ||
parent_hash, | ||
blocks, | ||
}; | ||
|
||
let event = KvCacheEvent { | ||
event_id: self.next_event_id(), | ||
data: KvCacheEventData::Stored(store_data.clone()), | ||
}; | ||
|
||
// Publish to the indexer | ||
if let Err(e) = self.publisher.publish(event) { | ||
tracing::error!("Failed to publish STORED event to indexer: {}", e); | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Batch publish assumes a single parent; split by parent to avoid malformed events
If handles span different parent hashes, emitting one STORED event is ambiguous. Group by parent_hash and emit one event per group.
Apply this refactor sketch:
- let parent_hash = handles
- .first()
- .and_then(|h| h.parent_sequence_hash())
- .map(|hash| ExternalSequenceBlockHash(hash));
-
- let blocks: Vec<KvCacheStoredBlockData> = handles
- .iter()
- .map(|handle| KvCacheStoredBlockData {
- block_hash: ExternalSequenceBlockHash(handle.sequence_hash()),
- tokens_hash: LocalBlockHash(handle.block_hash()),
- })
- .collect();
-
- let store_data = KvCacheStoreData { parent_hash, blocks };
- let event = KvCacheEvent { event_id: self.next_event_id(), data: KvCacheEventData::Stored(store_data.clone()) };
- if let Err(e) = self.publisher.publish(event) { ... }
+ use std::collections::BTreeMap;
+ let mut groups: BTreeMap<Option<ExternalSequenceBlockHash>, Vec<KvCacheStoredBlockData>> =
+ BTreeMap::new();
+ for handle in handles {
+ let key = handle.parent_sequence_hash().map(ExternalSequenceBlockHash);
+ groups.entry(key).or_default().push(KvCacheStoredBlockData {
+ block_hash: ExternalSequenceBlockHash(handle.sequence_hash()),
+ tokens_hash: LocalBlockHash(handle.block_hash()),
+ });
+ }
+ for (parent_hash, blocks) in groups {
+ let store_data = KvCacheStoreData { parent_hash, blocks };
+ let event = KvCacheEvent {
+ event_id: self.next_event_id(),
+ data: KvCacheEventData::Stored(store_data),
+ };
+ if let Err(e) = self.publisher.publish(event) {
+ tracing::error!("Failed to publish STORED event to indexer: {}", e);
+ }
+ }
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
impl EventPublisher for DynamoEventManager { | |
fn publish(&self, handles: Vec<Arc<RegistrationHandle>>) { | |
use crate::kv_router::protocols::{ | |
ExternalSequenceBlockHash, KvCacheEvent, KvCacheEventData, KvCacheStoreData, | |
KvCacheStoredBlockData, LocalBlockHash, | |
}; | |
if handles.is_empty() { | |
return; | |
} | |
let parent_hash = handles | |
.first() | |
.and_then(|h| h.parent_sequence_hash()) | |
.map(|hash| ExternalSequenceBlockHash(hash)); | |
let blocks: Vec<KvCacheStoredBlockData> = handles | |
.iter() | |
.map(|handle| KvCacheStoredBlockData { | |
block_hash: ExternalSequenceBlockHash(handle.sequence_hash()), | |
tokens_hash: LocalBlockHash(handle.block_hash()), | |
}) | |
.collect(); | |
let store_data = KvCacheStoreData { | |
parent_hash, | |
blocks, | |
}; | |
let event = KvCacheEvent { | |
event_id: self.next_event_id(), | |
data: KvCacheEventData::Stored(store_data.clone()), | |
}; | |
// Publish to the indexer | |
if let Err(e) = self.publisher.publish(event) { | |
tracing::error!("Failed to publish STORED event to indexer: {}", e); | |
} | |
} | |
} | |
impl EventPublisher for DynamoEventManager { | |
fn publish(&self, handles: Vec<Arc<RegistrationHandle>>) { | |
use crate::kv_router::protocols::{ | |
ExternalSequenceBlockHash, KvCacheEvent, KvCacheEventData, KvCacheStoreData, | |
KvCacheStoredBlockData, LocalBlockHash, | |
}; | |
use std::collections::BTreeMap; | |
if handles.is_empty() { | |
return; | |
} | |
let mut groups: BTreeMap<Option<ExternalSequenceBlockHash>, Vec<KvCacheStoredBlockData>> = | |
BTreeMap::new(); | |
for handle in handles { | |
let key = handle.parent_sequence_hash().map(ExternalSequenceBlockHash); | |
groups.entry(key).or_default().push(KvCacheStoredBlockData { | |
block_hash: ExternalSequenceBlockHash(handle.sequence_hash()), | |
tokens_hash: LocalBlockHash(handle.block_hash()), | |
}); | |
} | |
for (parent_hash, blocks) in groups { | |
let store_data = KvCacheStoreData { parent_hash, blocks }; | |
let event = KvCacheEvent { | |
event_id: self.next_event_id(), | |
data: KvCacheEventData::Stored(store_data), | |
}; | |
if let Err(e) = self.publisher.publish(event) { | |
tracing::error!("Failed to publish STORED event to indexer: {}", e); | |
} | |
} | |
} | |
} |
🤖 Prompt for AI Agents
lib/llm/src/block_manager/events.rs lines 214-253: the current publish
implementation assumes all handles share a single parent and emits one STORED
event for all handles, which can produce malformed/ambiguous events when handles
have different parent_sequence_hashes; change the logic to group the handles by
their parent_sequence_hash (treating None as a separate group), then for each
group build a Vec<KvCacheStoredBlockData> from that group's handles, create a
KvCacheStoreData with the group's parent_hash mapped to
ExternalSequenceBlockHash when Some or None otherwise, construct a KvCacheEvent
per group (with event_id from next_event_id and
data::Stored(store_data.clone())) and call self.publisher.publish for each
event, logging errors individually; keep the early return for empty handles.
@krishung5 thanks! Do you mind having a look at the rabbit comments and address them if they make sense |
|
||
/// Event manager that emits KV cache events to the indexer. | ||
pub struct DynamoEventManager { | ||
event_id_counter: std::sync::atomic::AtomicU64, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit. But there are several inline imports like std::sync
crate::kv_router::publisher
etc, can we move them to top?
I think it generally LGTM, maybe outside the scope here, but if possible, can we run some e2e tests with vllm engines using kvbm, and send some requests to it (with some shared system prompt), and see the router outputs in the terminal make sense (and no warnings) |
Overview:
Created a DynamoEventManager to emit kv events from KVBM. Enables for only vllm for now.
Details:
Where should the reviewer start?
Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
Summary by CodeRabbit