
Commit 3bbe2d1

[Bugfix] Fix memory leak and broken metrics in UCMBlendConnector
Two functional bugs in the blend connector:

1. Memory leak: requests_blend_meta is never cleaned up for finished requests. build_connector_meta pops from self.requests_meta but not from self.requests_blend_meta, causing unbounded growth proportional to the total number of completed requests. Each entry holds BlendRequestMeta with block hashes and chunk metadata, so this is not trivial in long-running serving scenarios.

2. Broken metrics: self.monitor.update_stats() references an undefined attribute. The parent class UCMDirectConnector uses the module-level ucmmetrics.update_stats() API, but the blend connector calls self.monitor, which does not exist, causing AttributeError on any blend request with metrics enabled.

Signed-off-by: supermario_leo <leo.stack@outlook.com>
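The metrics bug is easy to miss because Python resolves attributes only when the line executes, so the failure stays hidden while metrics are disabled. A minimal sketch of that behavior, assuming a hypothetical `ToyConnector` class that stands in for the real connector (it is not the project's code, and its `lookup` method is invented for illustration):

```python
class ToyConnector:
    """Hypothetical stand-in for the blend connector; not the real class."""

    def __init__(self, metrics_enabled: bool) -> None:
        self.metrics_config = metrics_enabled
        # Note: self.monitor is never assigned, mirroring the reported bug.

    def lookup(self, hit_blocks: int, total_blocks: int) -> float:
        rate = hit_blocks / total_blocks
        if self.metrics_config:
            # The attribute is looked up only when this branch runs.
            self.monitor.update_stats({"interval_lookup_hit_rates": rate})
        return rate


# Metrics disabled: the broken line is never reached.
print(ToyConnector(metrics_enabled=False).lookup(3, 4))

# Metrics enabled: the missing attribute raises at call time.
try:
    ToyConnector(metrics_enabled=True).lookup(3, 4)
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```

This is why the bug would likely pass any test run that leaves metrics off and only appear in deployments that turn them on.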
1 parent f531458 commit 3bbe2d1

1 file changed

Lines changed: 3 additions & 2 deletions

ucm/integration/vllm/blend_connector.py

@@ -18,6 +18,7 @@
     UCMDirectConnector,
 )
 from ucm.logger import init_logger
+from ucm.shared.metrics import ucmmetrics
 from ucm.sparse.blend.blockwise_rope import block_wise_rope_forward
 
 if TYPE_CHECKING:
@@ -381,8 +382,7 @@ def get_num_new_matched_tokens(
             f"chunks cache total hit: {chunk_hit_blocks}, "
         )
         if self.metrics_config:
-            self.monitor.update_stats(
-                "ConnStats",
+            ucmmetrics.update_stats(
                 {"interval_lookup_hit_rates": chunk_hit_blocks / max_blk_num},
             )

@@ -462,6 +462,7 @@ def build_connector_meta(
         # clear finished request
         for request_id in scheduler_output.finished_req_ids:
             self.requests_meta.pop(request_id, None)
+            self.requests_blend_meta.pop(request_id, None)
 
         return UCMBlendConnectorMetadata(requests_dispatch_meta)
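The effect of the added `pop` in `build_connector_meta` can be illustrated with a toy model. The sketch below assumes a hypothetical `ToyScheduler` with two per-request dicts named after the ones in the diff; the `clean_blend_meta` flag and the `run` helper are invented here to compare the behavior before and after the fix, and are not part of the project.

```python
class ToyScheduler:
    """Hypothetical stand-in that tracks two per-request dicts."""

    def __init__(self, clean_blend_meta: bool) -> None:
        self.requests_meta: dict[str, object] = {}
        self.requests_blend_meta: dict[str, object] = {}
        self.clean_blend_meta = clean_blend_meta

    def add_request(self, request_id: str) -> None:
        self.requests_meta[request_id] = object()
        self.requests_blend_meta[request_id] = object()

    def finish(self, finished_req_ids: list[str]) -> None:
        for request_id in finished_req_ids:
            self.requests_meta.pop(request_id, None)
            if self.clean_blend_meta:
                # The one-line fix: also drop the blend metadata.
                self.requests_blend_meta.pop(request_id, None)


def run(clean_blend_meta: bool, n: int = 1000) -> int:
    """Serve n requests one at a time; return leftover blend entries."""
    sched = ToyScheduler(clean_blend_meta)
    for i in range(n):
        request_id = f"req-{i}"
        sched.add_request(request_id)
        sched.finish([request_id])
    return len(sched.requests_blend_meta)


print(run(clean_blend_meta=False))  # 1000: one stale entry per finished request
print(run(clean_blend_meta=True))   # 0: entries are released on completion
```

Using `pop(request_id, None)` rather than `del` keeps the cleanup safe for request IDs that never had blend metadata recorded.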
