[Bugfix] Fix memory leak and broken metrics in UCMBlendConnector #904

Open
SuperMarioYL wants to merge 1 commit into ModelEngine-Group:develop from SuperMarioYL:fix/blend-connector-memory-leak-and-metrics
@SuperMarioYL
Contributor

Purpose

Fix two functional bugs in UCMBlendConnector that affect production serving reliability.

Modifications

Bug 1: requests_blend_meta memory leak

build_connector_meta cleans up self.requests_meta for finished requests (line 466) but never cleans up self.requests_blend_meta. Since get_num_new_matched_tokens adds an entry to requests_blend_meta for every request (line 399), this dict grows unboundedly for the lifetime of the process.

Each BlendRequestMeta holds block hashes, ChunkMetaData objects, and stage info, so in long-running serving scenarios the memory impact is significant.

Fix: Add self.requests_blend_meta.pop(request_id, None) alongside the existing cleanup.
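The effect of the missing cleanup can be illustrated with a minimal sketch. This is not the connector's actual code: `ConnectorSketch`, `BlendRequestMetaSketch`, `track`, `cleanup_finished`, and `leaked_entries` are hypothetical stand-ins, and only the two dict names and the `pop(request_id, None)` call come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class BlendRequestMetaSketch:
    """Hypothetical stand-in for BlendRequestMeta; the field is illustrative."""
    block_hashes: list = field(default_factory=list)


class ConnectorSketch:
    """Hypothetical stand-in for the connector's per-request bookkeeping."""

    def __init__(self):
        self.requests_meta = {}
        self.requests_blend_meta = {}

    def track(self, request_id):
        # Assumption: every request gets an entry in both dicts.
        self.requests_meta[request_id] = object()
        self.requests_blend_meta[request_id] = BlendRequestMetaSketch()

    def cleanup_finished(self, finished_request_ids, fixed=True):
        for request_id in finished_request_ids:
            # Existing cleanup: already present before the fix.
            self.requests_meta.pop(request_id, None)
            if fixed:
                # The added line: drop the blend metadata as well.
                self.requests_blend_meta.pop(request_id, None)


def leaked_entries(fixed, num_requests=1000):
    """Return how many blend entries remain after all requests finish."""
    conn = ConnectorSketch()
    ids = [f"req-{i}" for i in range(num_requests)]
    for rid in ids:
        conn.track(rid)
    conn.cleanup_finished(ids, fixed=fixed)
    return len(conn.requests_blend_meta)
```

In this sketch, `leaked_entries(fixed=False)` retains one entry per completed request, while `leaked_entries(fixed=True)` returns the dict to empty. Using `pop(request_id, None)` rather than `del` keeps the cleanup safe for a request that never received a blend entry.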

Bug 2: self.monitor AttributeError crashes metrics path

When self.metrics_config is enabled, the blend connector calls self.monitor.update_stats("ConnStats", {...}) (line 386). However, self.monitor is never defined; the parent class UCMDirectConnector uses the module-level ucmmetrics.update_stats({...}) API instead.

This means any blend request with metrics enabled raises AttributeError: 'UCMBlendConnector' object has no attribute 'monitor'.

Fix: Replace self.monitor.update_stats(...) with ucmmetrics.update_stats(...) to match the parent class pattern.

Test

  • black --check and isort --check --profile=black pass on the modified file.
  • Both fixes are mechanical corrections to match the parent class behavior.

@SuperMarioYL
Contributor Author

The CI failure in lint-and-unit-tests / cpp_gtest is in the UCLoggerPerfTest.MultiThreadPerfUCInfoVsRateLimit performance test, which appears to be a pre-existing flaky test unrelated to this PR's changes (this PR only modifies Python files in ucm/integration/vllm/blend_connector.py). Could a maintainer trigger a CI re-run? Happy to make any code adjustments if needed.

Two functional bugs in the blend connector:

1. Memory leak: requests_blend_meta is never cleaned up for finished
   requests. build_connector_meta pops from self.requests_meta but
   not from self.requests_blend_meta, causing unbounded growth
   proportional to the total number of completed requests. Each entry
   holds BlendRequestMeta with block hashes and chunk metadata, so
   this is not trivial in long-running serving scenarios.

2. Broken metrics: self.monitor.update_stats() references an undefined
   attribute. The parent class UCMDirectConnector uses the module-level
   ucmmetrics.update_stats() API, but the blend connector calls
   self.monitor which does not exist, causing AttributeError on any
   blend request with metrics enabled.

Signed-off-by: supermario_leo <leo.stack@outlook.com>
@SuperMarioYL SuperMarioYL force-pushed the fix/blend-connector-memory-leak-and-metrics branch from 3bbe2d1 to 91aa440 Compare April 24, 2026 02:02