Skip to content

add vLLM-side LMCache EC connector entrypoint#38668

Draft
benyebai wants to merge 1 commit intovllm-project:mainfrom
benyebai:fix/lmcache-ec-connector-module-clean
Draft

add vLLM-side LMCache EC connector entrypoint#38668
benyebai wants to merge 1 commit intovllm-project:mainfrom
benyebai:fix/lmcache-ec-connector-module-clean

Conversation

@benyebai
Copy link
Copy Markdown

Purpose

This PR adds a vLLM-side LMCache EC connector entrypoint under the EC connector package, instead of requiring connector class ownership in LMCache.
Specifically:

  • adds LMCacheECConnector at:
    • vllm/distributed/ec_transfer/ec_connector/lmcache_connector.py
  • registers it in:
    • vllm/distributed/ec_transfer/ec_connector/factory.py
      Why:
  • align EC connector ownership with vLLM connector architecture (similar to KV connector pattern)
  • keep vLLM as the source of connector entrypoints/interfaces
  • allow runtime config to use a vLLM module path for LMCache EC:
    • vllm.distributed.ec_transfer.ec_connector.lmcache_connector
      No behavior change is intended beyond connector module placement/registration.

Test Plan

  1. Static check (syntax):
    python3 -m py_compile \
      vllm/distributed/ec_transfer/ec_connector/lmcache_connector.py \
      vllm/distributed/ec_transfer/ec_connector/factory.py
  2. Manual startup check with EC config:
    • launch vLLM with:
      • "ec_connector": "LMCacheECConnector"
      • "ec_connector_module_path": "vllm.distributed.ec_transfer.ec_connector.lmcache_connector"
    • verify connector creation logs show LMCacheECConnector is resolved and instantiated.
  3. Functional sanity (manual):
    • run text-only and multimodal requests with EC enabled to confirm no import/connector wiring regressions.
      Test Result
  • ✅ Connector module imports successfully.
  • ✅ Factory registration includes LMCacheECConnector.
  • ✅ Local syntax check passes for both modified files.
  • ✅ Branch contains only the expected connector wiring changes:
    • added lmcache_connector.py
    • updated factory.py

Essential Elements of an Effective PR Description Checklist - [x] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)". - [x] The test plan, such as providing test command. - [x] The test results, such as pasting the results comparison before and after, or e2e results - [ ] (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model. - [ ] (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc (https://docs.google.com/document/d/1YyVqrgX4gHTtrstbq8oWUImOyPCKSGnJ7xtTpmXzlRs/edit?tab=t.0).
BEFORE SUBMITTING, PLEASE READ (anything written below this line will be removed by GitHub Actions)

@mergify mergify bot added the kv-connector label Mar 31, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the LMCacheECConnector to the vLLM distributed EC transfer system, enabling integration with the lmcache package. The implementation includes registering the new connector in the factory and creating a wrapper class that delegates calls to an underlying implementation. Review feedback highlights the need to handle lmcache as an optional dependency to prevent import failures when the package is missing and identifies several missing lifecycle method delegations that are necessary to avoid potential resource leaks or race conditions.

)
from vllm.v1.core.sched.output import SchedulerOutput

from lmcache.integration.vllm.vllm_ec_adapter import LMCacheECConnectorImpl
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The top-level import of lmcache creates a hard dependency on an external package that is not a mandatory dependency of vLLM. If lmcache is not installed, this module will fail to import, which can break module discovery or static analysis tools. It is recommended to handle this optional dependency gracefully.

Suggested change
from lmcache.integration.vllm.vllm_ec_adapter import LMCacheECConnectorImpl
try:
from lmcache.integration.vllm.vllm_ec_adapter import LMCacheECConnectorImpl
except ImportError:
LMCacheECConnectorImpl = None

Comment on lines +23 to +29
def __init__(self, vllm_config: VllmConfig, role: ECConnectorRole):
super().__init__(vllm_config=vllm_config, role=role)
self._impl = LMCacheECConnectorImpl(
vllm_config=vllm_config,
role=role,
parent=self,
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Since lmcache is an optional dependency, you should verify its presence before attempting to instantiate the implementation class. This provides a much clearer error message to the user if the package is missing.

Suggested change
def __init__(self, vllm_config: VllmConfig, role: ECConnectorRole):
super().__init__(vllm_config=vllm_config, role=role)
self._impl = LMCacheECConnectorImpl(
vllm_config=vllm_config,
role=role,
parent=self,
)
def __init__(self, vllm_config: VllmConfig, role: ECConnectorRole):
super().__init__(vllm_config=vllm_config, role=role)
if LMCacheECConnectorImpl is None:
raise ImportError(
"LMCacheECConnector requires the 'lmcache' package. "
"Please install it with `pip install lmcache`.")
self._impl = LMCacheECConnectorImpl(
vllm_config=vllm_config,
role=role,
parent=self,
)

Comment on lines +22 to +53
class LMCacheECConnector(ECConnectorBase):
def __init__(self, vllm_config: VllmConfig, role: ECConnectorRole):
super().__init__(vllm_config=vllm_config, role=role)
self._impl = LMCacheECConnectorImpl(
vllm_config=vllm_config,
role=role,
parent=self,
)

def start_load_caches(
self, encoder_cache: dict[str, torch.Tensor], **kwargs: Any
) -> None:
return self._impl.start_load_caches(encoder_cache, **kwargs)

def save_caches(
self,
encoder_cache: dict[str, torch.Tensor],
mm_hash: str,
**kwargs: Any,
) -> None:
return self._impl.save_caches(encoder_cache, mm_hash, **kwargs)

def has_cache_item(self, identifier: str) -> bool:
return self._impl.has_cache_item(identifier)

def update_state_after_alloc(self, request: "Request", index: int) -> None:
return self._impl.update_state_after_alloc(request, index)

def build_connector_meta(
self, scheduler_output: SchedulerOutput
) -> ECConnectorMetadata:
return self._impl.build_connector_meta(scheduler_output)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The LMCacheECConnector class is missing delegations for several lifecycle methods defined in ECConnectorBase, including register_caches, get_finished, update_connector_output, and request_finished.

Specifically, request_finished is critical for signaling when an asynchronous transfer is complete and the cache can be safely freed. By not overriding it, the connector defaults to the base implementation which returns False, potentially leading to race conditions or resource leaks if LMCache expects to manage the cache lifecycle. Ensure all methods from the base class are properly delegated to self._impl.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant