add vLLM-side LMCache EC connector entrypoint #38668
benyebai wants to merge 1 commit into vllm-project:main
Conversation
Code Review
This pull request introduces the LMCacheECConnector to the vLLM distributed EC transfer system, enabling integration with the lmcache package. The implementation includes registering the new connector in the factory and creating a wrapper class that delegates calls to an underlying implementation. Review feedback highlights the need to handle lmcache as an optional dependency to prevent import failures when the package is missing and identifies several missing lifecycle method delegations that are necessary to avoid potential resource leaks or race conditions.
```python
)
from vllm.v1.core.sched.output import SchedulerOutput

from lmcache.integration.vllm.vllm_ec_adapter import LMCacheECConnectorImpl
```
The top-level import of lmcache creates a hard dependency on an external package that is not a mandatory dependency of vLLM. If lmcache is not installed, this module will fail to import, which can break module discovery or static analysis tools. It is recommended to handle this optional dependency gracefully.
Suggested change:

```python
try:
    from lmcache.integration.vllm.vllm_ec_adapter import LMCacheECConnectorImpl
except ImportError:
    LMCacheECConnectorImpl = None
```
```python
    def __init__(self, vllm_config: VllmConfig, role: ECConnectorRole):
        super().__init__(vllm_config=vllm_config, role=role)
        self._impl = LMCacheECConnectorImpl(
            vllm_config=vllm_config,
            role=role,
            parent=self,
        )
```
Since lmcache is an optional dependency, you should verify its presence before attempting to instantiate the implementation class. This provides a much clearer error message to the user if the package is missing.
Suggested change:

```python
    def __init__(self, vllm_config: VllmConfig, role: ECConnectorRole):
        super().__init__(vllm_config=vllm_config, role=role)
        if LMCacheECConnectorImpl is None:
            raise ImportError(
                "LMCacheECConnector requires the 'lmcache' package. "
                "Please install it with `pip install lmcache`.")
        self._impl = LMCacheECConnectorImpl(
            vllm_config=vllm_config,
            role=role,
            parent=self,
        )
```
```python
class LMCacheECConnector(ECConnectorBase):
    def __init__(self, vllm_config: VllmConfig, role: ECConnectorRole):
        super().__init__(vllm_config=vllm_config, role=role)
        self._impl = LMCacheECConnectorImpl(
            vllm_config=vllm_config,
            role=role,
            parent=self,
        )

    def start_load_caches(
        self, encoder_cache: dict[str, torch.Tensor], **kwargs: Any
    ) -> None:
        return self._impl.start_load_caches(encoder_cache, **kwargs)

    def save_caches(
        self,
        encoder_cache: dict[str, torch.Tensor],
        mm_hash: str,
        **kwargs: Any,
    ) -> None:
        return self._impl.save_caches(encoder_cache, mm_hash, **kwargs)

    def has_cache_item(self, identifier: str) -> bool:
        return self._impl.has_cache_item(identifier)

    def update_state_after_alloc(self, request: "Request", index: int) -> None:
        return self._impl.update_state_after_alloc(request, index)

    def build_connector_meta(
        self, scheduler_output: SchedulerOutput
    ) -> ECConnectorMetadata:
        return self._impl.build_connector_meta(scheduler_output)
```
The LMCacheECConnector class is missing delegations for several lifecycle methods defined in ECConnectorBase, including register_caches, get_finished, update_connector_output, and request_finished.
Specifically, request_finished is critical for signaling when an asynchronous transfer is complete and the cache can be safely freed. By not overriding it, the connector falls back to the base implementation, which returns False, potentially leading to race conditions or resource leaks if LMCache expects to manage the cache lifecycle. Ensure all methods from the base class are properly delegated to self._impl.
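The failure mode described above can be illustrated with a self-contained sketch. The class names, stub behavior, and signatures here are assumptions for illustration only, not the real ECConnectorBase or LMCacheECConnectorImpl API: without an explicit override, method resolution picks the base-class default and the wrapped implementation is never consulted.

```python
# Hypothetical sketch of the missing delegation. Method names come from the
# review comment; the signatures are assumptions, not the actual vLLM API.
class _BaseConnector:
    def request_finished(self, request, index):
        # Base default: report the transfer as not finished, so the
        # wrapped implementation never gets a say in cache lifecycle.
        return False

class _LMCacheImpl:
    """Stand-in for LMCacheECConnectorImpl."""
    def request_finished(self, request, index):
        # The real implementation decides when the async transfer is done.
        return True

class LMCacheECConnectorSketch(_BaseConnector):
    def __init__(self, impl):
        self._impl = impl

    def request_finished(self, request, index):
        # Explicit delegation: without this override, the base default
        # (False) would win and the cache entry could leak.
        return self._impl.request_finished(request, index)

conn = LMCacheECConnectorSketch(_LMCacheImpl())
print(conn.request_finished("req-1", 0))  # → True
```

Deleting the `request_finished` override in the sketch flips the result to `False`, which is exactly the silent-default behavior the review warns about.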
Purpose
This PR adds a vLLM-side LMCache EC connector entrypoint under the EC connector package, instead of requiring connector class ownership in LMCache.
Specifically:
- Adds `LMCacheECConnector` at: `vllm/distributed/ec_transfer/ec_connector/lmcache_connector.py`
- Registers it in: `vllm/distributed/ec_transfer/ec_connector/factory.py`

Why:
- The connector lives at a stable vLLM-side import path, `vllm.distributed.ec_transfer.ec_connector.lmcache_connector`, rather than requiring the LMCache package to own the connector class.

No behavior change is intended beyond connector module placement/registration.
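The registration step typically maps a connector name to its dotted import path so the module is only imported on demand. A minimal sketch of that pattern follows; the function and variable names here are assumptions for illustration, not the actual vLLM factory API.

```python
# Hypothetical sketch of a string-based connector registry. Recording the
# dotted import path instead of the class keeps optional dependencies like
# lmcache unloaded until the connector is actually requested.
_CONNECTOR_REGISTRY: dict[str, str] = {}

def register_connector(name: str, module_path: str, class_name: str) -> None:
    # Store "<module>.<class>" for later lazy import via importlib.
    _CONNECTOR_REGISTRY[name] = f"{module_path}.{class_name}"

register_connector(
    "LMCacheECConnector",
    "vllm.distributed.ec_transfer.ec_connector.lmcache_connector",
    "LMCacheECConnector",
)
print(_CONNECTOR_REGISTRY["LMCacheECConnector"])
```

With this shape, instantiating a connector resolves the stored path at call time, so an `ImportError` for a missing optional package surfaces only when that specific connector is selected.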
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
- [x] The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
- [x] The test plan, such as providing test command.
- [x] The test results, such as pasting the results comparison before and after, or e2e results
- [ ] (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
- [ ] (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc (https://docs.google.com/document/d/1YyVqrgX4gHTtrstbq8oWUImOyPCKSGnJ7xtTpmXzlRs/edit?tab=t.0).