Description
Describe the bug
Hi team, I found that the shared storage example no longer works: it consistently fails at the wait-for-cache-server initContainer with an "Unknown format code" error. Could this be related to a bug in lmcache/vllm-openai:latest-nightly? Is there a more stable version tag that can be used instead of the latest image?
Server side:
cache-server's lmcache-server container logs
/opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report t
import pynvml # type: ignore[import]
[2025-10-06 00:48:17,743] LMCache INFO: Initializing cpu-only cache server (__init__.py:14:lmcache.v1.server.storage_backend)
[2025-10-06 00:48:17,743] LMCache INFO: Server started at 0.0.0.0:8080 (__main__.py:138:lmcache.v1.server.__main__)
[2025-10-06 00:48:20,901] LMCache INFO: Connected by ('10.181.98.109', 53200) (__main__.py:142:lmcache.v1.server.__main__)
[2025-10-06 00:48:20,902] LMCache INFO: Client disconnected (__main__.py:134:lmcache.v1.server.__main__)
Client side:
mistral-deployment's wait-for-cache-server initContainer logs
Waiting for LMCache server...
/opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report t
import pynvml # type: ignore[import]
Error during health check: Unknown format code 'x' for object of type 'str'
Waiting for LMCache server...
/opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report t
import pynvml # type: ignore[import]
Error during health check: Unknown format code 'x' for object of type 'str'
To Reproduce
Following tutorial: https://github.com/vllm-project/production-stack/blob/main/tutorials/assets/values-06-shared-storage.yaml
$ helm install vllm-shared-storage vllm/vllm-stack -f tutorials/assets/values-06-shared-storage.yaml
Expected behavior
The tutorial should run without issue. This looks like a blocker for production use cases, since the benefits of LMCache remote cache sharing cannot be used.
Additional context
I confirmed that a connection to the cache server can be made from another pod (the router), so this is not a DNS resolution issue; the health check itself appears to be what is failing.
root@vllm-shared-storage-deployment-router-6ff6794767-z7bdg:/app# curl -v vllm-shared-storage-cache-server-service:81
* Host vllm-shared-storage-cache-server-service:81 was resolved.
* IPv6: (none)
* IPv4: 172.20.211.86
* Trying 172.20.211.86:81...
* Connected to vllm-shared-storage-cache-server-service (172.20.211.86) port 81
* using HTTP/1.x
> GET / HTTP/1.1
> Host: vllm-shared-storage-cache-server-service:81
> User-Agent: curl/8.14.1
> Accept: */*
>
* Request completely sent off
... hang here ...
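For anyone triaging this, here is a minimal sketch of two checks that may help narrow it down. The first reproduces the exact message from the initContainer log: Python raises it when a `str` is formatted with the hex code `x`, so the health check is probably formatting a value that is a string where an integer was expected (where that happens in LMCache is not something I have confirmed). The second is a plain TCP reachability probe, on the assumption that the cache server does not speak HTTP, which would be consistent with curl connecting and then hanging. The function names and the `"1"` placeholder value are hypothetical and not taken from LMCache.

```python
import socket


def reproduce_format_error() -> str:
    """Return the message Python raises when a str is formatted as hex."""
    value = "1"  # hypothetical: a field that arrives as str instead of int
    try:
        return f"{value:x}"
    except ValueError as exc:
        return str(exc)


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only checks that something is listening; it sends no data and
    makes no assumption about the protocol spoken on the socket.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Prints the same text that appears after "Error during health check:".
    print(reproduce_format_error())
```

From a pod in the same namespace, `tcp_reachable("vllm-shared-storage-cache-server-service", 81)` should return `True` if the service is reachable, which matches what the curl output above shows up to the point where it hangs.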