
bug: lmcache remote server consistently disconnected from client side #723

@cchen777


Describe the bug

Hi team, I found that the shared-storage example no longer seems to work: it consistently fails at the wait-for-cache-server initContainer due to an "unknown format code" error. Could this be related to a bug in lmcache/vllm-openai:latest-nightly? Is there a more stable version tag that can be used instead of the latest image?

Server side:

cache-server's lmcache-server container logs

/opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report t
  import pynvml  # type: ignore[import]
[2025-10-06 00:48:17,743] LMCache INFO: Initializing cpu-only cache server (__init__.py:14:lmcache.v1.server.storage_backend)
[2025-10-06 00:48:17,743] LMCache INFO: Server started at 0.0.0.0:8080 (__main__.py:138:lmcache.v1.server.__main__)
[2025-10-06 00:48:20,901] LMCache INFO: Connected by ('10.181.98.109', 53200) (__main__.py:142:lmcache.v1.server.__main__)
[2025-10-06 00:48:20,902] LMCache INFO: Client disconnected (__main__.py:134:lmcache.v1.server.__main__)

Client side:

mistral-deployment's wait-for-cache-server initContainer logs

 Waiting for LMCache server...
 /opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report t
   import pynvml  # type: ignore[import]
 Error during health check: Unknown format code 'x' for object of type 'str'
 Waiting for LMCache server...
 /opt/venv/lib/python3.12/site-packages/torch/cuda/__init__.py:63: FutureWarning: The pynvml package is deprecated. Please install nvidia-ml-py instead. If you did not install pynvml directly, please report t
   import pynvml  # type: ignore[import]
 Error during health check: Unknown format code 'x' for object of type 'str'
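For context, this Python error is raised when the hex format spec `x` is applied to a string instead of an integer. A minimal, purely illustrative reproduction of the same error class (not the actual LMCache health-check code; `port` here is hypothetical):

```python
# The 'x' (hex) format spec only accepts integers, so formatting a str with it
# raises ValueError: Unknown format code 'x' for object of type 'str'.
port = "8080"  # hypothetical: e.g. a port read from an env var as a string
try:
    print(f"checking port {port:x}")
except ValueError as e:
    print(f"Error during health check: {e}")
```

This suggests the health check in the nightly image may be formatting a string (e.g. a host/port value) with an integer format spec.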

To Reproduce

Following the tutorial, using the values file: https://github.com/vllm-project/production-stack/blob/main/tutorials/assets/values-06-shared-storage.yaml

$ helm install vllm-shared-storage vllm/vllm-stack -f tutorials/assets/values-06-shared-storage.yaml

Expected behavior

Able to run the tutorial without issue. This is a blocker for production use cases that want to leverage the benefits of LMCache remote cache sharing.

Additional context

Confirmed that a connection can be made to the cache server from another pod, so this is not a DNS resolution issue; it is the health check itself that is failing:

root@vllm-shared-storage-deployment-router-6ff6794767-z7bdg:/app# curl -v vllm-shared-storage-cache-server-service:81
* Host vllm-shared-storage-cache-server-service:81 was resolved.
* IPv6: (none)
* IPv4: 172.20.211.86
*   Trying 172.20.211.86:81...
* Connected to vllm-shared-storage-cache-server-service (172.20.211.86) port 81
* using HTTP/1.x
> GET / HTTP/1.1
> Host: vllm-shared-storage-cache-server-service:81
> User-Agent: curl/8.14.1
> Accept: */*
>
* Request completely sent off
... hang here ...
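As a temporary workaround while the image's health check is broken, the initContainer check could be replaced with a plain TCP connect that sidesteps any string formatting. A minimal sketch, assuming only that the cache server accepts TCP connections on the configured host/port; CACHE_HOST and CACHE_PORT are hypothetical environment variable names, not ones defined by the chart:

```python
import os
import socket
import sys
import time

# Hypothetical env var names; substitute whatever the chart actually injects.
host = os.environ.get("CACHE_HOST", "vllm-shared-storage-cache-server-service")
port = int(os.environ.get("CACHE_PORT", "81"))

while True:
    print("Waiting for LMCache server...")
    try:
        # A bare TCP connect is enough to tell whether the server is listening,
        # and avoids any format-spec handling of host/port values.
        with socket.create_connection((host, port), timeout=5):
            print(f"LMCache server reachable at {host}:{port}")
            sys.exit(0)
    except OSError as e:
        print(f"Error during health check: {e}")
    time.sleep(5)
```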
