Description
Your current environment
TPU Info:
TPU info: node_name=v6e-test | tpu_type=v6e-4 | worker_id=0 | num_chips=4 | num_cores_per_chip=1
Command to start vLLM server:
docker run -it --rm \
  --net=host \
  --privileged \
  -v /dev:/dev \
  -v $HOME/.cache/huggingface:/root/.cache/huggingface \
  vllm/vllm-tpu:nightly-20260303-1f005dc-4034c3d \
  vllm serve Qwen/Qwen3-VL-8B-Instruct \
  --tensor-parallel-size 4 \
  --dtype bfloat16 \
  --max-model-len 22528 \
  --max-num-seqs 16 \
  --host 0.0.0.0 \
  --port 7000 \
  --trust-remote-code \
  --enable-log-requests
🐛 Describe the bug
Issue:
(APIServer pid=1) INFO 03-04 01:49:51 [async_llm.py:421] Added request chatcmpl-b61864175ca7091d-a87942ae.
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] EngineCore encountered a fatal error.
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] Traceback (most recent call last):
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] File "/workspace/vllm/vllm/v1/engine/core.py", line 1093, in run_engine_core
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] engine_core.run_busy_loop()
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] File "/workspace/vllm/vllm/v1/engine/core.py", line 1128, in run_busy_loop
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] self._process_engine_step()
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] File "/workspace/vllm/vllm/v1/engine/core.py", line 1165, in _process_engine_step
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] File "/workspace/vllm/vllm/v1/engine/core.py", line 507, in step_with_batch_queue
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] engine_core_outputs = self.scheduler.update_from_output(
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] File "/workspace/vllm/vllm/v1/core/sched/scheduler.py", line 1316, in update_from_output
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] req_index = model_runner_output.req_id_to_index[req_id]
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^
(EngineCore_DP0 pid=564) ERROR 03-04 01:49:51 [core.py:1102] KeyError: 'chatcmpl-b61864175ca7091d-a87942ae'
(EngineCore_DP0 pid=564) Process EngineCore_DP0:
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] AsyncLLM output_handler failed.
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] Traceback (most recent call last):
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] File "/workspace/vllm/vllm/v1/engine/async_llm.py", line 664, in output_handler
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] outputs = await engine_core.get_output_async()
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] File "/workspace/vllm/vllm/v1/engine/core_client.py", line 1004, in get_output_async
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] raise self._format_exception(outputs) from None
(APIServer pid=1) ERROR 03-04 01:49:51 [async_llm.py:708] vllm.v1.engine.exceptions.EngineDeadError: EngineCore encountered an issue. See stack trace (above) for the root cause.
(EngineCore_DP0 pid=564) Traceback (most recent call last):
(APIServer pid=1) INFO 03-04 01:49:51 [async_llm.py:605] Request chatcmpl-b61864175ca7091d failed (engine dead).
(EngineCore_DP0 pid=564) File "/usr/local/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=564) self.run()
(EngineCore_DP0 pid=564) File "/usr/local/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=564) self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=564) File "/workspace/vllm/vllm/v1/engine/core.py", line 1104, in run_engine_core
(EngineCore_DP0 pid=564) raise e
(EngineCore_DP0 pid=564) File "/workspace/vllm/vllm/v1/engine/core.py", line 1093, in run_engine_core
(EngineCore_DP0 pid=564) engine_core.run_busy_loop()
(EngineCore_DP0 pid=564) File "/workspace/vllm/vllm/v1/engine/core.py", line 1128, in run_busy_loop
(EngineCore_DP0 pid=564) self._process_engine_step()
(EngineCore_DP0 pid=564) File "/workspace/vllm/vllm/v1/engine/core.py", line 1165, in _process_engine_step
(EngineCore_DP0 pid=564) outputs, model_executed = self.step_fn()
(EngineCore_DP0 pid=564) ^^^^^^^^^^^^^^
(EngineCore_DP0 pid=564) File "/workspace/vllm/vllm/v1/engine/core.py", line 507, in step_with_batch_queue
(EngineCore_DP0 pid=564) engine_core_outputs = self.scheduler.update_from_output(
(EngineCore_DP0 pid=564) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=564) File "/workspace/vllm/vllm/v1/core/sched/scheduler.py", line 1316, in update_from_output
(EngineCore_DP0 pid=564) req_index = model_runner_output.req_id_to_index[req_id]
(EngineCore_DP0 pid=564) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^
(EngineCore_DP0 pid=564) KeyError: 'chatcmpl-b61864175ca7091d-a87942ae'
(APIServer pid=1) INFO: 2409:40f4:40d6:c21e:acdc:1d47:3335:e212:0 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
(APIServer pid=1) INFO: 2409:40f4:40d6:c21e:acdc:1d47:3335:e212:0 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
(APIServer pid=1) INFO: 2409:40f4:40d6:c21e:acdc:1d47:3335:e212:0 - "POST /v1/chat/completions HTTP/1.1" 500 Internal Server Error
(APIServer pid=1) INFO: Shutting down
(APIServer pid=1) INFO: Waiting for application shutdown.
(APIServer pid=1) INFO: Application shutdown complete.
(APIServer pid=1) INFO: Finished server process [1]
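For anyone triaging the trace above: the statement that fails in update_from_output is a plain dictionary lookup, req_id_to_index[req_id]. The sketch below only illustrates that failure mode. It assumes the scheduler treated the request as scheduled while the runner output did not contain it; the log does not establish whether that is the actual cause on TPU. FakeRunnerOutput and this update_from_output are hypothetical stand-ins, not vLLM's real classes, and only the attribute name req_id_to_index is taken from the traceback.

```python
# Hypothetical stand-ins, NOT vLLM code. Only the attribute name
# `req_id_to_index` comes from the traceback in this report.
from dataclasses import dataclass, field


@dataclass
class FakeRunnerOutput:
    # Maps each request ID the runner reported to its row in the batch.
    req_id_to_index: dict[str, int] = field(default_factory=dict)


def update_from_output(scheduled_req_ids: list[str],
                       output: FakeRunnerOutput) -> list[int]:
    # Same shape as the failing line: an unguarded lookup per scheduled request.
    return [output.req_id_to_index[req_id] for req_id in scheduled_req_ids]


# Assumption: the request was scheduled, but the runner output omits it.
scheduled = ["chatcmpl-b61864175ca7091d-a87942ae"]
output = FakeRunnerOutput(req_id_to_index={})

missing_id = None
try:
    update_from_output(scheduled, output)
except KeyError as exc:
    # The KeyError carries the missing request ID, as in the log.
    missing_id = exc.args[0]
    print(f"KeyError: {missing_id!r}")
```

If this reading is right, the useful question for maintainers is why the request ID added at 01:49:51 was absent from the runner output for that step, rather than the lookup itself.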
Before submitting a new issue...
- Make sure you already searched for relevant issues and checked the documentation page, which can answer lots of frequently asked questions.