EPP logs show the following error:
[pod/qwen-qwe-60b1c5b9-uct-2507-gaie-epp-6bdf64bf7c-kbs9v/epp] {"level":"error","ts":"2026-04-10T01:31:41Z","caller":"tokenization/pool.go:118","msg":"Dropping tokenization task after max retries","prompt":"","retries":3,"error":"gRPC RenderChatCompletion request failed: rpc error: code = Internal desc = Render failed: \"auto\" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set","stacktrace":"github.com/llm-d/llm-d-kv-cache/pkg/tokenization.(*Pool).workerLoop\n\t/go/pkg/mod/github.com/llm-d/llm-d-kv-cache@v0.7.1/pkg/tokenization/pool.go:118"}
The vLLM serve command includes --enable-auto-tool-choice and --tool-call-parser qwen3_xml

Scenario file: https://gist.github.com/shashwatj07/16cdda06e4e623587dbb046fe30728c2