@ishandhanani ishandhanani commented Nov 1, 2025

Motivation

When I run the vLLM benchmark script (benchmark_serving.py) against Dynamo ToT, I get:

File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/sgl-workspace/dynamo/components/backends/sglang/slurm_jobs/scripts/vllm/benchmark_serving.py", line 645, in benchmark
    raise ValueError(
ValueError: Initial test run failed - Please make sure benchmark arguments are correctly specified. Error: Traceback (most recent call last):
  File "/sgl-workspace/dynamo/components/backends/sglang/slurm_jobs/scripts/vllm/backend_request_func.py", line 386, in async_request_dynamo_completions
    data = json.loads(chunk)
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 338, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/json/decoder.py", line 356, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

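After a bit of digging, I realized this happens because of an llm_metrics event that we emit. The stream contains an "event: llm_metrics" line, which is not a "data:" payload, so json.loads fails on it: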

...
chunk_bytes=b'data: {"id":"cmpl-2c202c64-6711-43b0-9700-a979896c7484","choices":[],"created":1761605383,"model":"deepseek-ai/DeepSeek-R1","system_fingerprint":null,"object":"text_completion","usage":{"prompt_tokens":991,"completion_tokens":900,"total_tokens":1891}}'
chunk_bytes=b'event: llm_metrics'
Traceback (most recent call last):
  File "/sgl-workspace/dynamo/components/backends/sglang/slurm_jobs/scripts/vllm/benchmark_serving.py", line 1426, in <module>
    main(args)
  File "/sgl-workspace/dynamo/components/backends/sglang/slurm_jobs/scripts/vllm/benchmark_serving.py", line 1041, in main
    benchmark_result = asyncio.run(
                       ^^^^^^^^^^^^
  File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^

This PR parses that event line out properly instead of passing it to json.loads, which fixes the error.
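For illustration, here is a minimal sketch of the kind of line handling that avoids the failure. It is not the actual diff in this PR: the helper name `parse_sse_line` is hypothetical, the chunk `id` is shortened, and treating `data: [DONE]` as an end-of-stream sentinel is an assumption based on OpenAI-style streaming, not something shown in the logs above.

```python
import json
from typing import Optional


def parse_sse_line(chunk_bytes: bytes) -> Optional[dict]:
    """Return the JSON payload of an SSE `data:` line, or None for any other line.

    Hypothetical helper, not the function changed in this PR.
    """
    line = chunk_bytes.decode("utf-8").strip()
    if not line.startswith("data:"):
        # Covers "event: llm_metrics", SSE comments, and blank keep-alive lines.
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":
        # Assumed OpenAI-style end-of-stream sentinel; not JSON.
        return None
    return json.loads(payload)


# The two lines from the debug output above (id shortened for readability).
chunks = [
    b'data: {"id":"cmpl-1","choices":[],"usage":'
    b'{"prompt_tokens":991,"completion_tokens":900,"total_tokens":1891}}',
    b"event: llm_metrics",
]
parsed = [parse_sse_line(c) for c in chunks]
```

With this, the usage chunk decodes to a dict and the `event: llm_metrics` line yields `None` rather than raising `JSONDecodeError`. One caveat: any `data:` line that follows the event line would still be decoded as ordinary JSON, so a fuller parser would likely need to remember the current event name and route or drop that payload accordingly.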

@IzzyPutterman
Collaborator

Looks good to me, I'll let Kedar merge
