Description
Summary
I've been working on a project involving LMCache tuning, and we have been using this benchmark as part of our evaluation suite.
The metric we are focusing on is time to first token (TTFT), and many of the benchmark runs report 0 for it, which doesn't seem correct.
Additionally, we frequently see the average generation throughput reported as inf.
Is this working as designed? Could you offer some guidance on how to interpret these results?
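For context, here is a minimal sketch of how a zero TTFT could cascade into an inf average, assuming the summary computes per-request throughput as output tokens divided by the post-first-token window (the function name and record values below are hypothetical, not taken from multi-round-qa.py):

```python
def per_request_throughput(ttft_s, total_time_s, output_tokens):
    """Tokens/s over the generation window (after the first token).

    Hypothetical reconstruction of the summary math: if TTFT is logged
    as 0 and the request timestamps collapse, the window is 0 and the
    result is inf -- which then makes the average inf as well.
    """
    window = total_time_s - ttft_s
    return output_tokens / window if window > 0 else float("inf")

samples = [
    per_request_throughput(0.0, 0.0, 128),  # degenerate record -> inf
    per_request_throughput(0.5, 2.5, 100),  # healthy record -> 50.0
]
average = sum(samples) / len(samples)
print(samples, average)  # [inf, 50.0] inf
```

If this is roughly how the summary is computed, a single request with an unrecorded first-token timestamp would be enough to poison the averages we are seeing.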
Details
Command invoked was ./run_benchmarks.sh "meta-llama/Llama-3.1-70B-Instruct" http://khanhtest-vllm-fs-lmcache.ibm-cas-red-stack.svc.cluster.local:8000 /tmp/benchmark/lmcache_off all 1.34 2.0 3.0
/tmp/benchmark/lmcache_off_long_input_output_1.34.csv
[2025-06-16 11:44:01,006] WARNING: Processing the existing summary file /tmp/benchmark/lmcache_off_long_input_output_1.34.csv, ignoring all the other arguments (multi-round-qa.py:722:__main__)
[2025-06-16 11:44:01,008] INFO: Calculating performance summary (multi-round-qa.py:568:__main__)
==================== Performance summary ======================
QPS: 0.0000 reqs/s
Processing speed: 1.3941 reqs/s
Requests on-the-fly: 0
Input tokens per second: 29451.9382 tokens/s
Output tokens per second: 1.3941 tokens/s
Average generation throughput (per request): inf tokens/req/s
^^^^^^^^ inf??
Average TTFT: 0.0000s
^^^^^^^^^~ This takes no time at all?
Time range: 1750097945.6405346 - 1750098046.7831142 (101.14s)
===============================================================
lmcache_off_long_input_output_1.34.csv
The benchmark was run at commit 95b2939136ff003e2e4c67277ec82bf43ff8be34 on main.