[Feature] Opt metrics structure #891

Open
LJH-LBJ wants to merge 138 commits into vllm-project:main from LJH-LBJ:opt_metrics_structure

Contributor

@LJH-LBJ LJH-LBJ commented Jan 22, 2026

Purpose

Resolves: #533
Make the metrics clearer and optimize the metrics output format.

design doc:
https://docs.google.com/document/d/1St1tHMyp1kPwbYzHUFJYQHBGoWcQJA_dcGb9pemUZGI/edit?tab=t.0

Test Plan

Test 1
Omni online inference

vllm serve /workspace/models/Qwen3-Omni-30B-A3B-Instruct --omni --port 8014 --log-stats
python openai_chat_completion_client_for_multimodal_generation.py \
  --query-type use_video \
  --video-path t2v_out_1.mp4 \
  --model /workspace/models/Qwen3-Omni-30B-A3B-Instruct \
  --prompt "What are the main activities shown in this video?" 

Test 2
Omni offline inference
Note: --log-stats must be added to the command in run_multiple_prompts.sh.

python end2end.py --output-wav output_audio \
                  --query-type text \
                  --txt-prompts text_prompts_10.txt \
                  --py-generator \
                  --log-stats
cd examples/offline_inference/qwen3_omni
bash run_multiple_prompts.sh

Test 3

vllm serve /workspace/models/Qwen-Image --omni --port 8014 --log-stats

curl -s http://localhost:8014/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "a cup of coffee on the table"}
    ],
    "extra_body": {
      "height": 1024,
      "width": 1024,
      "num_inference_steps": 50,
      "guidance_scale": 4.0,
      "seed": 42
    }
  }' \
  | jq -r '.choices[0].message.content[0].image_url.url' \
  | cut -d',' -f2 | base64 -d > coffee.png

Test Result

Test result 1

(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] [Overall Summary]
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] +-----------------------------+------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | Field                       |      Value |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] +-----------------------------+------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_requests                |          1 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_wall_time_ms            | 40,828.324 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_total_tokens            |      5,105 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_avg_time_per_request_ms | 40,828.324 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_avg_tokens_per_s        |    125.036 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_stage_0_wall_time_ms    | 10,659.139 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_stage_1_wall_time_ms    | 24,827.949 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] | e2e_stage_2_wall_time_ms    |    625.227 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:439] +-----------------------------+------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] 
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] [RequestE2EStats [request_id=chatcmpl-bd653d4b6bcdc00e]]
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] +-------------------------+-------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] | Field                   |       Value |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] +-------------------------+-------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] | e2e_total_ms            |  40,827.682 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] | e2e_total_tokens        |       5,105 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] | transfers_total_kbytes  | 137,606.358 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] | transfers_total_time_ms |     349.074 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:465] +-------------------------+-------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] 
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] [StageRequestStats [request_id=chatcmpl-bd653d4b6bcdc00e]]
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] +------------------------+-----------+-----------+---------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | Field                  |         0 |         1 |       2 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] +------------------------+-----------+-----------+---------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | audio_generated_frames |         0 |         0 | 362,325 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | batch_id               |        53 |       189 |       0 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | batch_size             |         1 |         1 |       1 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | num_tokens_in          |     4,860 |     4,826 |   3,024 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | num_tokens_out         |        55 |       190 |       0 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | postprocess_time_ms    | 4,523.629 |     0.533 |   0.000 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] | stage_gen_time_ms      |   120.209 | 1,010.551 | 582.322 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:518] +------------------------+-----------+-----------+---------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] 
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] [TransferEdgeStats [request_id=chatcmpl-bd653d4b6bcdc00e]]
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] +-------------------+-------------+------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] | Field             |        0->1 |       1->2 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] +-------------------+-------------+------------+
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] | in_flight_time_ms |       2.096 |      2.588 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] | rx_decode_time_ms |     125.193 |     30.728 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] | size_kbytes       | 108,797.315 | 28,809.043 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] | tx_time_ms        |     158.411 |     30.057 |
(APIServer pid=656024) INFO 02-06 15:01:49 [stats.py:558] +-------------------+-------------+------------+
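
The derived fields in the Test 1 tables above can be cross-checked by hand. The sketch below is inferred from the logged numbers, not taken from the PR code. It assumes that e2e_avg_tokens_per_s is e2e_total_tokens divided by the wall time in seconds, that transfers_total_kbytes is the sum of size_kbytes over all edges, and that transfers_total_time_ms sums tx_time_ms, rx_decode_time_ms and in_flight_time_ms over all edges. The helper name derive_summary is hypothetical.

```python
# Values copied from the Test 1 log tables above.
overall = {
    "e2e_requests": 1,
    "e2e_wall_time_ms": 40_828.324,
    "e2e_total_tokens": 5_105,
}
edges = {
    "0->1": {"size_kbytes": 108_797.315, "tx_time_ms": 158.411,
             "rx_decode_time_ms": 125.193, "in_flight_time_ms": 2.096},
    "1->2": {"size_kbytes": 28_809.043, "tx_time_ms": 30.057,
             "rx_decode_time_ms": 30.728, "in_flight_time_ms": 2.588},
}


def derive_summary(overall: dict, edges: dict) -> dict:
    """Recompute derived fields under the assumptions stated in the lead-in."""
    wall_s = overall["e2e_wall_time_ms"] / 1000.0
    return {
        # Assumed: wall time divided by the number of completed requests.
        "e2e_avg_time_per_request_ms": overall["e2e_wall_time_ms"] / overall["e2e_requests"],
        # Assumed: total tokens divided by wall time in seconds.
        "e2e_avg_tokens_per_s": overall["e2e_total_tokens"] / wall_s,
        # Assumed: sum of the per-edge payload sizes.
        "transfers_total_kbytes": sum(e["size_kbytes"] for e in edges.values()),
        # Assumed: send + decode + in-flight time, summed over all edges.
        "transfers_total_time_ms": sum(
            e["tx_time_ms"] + e["rx_decode_time_ms"] + e["in_flight_time_ms"]
            for e in edges.values()
        ),
    }


derived = derive_summary(overall, edges)
for name, value in derived.items():
    print(f"{name}: {value:,.3f}")
```

Under these assumptions the recomputed values agree with the logged ones to within about 0.001, which is the rounding precision of the inputs.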

Test result 2

INFO 02-06 15:29:11 [stats.py:454] [Overall Summary]
INFO 02-06 15:29:11 [stats.py:454] +-----------------------------+------------+
INFO 02-06 15:29:11 [stats.py:454] | Field                       |      Value |
INFO 02-06 15:29:11 [stats.py:454] +-----------------------------+------------+
INFO 02-06 15:29:11 [stats.py:454] | e2e_requests                |         10 |
INFO 02-06 15:29:11 [stats.py:454] | e2e_wall_time_ms            | 81,430.702 |
INFO 02-06 15:29:11 [stats.py:454] | e2e_total_tokens            |      3,347 |
INFO 02-06 15:29:11 [stats.py:454] | e2e_avg_time_per_request_ms |  8,143.070 |
INFO 02-06 15:29:11 [stats.py:454] | e2e_avg_tokens_per_s        |     41.102 |
INFO 02-06 15:29:11 [stats.py:454] | e2e_stage_0_wall_time_ms    | 19,442.824 |
INFO 02-06 15:29:11 [stats.py:454] | e2e_stage_1_wall_time_ms    | 59,771.091 |
INFO 02-06 15:29:11 [stats.py:454] | e2e_stage_2_wall_time_ms    |  3,015.388 |
INFO 02-06 15:29:11 [stats.py:454] +-----------------------------+------------+
INFO 02-06 15:29:11 [stats.py:480] 
INFO 02-06 15:29:11 [stats.py:480] [RequestE2EStats [request_id=0_72e5beab-aa6d-447d-9727-a3ca66667ac0]]
INFO 02-06 15:29:11 [stats.py:480] +-------------------------+------------+
INFO 02-06 15:29:11 [stats.py:480] | Field                   |      Value |
INFO 02-06 15:29:11 [stats.py:480] +-------------------------+------------+
INFO 02-06 15:29:11 [stats.py:480] | e2e_total_ms            | 78,705.673 |
INFO 02-06 15:29:11 [stats.py:480] | e2e_total_tokens        |         89 |
INFO 02-06 15:29:11 [stats.py:480] | transfers_total_kbytes  |  1,187.154 |
INFO 02-06 15:29:11 [stats.py:480] | transfers_total_time_ms |     10.231 |
INFO 02-06 15:29:11 [stats.py:480] +-------------------------+------------+
INFO 02-06 15:29:11 [stats.py:533] 
INFO 02-06 15:29:11 [stats.py:533] [StageRequestStats [request_id=0_72e5beab-aa6d-447d-9727-a3ca66667ac0]]
INFO 02-06 15:29:11 [stats.py:533] +---------------------+-----------+------------+---------+
INFO 02-06 15:29:11 [stats.py:533] | Field               |         0 |          1 |       2 |
INFO 02-06 15:29:11 [stats.py:533] +---------------------+-----------+------------+---------+
INFO 02-06 15:29:11 [stats.py:533] | batch_id            |         1 |          1 |       1 |
INFO 02-06 15:29:11 [stats.py:533] | batch_size          |        10 |         10 |       1 |
INFO 02-06 15:29:11 [stats.py:533] | num_tokens_in       |        55 |         21 |     400 |
INFO 02-06 15:29:11 [stats.py:533] | num_tokens_out      |         8 |         26 |       0 |
INFO 02-06 15:29:11 [stats.py:533] | postprocess_time_ms | 1,451.138 |      0.481 |   0.000 |
INFO 02-06 15:29:11 [stats.py:533] | stage_gen_time_ms   | 7,121.535 | 49,647.185 | 285.141 |
INFO 02-06 15:29:11 [stats.py:533] +---------------------+-----------+------------+---------+
INFO 02-06 15:29:11 [stats.py:573] 
INFO 02-06 15:29:11 [stats.py:573] [TransferEdgeStats [request_id=0_72e5beab-aa6d-447d-9727-a3ca66667ac0]]
INFO 02-06 15:29:11 [stats.py:573] +-------------------+-----------+-------+
INFO 02-06 15:29:11 [stats.py:573] | Field             |      0->1 |  1->2 |
INFO 02-06 15:29:11 [stats.py:573] +-------------------+-----------+-------+
INFO 02-06 15:29:11 [stats.py:573] | in_flight_time_ms |     1.047 | 1.672 |
INFO 02-06 15:29:11 [stats.py:573] | rx_decode_time_ms |     2.749 | 1.676 |
INFO 02-06 15:29:11 [stats.py:573] | size_kbytes       | 1,185.429 | 1.726 |
INFO 02-06 15:29:11 [stats.py:573] | tx_time_ms        |     2.280 | 0.806 |
INFO 02-06 15:29:11 [stats.py:573] +-------------------+-----------+-------+
INFO 02-06 15:29:11 [stats.py:480] 
INFO 02-06 15:29:11 [stats.py:480] [RequestE2EStats [request_id=1_0184b448-9ab3-40a5-85a7-06b2aa1ffcfe]]
INFO 02-06 15:29:11 [stats.py:480] +-------------------------+------------+
INFO 02-06 15:29:11 [stats.py:480] | Field                   |      Value |
INFO 02-06 15:29:11 [stats.py:480] +-------------------------+------------+
INFO 02-06 15:29:11 [stats.py:480] | e2e_total_ms            | 79,877.905 |
INFO 02-06 15:29:11 [stats.py:480] | e2e_total_tokens        |        434 |
INFO 02-06 15:29:11 [stats.py:480] | transfers_total_kbytes  |  4,630.604 |
INFO 02-06 15:29:11 [stats.py:480] | transfers_total_time_ms |    298.846 |
INFO 02-06 15:29:11 [stats.py:480] +-------------------------+------------+
INFO 02-06 15:29:11 [stats.py:533] 
INFO 02-06 15:29:11 [stats.py:533] [StageRequestStats [request_id=1_0184b448-9ab3-40a5-85a7-06b2aa1ffcfe]]
INFO 02-06 15:29:11 [stats.py:533] +---------------------+-----------+------------+-----------+
INFO 02-06 15:29:11 [stats.py:533] | Field               |         0 |          1 |         2 |
INFO 02-06 15:29:11 [stats.py:533] +---------------------+-----------+------------+-----------+
INFO 02-06 15:29:11 [stats.py:533] | batch_id            |         1 |          1 |         2 |
INFO 02-06 15:29:11 [stats.py:533] | batch_size          |        10 |         10 |         1 |
INFO 02-06 15:29:11 [stats.py:533] | num_tokens_in       |        57 |         23 |     4,528 |
INFO 02-06 15:29:11 [stats.py:533] | num_tokens_out      |        93 |        284 |         0 |
INFO 02-06 15:29:11 [stats.py:533] | postprocess_time_ms |     8.152 |      0.377 |     0.000 |
INFO 02-06 15:29:11 [stats.py:533] | stage_gen_time_ms   | 7,121.535 | 49,647.185 | 1,161.031 |
INFO 02-06 15:29:11 [stats.py:533] +---------------------+-----------+------------+-----------+
INFO 02-06 15:29:11 [stats.py:573] 
INFO 02-06 15:29:11 [stats.py:573] [TransferEdgeStats [request_id=1_0184b448-9ab3-40a5-85a7-06b2aa1ffcfe]]
INFO 02-06 15:29:11 [stats.py:573] +-------------------+-----------+---------+
INFO 02-06 15:29:11 [stats.py:573] | Field             |      0->1 |    1->2 |
INFO 02-06 15:29:11 [stats.py:573] +-------------------+-----------+---------+
INFO 02-06 15:29:11 [stats.py:573] | in_flight_time_ms |     0.000 | 285.377 |
INFO 02-06 15:29:11 [stats.py:573] | rx_decode_time_ms |     3.297 |   3.658 |
INFO 02-06 15:29:11 [stats.py:573] | size_kbytes       | 4,617.656 |  12.948 |
INFO 02-06 15:29:11 [stats.py:573] | tx_time_ms        |     6.226 |   0.288 |
INFO 02-06 15:29:11 [stats.py:573] +-------------------+-----------+---------+
...

Test result 3

(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] [Overall Summary]
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] +-----------------------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] | Field                       |      Value |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] +-----------------------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] | e2e_requests                |          1 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] | e2e_wall_time_ms            | 19,773.057 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] | e2e_avg_time_per_request_ms | 19,773.057 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] | e2e_stage_0_wall_time_ms    | 19,772.584 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:454] +-----------------------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:480] 
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:480] [RequestE2EStats [request_id=chatcmpl-eac6b12cee4f45c4]]
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:480] +--------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:480] | Field        |      Value |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:480] +--------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:480] | e2e_total_ms | 19,772.583 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:480] +--------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] 
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] [StageRequestStats [request_id=chatcmpl-eac6b12cee4f45c4]]
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] +---------------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] | Field               |          0 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] +---------------------+------------+
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] | batch_size          |          1 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] | image_num           |          1 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] | postprocess_time_ms |      1,726 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] | resolution          |        640 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] | stage_gen_time_ms   | 19,742.906 |
(APIServer pid=665125) INFO 02-06 15:34:59 [stats.py:533] +---------------------+------------+
(APIServer pid=665125) INFO 02-06 15:35:00 [serving_chat.py:2086] Diffusion chat completed for request chatcmpl-eac6b12cee4f45c4: 1 images

LJH-LBJ and others added 23 commits January 15, 2026 18:25
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 443022b0a4

Comment on lines 318 to 320
batch_id=metrics.get("batch_id", -1),
batch_size=metrics.get("batch_size"),
stage_gen_time_ms=self.accumulated_gen_time_ms.pop(req_id, 0.0),

P2 Badge Preserve stage generation time for sync orchestrator metrics

In _as_stage_request_stats, stage_gen_time_ms is always taken from accumulated_gen_time_ms.pop(req_id, 0.0) and the value provided in the metrics dict is ignored. That accumulator is only updated in the async pipeline; the synchronous Omni path never adds to it, so per-request stage timing (and any derived rates) become zero in non-async runs. This is a regression in metrics accuracy for synchronous serving; consider falling back to metrics.get("stage_gen_time_ms") when the accumulator is empty or populating the accumulator in the sync path.
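
The fallback suggested in this comment could look roughly like the sketch below. It is not the PR's code: the helper name resolve_stage_gen_time_ms is hypothetical, and only accumulated_gen_time_ms, req_id and the "stage_gen_time_ms" key come from the snippet under review. The sketch assumes the accumulator is a plain dict keyed by request id.

```python
def resolve_stage_gen_time_ms(accumulated_gen_time_ms: dict, metrics: dict, req_id: str) -> float:
    """Prefer the async accumulator; fall back to the value carried in metrics."""
    # Pop so the accumulator does not grow without bound across requests.
    accumulated = accumulated_gen_time_ms.pop(req_id, None)
    if accumulated:  # present and non-zero: the async pipeline populated it
        return float(accumulated)
    # Sync path: the accumulator was never updated, so use the reported value.
    reported = metrics.get("stage_gen_time_ms")
    return float(reported) if reported is not None else 0.0


# Async run: the accumulator wins and the entry is consumed.
acc = {"req-async": 120.209}
print(resolve_stage_gen_time_ms(acc, {"stage_gen_time_ms": 5.0}, "req-async"))

# Sync run: nothing was accumulated, so the metrics value is preserved.
print(resolve_stage_gen_time_ms({}, {"stage_gen_time_ms": 582.322}, "req-sync"))
```

The alternative mentioned in the comment, populating the accumulator in the sync path, would keep a single source of truth at the cost of touching the synchronous orchestrator.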

Comment on lines 426 to 430
# Derive inputs for the next stage, record preprocess time
_prep_t0 = time.perf_counter()
next_inputs = next_stage.process_engine_inputs(self.stage_list, prompt)
_prep_ms = (time.perf_counter() - _prep_t0) * 1000.0
metrics.record_stage_preprocess_time(next_stage_id, req_id, _prep_ms)

P2 Badge Avoid dropping preprocess timing before stage stats exist

The preprocess time is recorded immediately after process_engine_inputs but before the next stage has produced any metrics. record_stage_preprocess_time only updates existing stage_events entries, so at this point there is no entry for next_stage_id, causing the value to be dropped and leaving preprocess_time_ms at 0 for all requests in async multi-stage runs. To make this metric usable, buffer it until on_stage_metrics creates the stage entry or move the recording to after metrics are emitted.
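
One way to realize the buffering option is sketched below. The class PreprocessTimeBuffer and its _pending dict are hypothetical; the method names record_stage_preprocess_time and on_stage_metrics and the stage_events attribute mirror the names in the comment, but their real signatures and storage layout are assumptions here.

```python
class PreprocessTimeBuffer:
    """Hypothetical sketch: hold preprocess timings until the stage entry exists."""

    def __init__(self) -> None:
        self.stage_events: dict = {}  # (stage_id, req_id) -> stage metrics dict
        self._pending: dict = {}      # (stage_id, req_id) -> buffered preprocess ms

    def record_stage_preprocess_time(self, stage_id: int, req_id: str, prep_ms: float) -> None:
        key = (stage_id, req_id)
        entry = self.stage_events.get(key)
        if entry is None:
            # The stage has not reported metrics yet: buffer instead of dropping.
            self._pending[key] = self._pending.get(key, 0.0) + prep_ms
        else:
            entry["preprocess_time_ms"] = entry.get("preprocess_time_ms", 0.0) + prep_ms

    def on_stage_metrics(self, stage_id: int, req_id: str, metrics: dict) -> dict:
        key = (stage_id, req_id)
        entry = self.stage_events.setdefault(key, {"preprocess_time_ms": 0.0})
        entry.update(metrics)
        # Flush any timing that arrived before this entry was created.
        entry["preprocess_time_ms"] = (
            entry.get("preprocess_time_ms", 0.0) + self._pending.pop(key, 0.0)
        )
        return entry


buf = PreprocessTimeBuffer()
buf.record_stage_preprocess_time(1, "req-0", 12.5)  # arrives before stage 1 reports
print(buf.on_stage_metrics(1, "req-0", {"num_tokens_out": 190}))
```

Buffering keeps the timing call where it is in the pipeline; moving the recording to after metrics are emitted would avoid the extra dict but changes the call order.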

@hsliuustc0106
Collaborator

Does this also apply to DiT models?

@@ -0,0 +1,156 @@

# Production Metrics:
Collaborator

please take a final check on the md and test the output data again.

Contributor Author

I have checked the .md file and updated the test results.

@hsliuustc0106 hsliuustc0106 added the ready label to trigger buildkite CI label Feb 5, 2026
@hsliuustc0106
Collaborator

fix CI please

LJH-LBJ and others added 12 commits February 6, 2026 09:04
Signed-off-by: Junhong Liu <ljh_lbj@163.com>
| Field | Meaning |
|---------------------------|----------------------------------------------------------------------------------------------|
| `e2e_requests` | Number of completed requests. |
| `e2e_wall_time_ms` | Wall-clock time span from run start to last completion, in ms. |
Contributor

@Bounty-hunter Bounty-hunter Feb 6, 2026

Should it be clarified that this applies only to the offline case? In the online case it tracks only individual requests (e2e_requests is always 1) rather than a summary over the entire online process.

Can it be understood this way?
For offline scenarios: both request-level and system-level metrics are available.
For online scenarios: only request-level metrics are available.

Contributor Author

Yes, OK.

| `stage_gen_time_ms` | Stage compute time in ms, excluding postprocessing time (reported separately as `postprocess_time_ms`). |
| `image_num` | Number of images generated (for diffusion/image stages). |
| `resolution` | Image resolution (for diffusion/image stages). |
| `postprocess_time_ms` | Diffusion/image: post-processing time in ms. |
Contributor

Is this the same as the row `postprocess_time_ms | Postprocessing time in ms.`?

Contributor Author

Removed the row `postprocess_time_ms | Postprocessing time in ms`.

| `size_kbytes` | Total kbytes transferred. |
| `tx_time_ms` | Sender transfer time in ms. |
| `rx_decode_time_ms` | Receiver decode time in ms. |
| `in_flight_time_ms` | In-flight time in ms. |
Contributor

Image

I am confused by this result. Does in_flight_time_ms refer to the network transmission time, and are tx_time_ms and rx_decode_time_ms the serialization/deserialization times? They seem to take a lot of time.

Please check it!

Contributor Author

@LJH-LBJ LJH-LBJ Feb 6, 2026

Yes, about 90% of the cost comes from serialization/deserialization and shm_write/shm_read.

OVERALL_FIELDS: list[str] | None = None
STAGE_FIELDS = _build_field_defs(StageRequestStats, STAGE_EXCLUDE, FIELD_TRANSFORMS)
TRANSFER_FIELDS = _build_field_defs(TransferEdgeStats, TRANSFER_EXCLUDE, FIELD_TRANSFORMS)
E2E_FIELDS = _build_field_defs(RequestE2EStats, E2E_EXCLUDE, FIELD_TRANSFORMS)
Contributor

Could the definitions above be moved into StageRequestStats/TransferEdgeStats/RequestE2EStats and maintained there?

Contributor Author

I put it in vllm_omni\metrics\utils.py, because this function is not related to the XXXStats classes.
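
For readers without the diff open, the sketch below illustrates one way a helper with the call shape `_build_field_defs(StatsClass, EXCLUDE, FIELD_TRANSFORMS)` could work. The body of the real function is not shown in this thread, so everything here is an assumption: that the stats classes are dataclasses, that the exclude argument is a set of field names, and that the transforms argument maps field names to callables. ExampleStageStats, build_field_defs and the EXAMPLE_* names are hypothetical stand-ins.

```python
from dataclasses import dataclass, fields
from typing import Callable


@dataclass
class ExampleStageStats:  # hypothetical stand-in for StageRequestStats
    request_id: str
    stage_id: int
    num_tokens_out: int
    stage_gen_time_ms: float


def build_field_defs(stats_cls, exclude: set, transforms: dict) -> list:
    """Return (field_name, transform) pairs for every non-excluded dataclass field."""

    def identity(value):
        return value

    return [
        (f.name, transforms.get(f.name, identity))
        for f in fields(stats_cls)
        if f.name not in exclude
    ]


# Assumed shapes for the exclude set and the per-field display transforms.
EXAMPLE_EXCLUDE = {"request_id", "stage_id"}
EXAMPLE_TRANSFORMS: dict[str, Callable] = {"stage_gen_time_ms": lambda ms: round(ms, 3)}
EXAMPLE_FIELDS = build_field_defs(ExampleStageStats, EXAMPLE_EXCLUDE, EXAMPLE_TRANSFORMS)

stats = ExampleStageStats("req-0", 0, 55, 120.2094)
row = {name: transform(getattr(stats, name)) for name, transform in EXAMPLE_FIELDS}
print(row)
```

Deriving the field list from the dataclass means a newly added stats field shows up in the log tables without a second list to update, which fits the author's point that the helper is independent of any one stats class.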

E2E_FIELDS = _build_field_defs(RequestE2EStats, E2E_EXCLUDE, FIELD_TRANSFORMS)


def _get_or_create_transfer_event(
Contributor

Could this be moved into OrchestratorAggregator?

Contributor Author

done

if self.log_stats:
self.log_request_stats(stats, "stage_stats")
if stats.stage_stats is not None:
self.log_request_stats(stats, "stage_running_avg")
Contributor

I don't see any explanation of stage_running_avg.

Contributor Author

Deleted. It is now logged only in the summary, so log_request_stats is no longer needed.

tx_time_ms=tx_ms,
used_shm=used_shm,
)
if self.log_stats and evt is not None:
Contributor

Why is self.log_request_stats needed here? Isn't this already logged in build_and_log_summary?

Contributor Author

deleted

LJH-LBJ and others added 4 commits February 6, 2026 23:43
@hsliuustc0106
Collaborator

Please address the comments from @yenuo26.

@LJH-LBJ LJH-LBJ requested a review from yenuo26 February 6, 2026 23:33
@LJH-LBJ
Contributor Author

LJH-LBJ commented Feb 6, 2026

fix comments from @yenuo26

Already fixed. There was just one comment.

Labels

ready label to trigger buildkite CI

Development

Successfully merging this pull request may close these issues.

[RFC]: Optimize the metric.

7 participants