Conversation
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
…lm-omni into opt_metrics_structure
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 443022b0a4
vllm_omni/metrics/stats.py
Outdated
batch_id=metrics.get("batch_id", -1),
batch_size=metrics.get("batch_size"),
stage_gen_time_ms=self.accumulated_gen_time_ms.pop(req_id, 0.0),
Preserve stage generation time for sync orchestrator metrics
In _as_stage_request_stats, stage_gen_time_ms is always taken from accumulated_gen_time_ms.pop(req_id, 0.0) and the value provided in the metrics dict is ignored. That accumulator is only updated in the async pipeline; the synchronous Omni path never adds to it, so per-request stage timing (and any derived rates) become zero in non-async runs. This is a regression in metrics accuracy for synchronous serving; consider falling back to metrics.get("stage_gen_time_ms") when the accumulator is empty or populating the accumulator in the sync path.
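A minimal sketch of the suggested fallback, assuming the accumulator and metrics-dict shapes implied by the diff (the helper name below is hypothetical, not the repo's actual API):

```python
def resolve_stage_gen_time_ms(
    accumulated_gen_time_ms: dict,
    metrics: dict,
    req_id: str,
) -> float:
    # Async path: the accumulator holds the per-request generation time.
    # Sync path: the accumulator is never populated, so fall back to the
    # value reported in the metrics dict instead of silently returning 0.0.
    return accumulated_gen_time_ms.pop(req_id, metrics.get("stage_gen_time_ms", 0.0))
```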
vllm_omni/entrypoints/async_omni.py
Outdated
# Derive inputs for the next stage, record preprocess time
_prep_t0 = time.perf_counter()
next_inputs = next_stage.process_engine_inputs(self.stage_list, prompt)
_prep_ms = (time.perf_counter() - _prep_t0) * 1000.0
metrics.record_stage_preprocess_time(next_stage_id, req_id, _prep_ms)
Avoid dropping preprocess timing before stage stats exist
The preprocess time is recorded immediately after process_engine_inputs but before the next stage has produced any metrics. record_stage_preprocess_time only updates existing stage_events entries, so at this point there is no entry for next_stage_id, causing the value to be dropped and leaving preprocess_time_ms at 0 for all requests in async multi-stage runs. To make this metric usable, buffer it until on_stage_metrics creates the stage entry or move the recording to after metrics are emitted.
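One way to realize the buffering option, sketched with assumed names (the actual stage_events layout and callback names in vllm_omni/metrics may differ):

```python
class PreprocessTimeBuffer:
    """Holds preprocess timings recorded before the stage entry exists."""

    def __init__(self) -> None:
        self._pending: dict = {}

    def record(self, stage_id: int, req_id: str, ms: float) -> None:
        # Called right after process_engine_inputs; nothing is dropped even
        # if the stage has not produced metrics yet.
        self._pending[(stage_id, req_id)] = ms

    def flush(self, stage_events: dict, stage_id: int, req_id: str) -> None:
        # Called once on_stage_metrics has created the stage entry.
        ms = self._pending.pop((stage_id, req_id), None)
        if ms is not None:
            stage_events[(stage_id, req_id)]["preprocess_time_ms"] = ms
```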
Does this also apply to DiT models?
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
docs/usage/metrics.md
Outdated
@@ -0,0 +1,156 @@
# Production Metrics:
Please take a final check on the .md file and test the output data again.
Already checked the .md file and updated the test result.
…lm-omni into opt_metrics_structure
fix CI please
Signed-off-by: Junhong Liu <ljh_lbj@163.com>
…lm-omni into opt_metrics_structure
…lm-omni into opt_metrics_structure
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
| Field | Meaning |
|---------------------------|----------------------------------------------------------------------------------------------|
| `e2e_requests` | Number of completed requests. |
| `e2e_wall_time_ms` | Wall-clock time span from run start to last completion, in ms. |
It should be clarified that this applies to the offline case. For the online case, it actually tracks only individual requests (`e2e_requests` is always 1), rather than a summary over the entire online process.
Can it be understood this way?
For offline scenarios: we have request-level metrics and system-level metrics.
For online scenarios: only request-level metrics are available.
| `stage_gen_time_ms` | Stage compute time in ms, excluding postprocessing time (reported separately as `postprocess_time_ms`). |
| `image_num` | Number of images generated (for diffusion/image stages). |
| `resolution` | Image resolution (for diffusion/image stages). |
| `postprocess_time_ms` | Diffusion/image: post-processing time in ms. |
Same here: should `postprocess_time_ms` just be described as "Postprocessing time in ms."?
Removed the `postprocess_time_ms` | "Postprocessing time in ms" entry.
| `size_kbytes` | Total kbytes transferred. |
| `tx_time_ms` | Sender transfer time in ms. |
| `rx_decode_time_ms` | Receiver decode time in ms. |
| `in_flight_time_ms` | In-flight time in ms. |
Yes, about 90% of the cost comes from deserialize/serialize and shm_write/shm_read.
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
OVERALL_FIELDS: list[str] | None = None
STAGE_FIELDS = _build_field_defs(StageRequestStats, STAGE_EXCLUDE, FIELD_TRANSFORMS)
TRANSFER_FIELDS = _build_field_defs(TransferEdgeStats, TRANSFER_EXCLUDE, FIELD_TRANSFORMS)
E2E_FIELDS = _build_field_defs(RequestE2EStats, E2E_EXCLUDE, FIELD_TRANSFORMS)
Should the above just be put into StageRequestStats/TransferEdgeStats/RequestE2EStats for maintenance?
I put it in vllm_omni\metrics\utils.py, because this function is not related to the XXXStats classes.
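For illustration only, a generic builder of this kind could look roughly like the sketch below; the actual `_build_field_defs` in `vllm_omni/metrics/utils.py` may differ in signature and return type, and the example dataclass is invented:

```python
from dataclasses import dataclass, fields

def build_field_defs(stats_cls, exclude, transforms):
    """Collect (field_name, optional_transform) pairs from a stats dataclass,
    skipping excluded fields, so the logging code stays independent of any
    single XXXStats class."""
    return [
        (f.name, transforms.get(f.name))
        for f in fields(stats_cls)
        if f.name not in exclude
    ]

# Hypothetical usage with an assumed stats dataclass:
@dataclass
class ExampleStats:
    req_id: str = ""
    gen_time_ms: float = 0.0

FIELDS = build_field_defs(ExampleStats, exclude={"req_id"}, transforms={})
```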
vllm_omni/metrics/stats.py
Outdated
E2E_FIELDS = _build_field_defs(RequestE2EStats, E2E_EXCLUDE, FIELD_TRANSFORMS)


def _get_or_create_transfer_event(
Should this be put into OrchestratorAggregator?
vllm_omni/metrics/stats.py
Outdated
if self.log_stats:
    self.log_request_stats(stats, "stage_stats")
if stats.stage_stats is not None:
    self.log_request_stats(stats, "stage_running_avg")
I don't see any explanation about stage_running_avg.
Deleted; it is now only logged in the summary. No need for log_request_stats any more.
vllm_omni/metrics/stats.py
Outdated
    tx_time_ms=tx_ms,
    used_shm=used_shm,
)
if self.log_stats and evt is not None:
Why is self.log_request_stats needed here? Isn't it logged in build_and_log_summary?
Signed-off-by: Junhong Liu <98734602+LJH-LBJ@users.noreply.github.com>
…lm-omni into opt_metrics_structure
fix comments from @yenuo26
Purpose
Resolves: #533

Make the metrics clearer and optimize the metrics format.
design doc:
https://docs.google.com/document/d/1St1tHMyp1kPwbYzHUFJYQHBGoWcQJA_dcGb9pemUZGI/edit?tab=t.0
Test Plan
Test 1
Omni online inference
Test 2
Omni offline inference
Need to add `--log-stats` in run_multiple_prompts.sh.
Test 3
Test Result
Test result 1
Test result 2
Test result 3
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.
BEFORE SUBMITTING, PLEASE READ https://github.com/vllm-project/vllm-omni/blob/main/CONTRIBUTING.md (anything written below this line will be removed by GitHub Actions)