
[Feature] metrics support #3534


Draft · wants to merge 10 commits into main

Conversation

@CUHKSZzxy (Collaborator) commented on May 9, 2025

Objective

Align with the vLLM v1 metrics system and extend beyond it. We also refer to SGLang's monitoring.
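
For reference, "metrics" here means Prometheus-style counters, gauges, and histograms of engine state and request latency. The snippet below is only an illustrative sketch built on prometheus_client; the metric names and buckets are assumptions modeled on vLLM v1, not the definitions in this PR.

```python
# Illustrative sketch only (not the PR's code): Prometheus-style metrics
# in the spirit of vLLM v1, defined with the prometheus_client library.
from prometheus_client import Counter, Gauge, Histogram

# Scheduler state: requests currently running / waiting.
GAUGE_RUNNING_REQUESTS = Gauge(
    'lmdeploy:num_requests_running',
    'Number of requests currently being processed')
GAUGE_WAITING_REQUESTS = Gauge(
    'lmdeploy:num_requests_waiting',
    'Number of requests waiting in the scheduler queue')

# Token throughput counters.
COUNTER_PROMPT_TOKENS = Counter(
    'lmdeploy:prompt_tokens_total',
    'Total number of prefill (prompt) tokens processed')
COUNTER_GENERATION_TOKENS = Counter(
    'lmdeploy:generation_tokens_total',
    'Total number of generated tokens')

# Latency distribution, e.g. time to first token (TTFT).
HISTOGRAM_TTFT = Histogram(
    'lmdeploy:time_to_first_token_seconds',
    'Time to first token in seconds',
    buckets=[0.01, 0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0])
```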

TODO

  • Change time.perf_counter()
  • Change API server request arrival timestamp position (see the timing sketch after this list)
  • Abstract output processing outside of async engine generate()
  • Expert information collection
  • Grafana visualization
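
The first two TODO items concern request timing. A rough sketch of how per-request timestamps could be captured with time.perf_counter() is shown below; the class and method names are hypothetical, and the arrival timestamp would be taken as early as possible when the API server receives the request.

```python
# Hypothetical sketch of per-request timing, not the PR's actual code.
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RequestTimings:
    # Recorded when the API server first sees the request (monotonic clock,
    # so differences stay meaningful even if the wall clock changes).
    arrival_time: float = field(default_factory=time.perf_counter)
    first_token_time: Optional[float] = None
    finish_time: Optional[float] = None

    def on_first_token(self) -> None:
        if self.first_token_time is None:
            self.first_token_time = time.perf_counter()

    def on_finish(self) -> None:
        self.finish_time = time.perf_counter()

    @property
    def ttft(self) -> Optional[float]:
        """Time to first token, in seconds."""
        if self.first_token_time is None:
            return None
        return self.first_token_time - self.arrival_time
```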

Usage

Start the server with --enable-metrics

lmdeploy serve api_server models--Qwen--Qwen2.5-7B-Instruct --enable-metrics
  • Metrics Publishing - Logging
    Information is printed to the terminal every 10 seconds.

  • Metrics Publishing - Prometheus & Grafana (WIP)
    Open http://xxxx:23333/metrics/ to view the raw Prometheus metrics (see the scraping sketch below).
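
Once the server is running with --enable-metrics, the endpoint above can be read like any other Prometheus exposition endpoint. A minimal sketch, assuming a local server on the default port 23333 from the example above:

```python
# Minimal sketch: fetch and print the raw Prometheus metrics exposed by api_server.
import requests

resp = requests.get('http://localhost:23333/metrics/', timeout=5)
resp.raise_for_status()

# Print only the metric samples, skipping '# HELP' and '# TYPE' comment lines.
for line in resp.text.splitlines():
    if line and not line.startswith('#'):
        print(line)
```

In a full setup the same endpoint would be added as a Prometheus scrape target and visualized through the planned Grafana dashboards.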

Related Issues & PR

Issue #2638, Issue #2673, PR #1423

CUHKSZzxy added 2 commits May 9, 2025 20:38
Conflicts:
	lmdeploy/messages.py
	lmdeploy/pytorch/engine/engine.py
	lmdeploy/pytorch/engine/engine_instance.py
	lmdeploy/pytorch/messages.py
	lmdeploy/pytorch/paging/scheduler.py
@CUHKSZzxy added the WIP label on May 9, 2025