README.md (0 additions, 1 deletion)
@@ -63,7 +63,6 @@ Features
 |**[Template Endpoint](docs/tutorials/template-endpoint.md)**| Benchmark custom APIs with flexible Jinja2 request templates | Custom API formats, rapid prototyping, non-standard endpoints |
 |**[SGLang Image Generation](docs/tutorials/sglang-image-generation.md)**| Benchmark image generation APIs using SGLang with FLUX.1-dev model | Image generation testing, text-to-image benchmarking, extracting generated images |
 |**[Visualization & Plotting](docs/tutorials/plot.md)**| Generate PNG visualizations with automatic mode detection (single-run analysis or multi-run comparison) | Parameter sweep analysis, performance debugging, model comparison |
-|

 ### Working with Benchmark Data

 -**[Profile Exports](docs/tutorials/working-with-profile-exports.md)** - Parse and analyze `profile_export.jsonl` with Pydantic models, custom metrics, and async processing
@@ -330,8 +330,10 @@ The dashboard automatically detects visualization mode (multi-run comparison or
 - Token Throughput per GPU vs Interactivity

 **Single-run plots** (time series):
+```text
 - GPU Utilization Over Time
 - GPU Memory Usage Over Time
+```

@@ -345,10 +347,12 @@ The dashboard automatically detects visualization mode (multi-run comparison or
 When timeslice data is available (via `--slice-duration` during profiling), plots show performance evolution across time windows.

 **Generated timeslice plots:**
+```text
 - TTFT Across Timeslices
 - ITL Across Timeslices
 - Throughput Across Timeslices
 - Latency Across Timeslices
+```

 **Timeslices enable easy outlier identification and bucketing analysis**. Each time window (bucket) shows avg/p50/p95 statistics, making it simple to spot which periods have outlier performance. Slice 0 often shows cold-start overhead, while later slices may reveal degradation. Flat bars across slices may indicate stable performance; increasing trends can suggest resource exhaustion. Potentially useful for quickly isolating performance issues to specific phases (warmup, steady-state, or degradation).
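
The bucketing described in the timeslice paragraph above can be illustrated with a minimal sketch. It is not the tool's implementation: `slice_stats` and `percentile` are hypothetical helpers, the input is assumed to be `(timestamp_s, latency_ms)` pairs, and the nearest-rank percentile is one common convention that may differ from the one the dashboard uses.

```python
import math
from statistics import mean


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def slice_stats(samples, slice_duration):
    """Bucket (timestamp_s, latency_ms) samples into fixed time windows.

    Returns {slice_index: {"avg": ..., "p50": ..., "p95": ...}}.
    Hypothetical helper for illustration only.
    """
    buckets = {}
    for timestamp_s, latency_ms in samples:
        # Slice 0 covers [0, slice_duration), slice 1 the next window, and so on.
        index = int(timestamp_s // slice_duration)
        buckets.setdefault(index, []).append(latency_ms)

    stats = {}
    for index in sorted(buckets):
        values = sorted(buckets[index])
        stats[index] = {
            "avg": mean(values),
            "p50": percentile(values, 50),
            "p95": percentile(values, 95),
        }
    return stats


if __name__ == "__main__":
    # Made-up samples: a slow first window followed by a faster one.
    samples = [(0.5, 100.0), (1.0, 300.0), (2.5, 50.0), (3.0, 70.0), (3.9, 60.0)]
    for index, row in slice_stats(samples, slice_duration=2.0).items():
        print(index, row)
```

Comparing the per-slice rows side by side is what makes a slow first window (cold start) or a rising trend in later windows stand out.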