Feature/statistics by qchapp · Pull Request #42 · EPFLiGHT/MMIRAGE

qchapp · 2026-05-06T10:58:55Z

This pull request introduces a new benchmarking feature for MMIRAGE that enables detailed per-shard performance tracking, including GPU utilization and throughput metrics. The changes add a --stats flag to the CLI for both local and SLURM runs, update documentation, and provide a DataTrove-compatible benchmark configuration. The most important changes are grouped below:

Benchmarking and Performance Tracking:

Added support for per-shard benchmarking via a new --stats flag on the run and submit commands, which enables GPU utilization polling and throughput tracking during shard execution. This is controlled via the MMIRAGE_COLLECT_STATS environment variable.
Introduced a new stats CLI command that prints per-shard and aggregate benchmark statistics in JSON format, using a new collect_bench_stats utility.
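
As a rough illustration of the opt-in gate described above, a flag such as --stats could be translated into the MMIRAGE_COLLECT_STATS environment variable and read back inside the shard process. This is a minimal sketch and not the code from this PR: the helper names `build_env` and `stats_enabled` and the set of accepted truthy values are assumptions. Only the variable name and the value `1` appear in the PR.

```python
import os

ENV_VAR = "MMIRAGE_COLLECT_STATS"  # variable name taken from the PR description


def build_env(collect_stats, base_env=None):
    """Return a copy of the environment with the stats switch set or cleared.

    Hypothetical helper: the PR does not show how the CLI builds the environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    if collect_stats:
        env[ENV_VAR] = "1"
    else:
        env.pop(ENV_VAR, None)
    return env


def stats_enabled(env=None):
    """True when the switch holds a truthy value (accepted values are an assumption)."""
    env = os.environ if env is None else env
    return env.get(ENV_VAR, "").strip().lower() in {"1", "true", "yes"}
```

Passing the environment in explicitly keeps the check easy to test and avoids mutating the parent process when launching a shard.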

Documentation Updates:

Expanded the README.md with a new section on benchmarking shard performance, including example commands, sample output, and explanations of key metrics.
Added a reference to the DataTrove benchmark in the README.md.

Configuration and Compatibility:

Added a new configuration file configs/config_benchmark_datatrove.yaml for running a DataTrove-compatible throughput benchmark, with detailed instructions and settings matching the DataTrove inference benchmark.

These changes provide users with tools to collect, inspect, and compare detailed runtime and hardware utilization statistics, facilitating performance analysis and benchmarking against industry standards like DataTrove.

Copilot

Pull request overview

This PR adds an opt-in benchmarking/stats pipeline to MMIRAGE to record per-shard runtime, throughput, token counts, and GPU utilization, and exposes the results via CLI and documentation.

Changes:

Add --stats flag to local run and SLURM submit (and retry flows) to enable stats collection via MMIRAGE_COLLECT_STATS.
Record per-shard stats into shard status.json (runtime/throughput, GPU util polling, token counts, model load time) and add a new mmirage stats command to report per-shard + aggregate JSON.
Add documentation and a DataTrove-compatible benchmark config (configs/config_benchmark_datatrove.yaml).
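
The per-shard plus aggregate report described above could be assembled roughly as follows. This is a sketch under stated assumptions, not the PR's `collect_bench_stats()`: the function name `collect_stats_sketch`, the one-directory-per-shard layout, and the `stats`, `rows`, and `runtime_s` field names are all hypothetical.

```python
import json
from pathlib import Path


def collect_stats_sketch(run_dir):
    """Gather the stats block of every shard status.json under run_dir.

    Assumed layout: <run_dir>/<shard_name>/status.json, with an optional
    "stats" object holding "rows" and "runtime_s".
    """
    shards = []
    for status_file in sorted(Path(run_dir).glob("*/status.json")):
        payload = json.loads(status_file.read_text())
        stats = payload.get("stats")
        if stats:  # shards that ran without stats collection have no block
            shards.append({"shard": status_file.parent.name, **stats})

    total_rows = sum(s["rows"] for s in shards)
    total_seconds = sum(s["runtime_s"] for s in shards)
    aggregate = {
        "shards": len(shards),
        "rows": total_rows,
        "runtime_s": total_seconds,
        # Throughput over summed shard runtime; 0.0 when nothing was measured.
        "rows_per_s": total_rows / total_seconds if total_seconds else 0.0,
    }
    return {"per_shard": shards, "aggregate": aggregate}
```

Calling `json.dumps(collect_stats_sketch(run_dir), indent=2)` would then yield a JSON report. Note that summing shard runtimes measures work done, not wall-clock time, when shards run in parallel.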

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/mmirage/shard_utils.py | Introduces `ShardStats`, duration formatting, and a background `nvidia-smi` poller; persists stats into shard status payloads. |
| src/mmirage/shard_process.py | Enables opt-in GPU polling + token/load-time capture and writes stats on shard success. |
| src/mmirage/core/process/processors/llm/llm_processor.py | Tracks cumulative token counts and measures engine init time; supports forwarding extra engine kwargs. |
| src/mmirage/core/process/processors/llm/config.py | Adds `extra_engine_args` to allow passing additional SGLang Engine kwargs from YAML. |
| src/mmirage/core/process/mapper.py | Aggregates token counts and model load time across processors for shard-level stats. |
| src/mmirage/cli.py | Adds `--stats` to relevant commands and introduces a `stats` subcommand emitting JSON. |
| src/mmirage/cli_utils/status.py | Adds `collect_bench_stats()` to aggregate shard stats across runs; wires `--stats` into retry submission. |
| src/mmirage/cli_utils/slurm.py | Plumbs `collect_stats` through sbatch generation to export `MMIRAGE_COLLECT_STATS=1`. |
| README.md | Documents benchmarking workflow, metrics, and reference benchmark links. |
| configs/config_benchmark_datatrove.yaml | Adds a DataTrove-compatible throughput benchmark configuration. |
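
To make the `shard_utils.py` entry more concrete, here is one way a background `nvidia-smi` poller and a duration formatter might look. It is a minimal sketch and not the implementation merged in this PR: the names `GpuPoller`, `read_gpu_util`, and `format_duration`, the HH:MM:SS format, and the one-second default interval are assumptions. The `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits` invocation is a standard query.

```python
import subprocess
import threading


def format_duration(seconds):
    """Render seconds as HH:MM:SS (the format used in the PR is not shown)."""
    hours, rem = divmod(int(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def read_gpu_util():
    """Return current utilization (percent) of each visible GPU via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [float(line) for line in out.splitlines() if line.strip()]


class GpuPoller:
    """Sample GPU utilization on a daemon thread until the context exits."""

    def __init__(self, interval_s=1.0, sampler=read_gpu_util):
        self.interval_s = interval_s
        self.sampler = sampler  # injectable so tests need no GPU
        self.samples = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        while not self._stop.is_set():
            try:
                self.samples.extend(self.sampler())
            except (OSError, subprocess.CalledProcessError, ValueError):
                pass  # a failed sample should never break the shard
            self._stop.wait(self.interval_s)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc):
        self._stop.set()
        self._thread.join()

    def mean_util(self):
        """Mean of all samples, or None when nothing was collected."""
        return sum(self.samples) / len(self.samples) if self.samples else None
```

Wrapping shard execution in `with GpuPoller() as poller:` keeps polling off the critical path, and waiting on the stop event rather than sleeping lets the thread exit promptly when the shard finishes.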

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Copilot

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

qchapp added 12 commits April 30, 2026 18:04

trying a benchmark

968e729

fixed stats

7da36b4

small test

3e1a9bc

testing something

a1386f7

slurm config

5181248

display issue with multiple nodes

4af933d

small test again

5398ab2

trying again

aafb230

excluding cold start

79ca2fe

now same for gpu

3ba5c97

small corrections

19d7d8a

ready for PR

0a46d7d

Copilot AI review requested due to automatic review settings May 6, 2026 10:58

qchapp linked an issue May 6, 2026 that may be closed by this pull request

Log statistics #40

Closed

qchapp temporarily deployed to docker May 6, 2026 10:59 — with GitHub Actions Inactive

Copilot started reviewing on behalf of qchapp May 6, 2026 10:59 View session

Copilot AI reviewed May 6, 2026

Comment thread src/mmirage/shard_process.py Outdated

Comment thread src/mmirage/shard_process.py

Comment thread src/mmirage/cli_utils/status.py

Comment thread README.md Outdated

Potential fix for pull request finding

716f7b3

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

qchapp temporarily deployed to docker May 6, 2026 11:04 — with GitHub Actions Inactive

Potential fix for pull request finding

3218285

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

qchapp temporarily deployed to docker May 6, 2026 11:04 — with GitHub Actions Inactive

copilot suggestions

1f0b9e0

qchapp temporarily deployed to docker May 6, 2026 11:13 — with GitHub Actions Inactive

qchapp requested a review from Copilot May 6, 2026 11:14

Copilot started reviewing on behalf of qchapp May 6, 2026 11:15 View session

Copilot AI reviewed May 6, 2026

Comment thread src/mmirage/shard_utils.py Outdated

Comment thread README.md

Comment thread README.md Outdated

Comment thread src/mmirage/shard_utils.py

qchapp temporarily deployed to docker May 6, 2026 12:20 — with GitHub Actions Inactive

function deduplication

dc1c966

qchapp temporarily deployed to docker May 7, 2026 09:59 — with GitHub Actions Inactive

qchapp requested a review from Copilot May 7, 2026 10:00

Copilot started reviewing on behalf of qchapp May 7, 2026 10:00 View session

Copilot AI reviewed May 7, 2026

Comment thread src/mmirage/core/process/mapper.py Outdated

Comment thread src/mmirage/shard_process.py

Comment thread configs/config_benchmark_datatrove.yaml Outdated

Potential fix for pull request finding

1ef07d4

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

qchapp temporarily deployed to docker May 7, 2026 11:11 — with GitHub Actions Inactive

copilot changes

dfa7fae

qchapp temporarily deployed to docker May 7, 2026 11:21 — with GitHub Actions Inactive

qchapp requested a review from fabnemEPFL May 7, 2026 11:21

fabnemEPFL requested changes May 8, 2026

Comment thread src/mmirage/core/process/mapper.py Outdated

Comment thread configs/config_benchmark_datatrove.yaml Outdated

Comment thread src/mmirage/shard_process.py Outdated

Comment thread src/mmirage/shard_utils.py Outdated

implemented changes requested by fabrice

7946278

qchapp temporarily deployed to docker May 8, 2026 15:18 — with GitHub Actions Inactive

fabnemEPFL added 2 commits May 11, 2026 11:16

fixed TokenCounts logic

c962618

fixed various typing and logic errors

fdb0d42

fabnemEPFL temporarily deployed to docker May 11, 2026 09:26 — with GitHub Actions Inactive

fabnemEPFL requested changes May 11, 2026

Comment thread src/mmirage/shard_process.py Outdated

Comment thread src/mmirage/shard_utils.py Outdated

fixed image base path

37ca65a

fabnemEPFL temporarily deployed to docker May 11, 2026 13:16 — with GitHub Actions Inactive

fabnemEPFL approved these changes May 11, 2026

fabnemEPFL merged commit 54923f1 into main May 11, 2026
2 checks passed

fabnemEPFL deleted the feature/statistics branch May 11, 2026 13:18

fabnemEPFL merged 24 commits into main from feature/statistics

3 participants
