Feature/statistics #42

Merged
fabnemEPFL merged 24 commits into main from feature/statistics on May 11, 2026

@qchapp (Member) commented on May 6, 2026

This pull request introduces a new benchmarking feature for MMIRAGE that enables detailed per-shard performance tracking, including GPU utilization and throughput metrics. The changes add a --stats flag to the CLI for both local and SLURM runs, update documentation, and provide a DataTrove-compatible benchmark configuration. The most important changes are grouped below:

Benchmarking and Performance Tracking:

  • Added support for per-shard benchmarking via a new --stats flag on the run and submit commands, which enables GPU utilization polling and throughput tracking during shard execution. Collection is controlled via the MMIRAGE_COLLECT_STATS environment variable.
  • Introduced a new stats CLI command that prints per-shard and aggregate benchmark statistics in JSON format, using a new collect_bench_stats utility.
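
For readers who want a feel for the opt-in behaviour, here is a minimal sketch of how an environment-variable gate and a throughput measurement could fit together. It is not the code from this PR: `stats_enabled`, `run_shard`, and the `rows`, `runtime_s`, and `rows_per_s` keys are hypothetical names, and treating `1`, `true`, and `yes` as enabling values is an assumption (the PR only shows `MMIRAGE_COLLECT_STATS=1`).

```python
import os
import time


def stats_enabled(env=None):
    """Return True when stats collection was requested via the environment."""
    env = os.environ if env is None else env
    # Assumption: "1" enables collection; "true"/"yes" are accepted for convenience.
    value = env.get("MMIRAGE_COLLECT_STATS", "")
    return value.strip().lower() in {"1", "true", "yes"}


def run_shard(process_fn, rows, env=None):
    """Apply process_fn to every row; return (results, stats or None)."""
    collect = stats_enabled(env)
    start = time.perf_counter()
    results = [process_fn(row) for row in rows]
    if not collect:
        # Stats are opt-in, so the default path returns no stats payload.
        return results, None
    elapsed = time.perf_counter() - start
    stats = {
        "rows": len(results),
        "runtime_s": elapsed,
        # Guard against a zero-length interval on very small shards.
        "rows_per_s": len(results) / elapsed if elapsed > 0 else 0.0,
    }
    return results, stats
```

Passing `env` explicitly keeps the gate testable without mutating the real process environment.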

Documentation Updates:

  • Expanded the README.md with a new section on benchmarking shard performance, including example commands, sample output, and explanations of key metrics.
  • Added a reference to the DataTrove benchmark in the README.md.

Configuration and Compatibility:

  • Added a new configuration file configs/config_benchmark_datatrove.yaml for running a DataTrove-compatible throughput benchmark, with detailed instructions and settings matching the DataTrove inference benchmark.

These changes give users tools to collect, inspect, and compare runtime and hardware utilization statistics, which makes it easier to analyze performance and to benchmark against reference pipelines such as DataTrove.

Copilot AI review requested due to automatic review settings May 6, 2026 10:58
@qchapp qchapp linked an issue May 6, 2026 that may be closed by this pull request
Copilot AI (Contributor) left a comment

Pull request overview

This PR adds an opt-in benchmarking/stats pipeline to MMIRAGE to record per-shard runtime, throughput, token counts, and GPU utilization, and exposes the results via CLI and documentation.

Changes:

  • Add --stats flag to local run and SLURM submit (and retry flows) to enable stats collection via MMIRAGE_COLLECT_STATS.
  • Record per-shard stats into shard status.json (runtime/throughput, GPU util polling, token counts, model load time) and add a new mmirage stats command to report per-shard + aggregate JSON.
  • Add documentation and a DataTrove-compatible benchmark config (configs/config_benchmark_datatrove.yaml).
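
As an illustration of the per-shard plus aggregate report described above, the following sketch reads `status.json` files and sums them up. It is written under stated assumptions and does not reproduce `collect_bench_stats`: the `summarize_run` name, the one-directory-per-shard layout, the `stats` key, and the `rows`, `runtime_s`, and `gpu_util_mean` fields are all hypothetical.

```python
import json
from pathlib import Path


def summarize_run(run_dir):
    """Collect per-shard stats from <run_dir>/<shard>/status.json files."""
    shards = []
    for status_path in sorted(Path(run_dir).glob("*/status.json")):
        payload = json.loads(status_path.read_text())
        stats = payload.get("stats")  # hypothetical key; absent when stats were off
        if stats:
            shards.append({"shard": status_path.parent.name, **stats})

    total_rows = sum(s.get("rows", 0) for s in shards)
    total_runtime = sum(s.get("runtime_s", 0.0) for s in shards)
    utils = [s["gpu_util_mean"] for s in shards if s.get("gpu_util_mean") is not None]

    aggregate = {
        "shards": len(shards),
        "rows": total_rows,
        "runtime_s": total_runtime,
        # Throughput over summed shard runtime, not wall-clock time.
        "rows_per_s": total_rows / total_runtime if total_runtime > 0 else 0.0,
        "gpu_util_mean": sum(utils) / len(utils) if utils else None,
    }
    return {"shards": shards, "aggregate": aggregate}
```

Printing the result with `json.dumps(summarize_run(path), indent=2)` would give a JSON report in the same spirit as the new command. Shards without a stats payload (for example failed shards, or runs without --stats) are skipped rather than counted as zeros.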

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/mmirage/shard_utils.py | Introduces ShardStats, duration formatting, and a background nvidia-smi poller; persists stats into shard status payloads. |
| src/mmirage/shard_process.py | Enables opt-in GPU polling + token/load-time capture and writes stats on shard success. |
| src/mmirage/core/process/processors/llm/llm_processor.py | Tracks cumulative token counts and measures engine init time; supports forwarding extra engine kwargs. |
| src/mmirage/core/process/processors/llm/config.py | Adds extra_engine_args to allow passing additional SGLang Engine kwargs from YAML. |
| src/mmirage/core/process/mapper.py | Aggregates token counts and model load time across processors for shard-level stats. |
| src/mmirage/cli.py | Adds --stats to relevant commands and introduces a stats subcommand emitting JSON. |
| src/mmirage/cli_utils/status.py | Adds collect_bench_stats() to aggregate shard stats across runs; wires --stats into retry submission. |
| src/mmirage/cli_utils/slurm.py | Plumbs collect_stats through sbatch generation to export MMIRAGE_COLLECT_STATS=1. |
| README.md | Documents benchmarking workflow, metrics, and reference benchmark links. |
| configs/config_benchmark_datatrove.yaml | Adds a DataTrove-compatible throughput benchmark configuration. |
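
To make the "background nvidia-smi poller" idea concrete, here is a rough sketch of one way such a poller could be structured. It is an assumption-laden illustration, not the implementation in src/mmirage/shard_utils.py: `GpuPoller`, `read_gpu_utilization`, the summary keys, and the one-second default interval are hypothetical. The `nvidia-smi --query-gpu=utilization.gpu --format=csv,noheader,nounits` invocation is a standard query that prints one utilization percentage per GPU.

```python
import subprocess
import threading


def read_gpu_utilization():
    """Return the utilization percentage of each visible GPU via nvidia-smi."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True, timeout=5,
    ).stdout
    return [float(line) for line in out.splitlines() if line.strip()]


class GpuPoller:
    """Sample GPU utilization on a background thread while a shard runs."""

    def __init__(self, interval_s=1.0, sampler=read_gpu_utilization):
        self._interval_s = interval_s
        self._sampler = sampler  # injectable so the poller can run without a GPU
        self._samples = []
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def _loop(self):
        while not self._stop.is_set():
            try:
                values = self._sampler()
                if values:
                    # Store the mean across GPUs for this sampling tick.
                    self._samples.append(sum(values) / len(values))
            except Exception:
                pass  # a failed sample should never fail the shard itself
            self._stop.wait(self._interval_s)

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()
        return False

    def summary(self):
        if not self._samples:
            return {"samples": 0, "gpu_util_mean": None, "gpu_util_max": None}
        return {
            "samples": len(self._samples),
            "gpu_util_mean": sum(self._samples) / len(self._samples),
            "gpu_util_max": max(self._samples),
        }
```

A shard runner could wrap its work in `with GpuPoller() as poller:` and store `poller.summary()` alongside the other stats. Using `Event.wait` instead of `time.sleep` lets the thread stop promptly when the shard finishes.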

Comment thread src/mmirage/shard_process.py Outdated
Comment thread src/mmirage/shard_process.py
Comment thread src/mmirage/cli_utils/status.py
Comment thread README.md Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.

Comment thread src/mmirage/shard_utils.py Outdated
Comment thread README.md
Comment thread README.md Outdated
Comment thread src/mmirage/shard_utils.py
Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.

Comment thread src/mmirage/core/process/mapper.py Outdated
Comment thread src/mmirage/shard_process.py
Comment thread configs/config_benchmark_datatrove.yaml Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Comment thread src/mmirage/core/process/mapper.py Outdated
Comment thread configs/config_benchmark_datatrove.yaml Outdated
Comment thread src/mmirage/shard_process.py Outdated
Comment thread src/mmirage/shard_utils.py Outdated
Comment thread src/mmirage/shard_process.py Outdated
Comment thread src/mmirage/shard_utils.py Outdated
@fabnemEPFL fabnemEPFL merged commit 54923f1 into main May 11, 2026
2 checks passed
@fabnemEPFL fabnemEPFL deleted the feature/statistics branch May 11, 2026 13:18
Development

Successfully merging this pull request may close these issues.

Log statistics

3 participants