This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
The MLPerf Storage Benchmark Suite (v2.0.0b1) is a Python framework for benchmarking storage systems that support ML workloads. The suite uses the DLIO (Deep Learning I/O) benchmark as its execution engine.
```bash
# Install for development
pip install -e .

# Install with test dependencies
pip install -e ".[test]"

# Install with full DLIO support for running benchmarks
pip install -e ".[full]"
```

```bash
# Run all unit tests
pytest tests/unit -v

# Run a single test file
pytest tests/unit/test_cli.py -v

# Run tests with coverage
pytest tests/unit -v --cov=mlpstorage --cov-report=xml

# Run integration tests
pytest tests/integration -v
```

The main entry point is `mlpstorage` with nested subcommands:
```bash
# Training benchmarks (unet3d, resnet50, cosmoflow)
mlpstorage training datasize ...    # Calculate required dataset size
mlpstorage training datagen ...     # Generate synthetic data
mlpstorage training run ...         # Execute benchmark
mlpstorage training configview ...  # View final configuration

# Checkpointing benchmarks (llama3-8b, llama3-70b, llama3-405b, llama3-1t)
mlpstorage checkpointing run ...
mlpstorage checkpointing datagen ...
mlpstorage checkpointing validate ...

# Other benchmarks
mlpstorage vectordb run ...  # Vector database (PREVIEW)
mlpstorage kvcache run ...   # KV cache

# Utilities
mlpstorage reports reportgen ...    # Generate submission reports
mlpstorage history list/replay ...  # Command history
```

All benchmarks inherit from the `Benchmark` base class (`mlpstorage/benchmarks/base.py`):
- Subclasses implement the `_run()` method and set the `BENCHMARK_TYPE` class attribute
- Base class handles cluster info collection, result directories, metadata, and signal handling
- Supports dependency injection for cluster collectors and validators (for testing)
Concrete implementations in `mlpstorage/benchmarks/`:

- `TrainingBenchmark`, `CheckpointingBenchmark` - DLIO-based benchmarks
- `VectorDBBenchmark` - Vector database operations
- `KVCacheBenchmark` - LLM KV cache management
`BenchmarkRegistry` (`mlpstorage/registry.py`) dynamically registers benchmarks at import time. Each benchmark registration includes its CLI argument builder, enabling automatic CLI construction.
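The registration pattern can be sketched roughly as follows. This is an illustrative stand-in, not the actual `mlpstorage/registry.py` code; everything except the `BenchmarkRegistry.register()` name is assumed:

```python
# Illustrative sketch of import-time benchmark registration; the real
# BenchmarkRegistry in mlpstorage/registry.py may differ in detail.
class BenchmarkRegistry:
    _benchmarks = {}

    @classmethod
    def register(cls, name, benchmark_cls, cli_builder):
        # Each registration pairs the benchmark class with the function
        # that builds its CLI subcommand arguments.
        cls._benchmarks[name] = (benchmark_cls, cli_builder)

    @classmethod
    def get(cls, name):
        return cls._benchmarks[name]


def build_training_args(parser):
    # Hypothetical CLI argument builder for the training subcommand.
    parser.add_argument("--model")


class TrainingBenchmark:
    BENCHMARK_TYPE = "training"


# At import time (e.g. in mlpstorage/benchmarks/__init__.py):
BenchmarkRegistry.register("training", TrainingBenchmark, build_training_args)
```

Because registration runs at import time, importing the benchmarks package is enough to make every benchmark's subcommand available to the CLI.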
- CLI arguments parsed via `cli_parser.py`
- YAML config templates loaded from `configs/dlio/workload/`
- Parameters merged with precedence: CLI args > YAML config > environment variables
- Dotted-key parameters (e.g., `dataset.num_files_train`) flattened/unflattened for DLIO
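The flatten/unflatten and precedence logic can be sketched like this. The helper names (`flatten`, `unflatten`, `merge`) are illustrative, not the actual mlpstorage functions:

```python
# Illustrative helpers for dotted-key parameter handling; actual
# mlpstorage function names and signatures may differ.
def flatten(d, prefix=""):
    """Flatten {"dataset": {"num_files_train": 400}} into
    {"dataset.num_files_train": 400}."""
    out = {}
    for key, value in d.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten(value, f"{dotted}."))
        else:
            out[dotted] = value
    return out


def unflatten(d):
    """Inverse of flatten: rebuild nested dicts from dotted keys."""
    out = {}
    for dotted, value in d.items():
        node = out
        *parents, leaf = dotted.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return out


def merge(env_vars, yaml_cfg, cli_args):
    # Precedence: CLI args > YAML config > environment variables,
    # applied on the flattened key space so dotted keys collide correctly.
    merged = dict(flatten(env_vars))
    merged.update(flatten(yaml_cfg))
    merged.update(flatten(cli_args))
    return unflatten(merged)
```

Merging on the flattened key space ensures that a CLI override like `dataset.num_files_train` replaces only that leaf, not the whole `dataset` subtree.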
Located in `mlpstorage/rules/`:

- Run Checkers (`run_checkers/`) - Real-time validation during execution
- Submission Checkers (`submission_checkers/`) - Post-run compliance validation
- BenchmarkVerifier (`verifier.py`) - Orchestrates all validation
- Validation states: `CLOSED`, `OPEN`, `INVALID` (defined in `config.py` as the `PARAM_VALIDATION` enum)
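The validation-state enum might look roughly like this; the member values and docstrings are assumptions, only the enum name and member names come from the source:

```python
from enum import Enum


# Sketch of the validation-state enum; the actual PARAM_VALIDATION
# definition in mlpstorage/config.py may use different member values.
class PARAM_VALIDATION(Enum):
    CLOSED = "closed"    # compliant with closed-division rules
    OPEN = "open"        # acceptable only as an open-division submission
    INVALID = "invalid"  # not a valid submission in either division
```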
- `cluster_collector.py` - MPI-based system information collection
- Commands executed via `CommandExecutor` in `utils.py` with live output streaming
- Supports both `mpirun` and `mpiexec` via the `--mpi-bin` flag
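Live output streaming can be sketched with `subprocess` as below. This is a minimal stand-in, not the actual `CommandExecutor` implementation:

```python
import subprocess


# Minimal sketch of live output streaming in the spirit of
# CommandExecutor in mlpstorage/utils.py; names are illustrative.
def run_streaming(cmd):
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
    )
    lines = []
    for line in proc.stdout:  # yields lines as the process produces them
        print(line, end="")   # echo live to the console
        lines.append(line)    # keep a copy for the caller
    proc.wait()
    return proc.returncode, "".join(lines)
```

Reading `proc.stdout` line by line, rather than calling `communicate()`, is what lets long-running `mpirun` jobs report progress as they go.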
| File | Purpose |
|---|---|
| `mlpstorage/main.py` | Entry point with signal/error handling |
| `mlpstorage/benchmarks/base.py` | Abstract benchmark base class |
| `mlpstorage/benchmarks/__init__.py` | Benchmark registry initialization |
| `mlpstorage/config.py` | Constants, enums, model configurations |
| `mlpstorage/rules/models.py` | Data classes for validation pipeline |
| `mlpstorage/utils.py` | Command execution, JSON encoding, config loading |
- Create a benchmark class inheriting from `Benchmark`
- Set the `BENCHMARK_TYPE` class attribute
- Implement the `_run()` method
- Create a CLI argument builder in `mlpstorage/cli/`
- Register in `mlpstorage/benchmarks/__init__.py` via `BenchmarkRegistry.register()`
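The subclassing steps above can be sketched as follows. A stand-in base class is defined here so the sketch is self-contained; the real `Benchmark` base in `mlpstorage/benchmarks/base.py` additionally handles cluster info collection, result directories, metadata, and signals:

```python
from abc import ABC, abstractmethod


# Stand-in for the real Benchmark base class, reduced to the contract
# described above; details of the actual base class will differ.
class Benchmark(ABC):
    BENCHMARK_TYPE = None

    def run(self):
        # The real base class wraps _run() with setup/teardown work.
        return self._run()

    @abstractmethod
    def _run(self):
        ...


class MyBenchmark(Benchmark):
    BENCHMARK_TYPE = "my_benchmark"  # set the class attribute

    def _run(self):
        # Benchmark-specific execution logic goes here; return an
        # exit-code-style result for this sketch.
        return 0
```

With the class in place, the remaining steps are wiring: a CLI argument builder in `mlpstorage/cli/` and a `BenchmarkRegistry.register()` call in `mlpstorage/benchmarks/__init__.py`.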
Tests use pytest with fixtures in `tests/fixtures/`:

- `mock_collector.py` - Mock cluster collector
- `mock_executor.py` - Mock command executor
- `mock_logger.py` - Mock logger
- `sample_data.py` - Sample test data
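A hypothetical test using the dependency-injection seams mentioned earlier might look like this; the mock and benchmark classes here are illustrative stand-ins, not the actual fixtures:

```python
# Hypothetical example of testing via an injected executor; the real
# fixtures in tests/fixtures/ and the Benchmark constructor may differ.
class MockExecutor:
    def __init__(self):
        self.commands = []

    def execute(self, cmd):
        self.commands.append(cmd)  # record the command instead of running it
        return 0, ""


class FakeBenchmark:
    """Stand-in for a Benchmark subclass that accepts an injected executor."""

    def __init__(self, executor):
        self.executor = executor

    def _run(self):
        return self.executor.execute(["mpirun", "dlio_benchmark"])


def test_run_records_command():
    executor = MockExecutor()
    benchmark = FakeBenchmark(executor)
    assert benchmark._run() == (0, "")
    assert executor.commands == [["mpirun", "dlio_benchmark"]]
```

Injecting the executor lets unit tests assert on the exact commands a benchmark would launch without requiring MPI or DLIO to be installed.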
When running the `mlpstorage` CLI for manual testing or integration tests, use:

- Data directory: `/databases/mlps-v3.0/data/`
- Results directory: `/databases/mlps-v3.0/results/`
```bash
# Generate dataset for unet3d with 4 processes
mlpstorage training datagen \
  --model unet3d \
  --num-processes 4 \
  --data-dir /databases/mlps-v3.0/data/ \
  --results-dir /databases/mlps-v3.0/results

# Run training benchmark for unet3d with 2 h100 accelerators
mlpstorage training run \
  --model unet3d \
  --num-accelerators 2 \
  --accelerator-type h100 \
  --client-host-memory-in-gb 64 \
  --data-dir /databases/mlps-v3.0/data/ \
  --results-dir /databases/mlps-v3.0/results
```

Note: These benchmarks require MPI (OpenMPI) to be installed. Install with:

```bash
# Ubuntu/Debian
sudo apt-get install openmpi-bin

# RHEL/CentOS
sudo yum install openmpi
```

From `mlpstorage/config.py`:
- Training models: `cosmoflow`, `resnet50`, `unet3d`
- LLM models (checkpointing): `llama3-8b`, `llama3-70b`, `llama3-405b`, `llama3-1t`
- Accelerators: `h100`, `a100`
- Submission categories: `CLOSED`, `OPEN`