A modern PyTorch implementation of Google's Performance RNN for generating expressive piano performances with dynamics and timing.
- Overview
- Features
- Installation
- Quick Start
- Project Structure
- Usage
- Configuration
- Development
- Citation
This repository contains a PyTorch implementation of Performance RNN, inspired by the work of Ian Simon and Sageev Oore on "Performance RNN: Generating Music with Expressive Timing and Dynamics" (Magenta Blog, 2017).
This implementation was developed as part of the educational activities for the "Artificial Models for Music Creativity" class at Hochschule für Musik und Theater Hamburg (Winter Semester 2023/2024). For more resources, see the class repository.
- Modern PyTorch 2.0+ implementation with full GPU support
- Expressive generation with dynamics and timing control
- Flexible training with configurable batch sizes, early stopping, and checkpointing
- TensorBoard integration for training visualization
- Hyperparameter optimization using Optuna
- Easy-to-use CLI scripts for all operations
- Proper Python packaging with pip installability
- Cross-platform support (Linux, macOS, Windows)
- Apple Silicon support with MPS acceleration
- Python 3.10 or higher
- Miniconda or Anaconda
# 1. Clone the repository
git clone https://github.com/anatrini/performance_rnn_torch.git
cd performance_rnn_torch
# 2. Create conda environment based on your hardware:
# Choose ONE of the following:
# For CPU-only systems:
conda env create -f environment-cpu.yml
# For NVIDIA GPU with CUDA:
conda env create -f environment-cuda.yml
# For Apple Silicon (M1/M2/M3) with MPS:
conda env create -f environment-mps.yml
# 3. Activate the environment
conda activate py_magenta
# 4. Install the package
pip install -e .
That's it! You're ready to use Performance RNN.
If you want to contribute to the project:
# Clone the repository
git clone https://github.com/anatrini/performance_rnn_torch.git
cd performance_rnn_torch
# Create development environment (includes testing, linting, documentation tools)
conda env create -f environment-dev.yml
conda activate py_magenta
# Install the package in development mode
pip install -e ".[dev]"
Note: For GPU development, use environment-cuda.yml or environment-mps.yml instead:
conda env create -f environment-cuda.yml # or environment-mps.yml
conda activate py_magenta
pip install -e ".[dev]"
# The installation test will show the version and detected device
python -c "import performance_rnn_torch as prnn; print(f'Version: {prnn.__version__}'); print(f'Device: {prnn.config.device}')"
Expected output:
Version: 1.0.0
Device: cpu # or 'cuda:0' for NVIDIA GPU, or 'mps' for Apple Silicon
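The device reported above is presumably chosen by checking which backends PyTorch can see. A minimal sketch of such a check follows, assuming a hypothetical helper named detect_device; it is not the package's actual code in config.py, and it falls back to the CPU when PyTorch is not installed so that it runs anywhere:

```python
def detect_device() -> str:
    """Return 'cuda:0', 'mps', or 'cpu' depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is missing; fall back so the sketch still runs.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda:0"
    # torch.backends.mps only exists in newer PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(detect_device())
```

If the printed device is not the one you expected, the conda environment file chosen during installation is the first thing to check.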
Complete workflow from dataset to music generation:
# Activate your conda environment
conda activate py_magenta
# 1. Get MIDI files from Maestro dataset (example: Claude Debussy)
python scripts/prepare_data.py --composer "Claude Debussy"
# 2. Preprocess MIDI files into training data
python scripts/preprocess.py --num_workers -1
# 3. Optimize hyperparameters (optional but recommended, ~1-2 hours)
python scripts/optimization_routine.py --n-trials 20
# 4. Train the model with optimized settings
python scripts/train.py --session models/optimization.sess --num-epochs 50
# 5. Generate new music
python scripts/generate.py --session models/optimization.sess --num-samples 3
Skip optimization? Use default hyperparameters instead:
python scripts/train.py --num-epochs 50
python scripts/generate.py
performance_rnn_torch/              # Project root directory
│
├── performance_rnn_torch/          # Python package (source code)
│   ├── core/                       # Core models and data handling
│   │   ├── model.py                # PerformanceRNN neural network
│   │   ├── sequence.py             # MIDI-to-event conversion
│   │   └── data.py                 # Dataset class
│   ├── training/                   # Training utilities
│   │   ├── trainer.py              # Training loop and helpers
│   │   └── early_stopping.py       # Early stopping callback
│   ├── utils/                      # Utility functions
│   │   ├── paths.py                # Path management
│   │   ├── logger.py               # Logging setup
│   │   └── helpers.py              # Helper functions
│   ├── config.py                   # Global configuration
│   └── __init__.py                 # Package initialization
│
├── scripts/                        # Command-line scripts
│   ├── prepare_data.py             # Extract MIDI from Maestro dataset
│   ├── preprocess.py               # Convert MIDI to training data
│   ├── train.py                    # Train the model
│   ├── generate.py                 # Generate music
│   └── optimization_routine.py     # Hyperparameter tuning
│
├── data/                           # Data directory (gitignored)
│   ├── maestro-v3.0.0/             # Maestro dataset (download separately)
│   ├── midi/                       # Your MIDI files (organized by composer/folder)
│   │   ├── claude_debussy/         # Example: Debussy's pieces
│   │   ├── bach/                   # Example: Bach's pieces
│   │   └── ...
│   ├── processed/                  # Preprocessed training data (mirrors midi/ structure)
│   │   ├── claude_debussy/         # Processed Debussy files
│   │   ├── bach/                   # Processed Bach files
│   │   └── ...
│   └── scripts/                    # Dataset download scripts (.sh)
│
├── models/                         # Trained model checkpoints (gitignored)
├── output/                         # Generated MIDI files (gitignored)
├── logs/                           # Training logs (gitignored)
├── runs/                           # TensorBoard logs (gitignored)
│
├── environment-*.yml               # Conda environment files
├── setup.py                        # Package installation script
├── pyproject.toml                  # Python project metadata
├── requirements.txt                # Python dependencies (pip fallback)
└── README.md                       # This file
Key points:
- performance_rnn_torch/ (inner) = Python package with source code
- scripts/ = Command-line tools you run
- data/ = All your data files (MIDI, preprocessed, datasets)
  - Folder structure is preserved: data/midi/composer/ → data/processed/composer/
- models/ = Trained models
- output/ = Generated music
- .egg-info/ = Build artifact (auto-generated, gitignored)
Option A: Use Maestro Dataset (Recommended)
Download Maestro v3.0.0 and extract to data/maestro-v3.0.0/.
# List available composers
python scripts/prepare_data.py --list
# Extract MIDI files for a composer (auto-saves to data/midi/{composer}/)
python scripts/prepare_data.py --composer "Claude Debussy"
Option B: Use Your Own MIDI Files
Place MIDI files in data/midi/ organized by subdirectories:
data/midi/
├── composer_1/
│   └── piece1.mid
└── composer_2/
    └── piece2.mid
Convert MIDI files to training data (reads from data/midi/, saves to data/processed/):
# Use all CPU cores (recommended)
python scripts/preprocess.py --num_workers -1
# Or use specific number of workers
python scripts/preprocess.py --num_workers 4
Key Parameters:
- --midi_root: Source directory (default: data/midi/)
- --save_dir: Output directory (default: data/processed/)
- --num_workers: Parallel workers (default: 1, use -1 for all cores)
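Two behaviors described above can be sketched in a few lines: treating --num_workers -1 as "use every core", and mirroring the data/midi/ folder layout under data/processed/. The helper names resolve_workers and mirrored_output_path and the .data file suffix are assumptions made for illustration, not the actual internals of scripts/preprocess.py:

```python
import os
from pathlib import Path


def resolve_workers(num_workers: int) -> int:
    """Map -1 to the number of CPU cores; otherwise use at least one worker."""
    if num_workers == -1:
        # os.cpu_count() may return None on unusual platforms.
        return os.cpu_count() or 1
    return max(1, num_workers)


def mirrored_output_path(midi_path: Path, midi_root: Path, save_dir: Path) -> Path:
    """Keep the composer subfolder when mapping a MIDI file to its output file."""
    relative = midi_path.relative_to(midi_root)
    # The ".data" suffix is an assumption made for this sketch only.
    return save_dir / relative.with_suffix(".data")


out = mirrored_output_path(
    Path("data/midi/claude_debussy/piece1.mid"),
    Path("data/midi"),
    Path("data/processed"),
)
print(resolve_workers(4), out.as_posix())
# prints: 4 data/processed/claude_debussy/piece1.data
```

Because the relative path is preserved, files from different composers cannot overwrite each other even when they share a file name.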
Find optimal hyperparameters using Optuna (tests model architecture, batch size, learning rate, etc.):
# Quick optimization (~1-2 hours)
python scripts/optimization_routine.py --n-trials 20
# Thorough optimization (~5-10 hours, best results)
python scripts/optimization_routine.py --n-trials 100
Best parameters are saved to models/optimization.sess for use in training.
Key Parameters:
- -n, --n-trials: Number of optimization trials (default: 20)
- -d, --dataset: Preprocessed data path (default: data/processed/)
- -S, --session: Where to save results (default: models/optimization.sess)
- -L, --enable-logging: Enable TensorBoard logging for each trial
Train using optimized hyperparameters:
# Use optimized settings
python scripts/train.py --session models/optimization.sess --num-epochs 50
# Or use default hyperparameters
python scripts/train.py --num-epochs 50
Monitor training: tensorboard --logdir runs/ (then open http://localhost:6006)
Key Parameters:
- --session, -S: Model checkpoint path (default: models/train.sess)
- --dataset, -d: Preprocessed data path (default: data/processed/)
- --batch-size, -b: Batch size (default: 64)
- --num-epochs, -e: Number of epochs (default: 24)
- --learning-rate, -l: Learning rate (default: 0.001)
- --window-size, -w: Sequence length (default: 200)
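To make the --window-size parameter concrete: a long event sequence is typically cut into fixed-length training windows. The sketch below shows one common way to do this. The function sliding_windows and its stride argument are hypothetical and may not match how the Dataset class in core/data.py actually samples windows:

```python
from typing import List, Sequence


def sliding_windows(events: Sequence[int], window_size: int, stride: int) -> List[List[int]]:
    """Split one event sequence into fixed-length, possibly overlapping windows."""
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    windows = []
    # Stop early enough that every window holds exactly window_size events.
    for start in range(0, len(events) - window_size + 1, stride):
        windows.append(list(events[start:start + window_size]))
    return windows


events = list(range(10))  # stand-in for a tokenized performance
for window in sliding_windows(events, window_size=4, stride=3):
    print(window)
```

With the default window size of 200, any piece shorter than 200 events would yield no windows under this scheme, which is one reason very short MIDI files may contribute nothing to training.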
Generate new piano pieces from your trained model:
# Use the optimized model
python scripts/generate.py --session models/optimization.sess --num-samples 3
# Use default model
python scripts/generate.py
Generated MIDI files are saved to output/.
Key Parameters:
- --session, -S: Trained model path (default: models/train.sess)
- --output, -O: Output directory (default: output/)
- --num-samples, -n: Number of pieces (default: 1)
- --max-len, -l: Sequence length (default: 1000)
- --temperature, -t: Randomness/creativity (default: 1.0, range: 0.1-2.0)
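The effect of --temperature is easiest to see on a toy distribution. In the usual formulation, the model's output scores (logits) are divided by the temperature before the softmax, so low values sharpen the distribution and high values flatten it. The following is a plain-Python sketch of that general technique with hypothetical function names; the repository's own sampling code presumably operates on PyTorch tensors and may differ in detail:

```python
import math
import random
from typing import List, Optional


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_event(logits: List[float], temperature: float,
                 rng: Optional[random.Random] = None) -> int:
    """Draw one event index from the temperature-scaled distribution."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # toy scores for three possible events
for t in (0.1, 1.0, 2.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

At the low end of the range the most likely event is chosen almost every time, which tends to produce repetitive output; at the high end, unlikely events are sampled more often and the result sounds less predictable.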
Default settings are in performance_rnn_torch/config.py. Key configurations:
- Model: hidden_dim (512), gru_layers (3), gru_dropout (0.3)
- Training: batch_size (64), num_epochs (24), window_size (200), learning_rate (0.001)
- Generation: max_len (1000), temperature (1.0)
Environment Variables (optional path customization):
export PERFORMANCE_RNN_DATA_DIR=/path/to/data
export PERFORMANCE_RNN_MODELS_DIR=/path/to/models
export PERFORMANCE_RNN_OUTPUT_DIR=/path/to/output
# Install with development dependencies
pip install -e ".[dev]"
# Code quality
black performance_rnn_torch/ scripts/
flake8 performance_rnn_torch/ scripts/
# Run tests
pytest
# Build docs
cd docs/ && make html
@misc{anatrini2024performancernn,
  author = {Anatrini, Alessandro},
  title = {Performance RNN in PyTorch},
  year = {2024},
  url = {https://github.com/anatrini/performance_rnn_torch}
}
Original work: "Performance RNN: Generating Music with Expressive Timing and Dynamics" by Simon & Oore (2017)
MIT License - Alessandro Anatrini - Hochschule für Musik und Theater Hamburg
For issues or contributions: GitHub Issues
