Performance RNN in PyTorch


A modern PyTorch implementation of Google's Performance RNN for generating expressive piano performances with dynamics and timing.

License: MIT | Python 3.8+ | PyTorch 2.0+

Overview

This repository contains a PyTorch implementation of Performance RNN, inspired by the work of Ian Simon and Sageev Oore on "Performance RNN: Generating Music with Expressive Timing and Dynamics" (Magenta Blog, 2017).

This implementation was developed as part of the educational activities for the "Artificial Models for Music Creativity" class at Hochschule für Musik und Theater Hamburg (Winter Semester 2023/2024). For more resources, see the class repository.

Features

  • ✨ Modern PyTorch 2.0+ implementation with full GPU support
  • 🎹 Expressive generation with dynamics and timing control
  • 🎯 Flexible training with configurable batch sizes, early stopping, and checkpointing
  • 📊 TensorBoard integration for training visualization
  • 🔧 Hyperparameter optimization using Optuna
  • 🏃 Easy-to-use CLI scripts for all operations
  • 📦 Proper Python packaging with pip installability
  • 🌍 Cross-platform support (Linux, macOS, Windows)
  • 🍎 Apple Silicon support with MPS acceleration

Installation

Requirements

Quick Install

# 1. Clone the repository
git clone https://github.com/anatrini/performance_rnn_torch.git
cd performance_rnn_torch

# 2. Create conda environment based on your hardware:
#    Choose ONE of the following:

# For CPU-only systems:
conda env create -f environment-cpu.yml

# For NVIDIA GPU with CUDA:
conda env create -f environment-cuda.yml

# For Apple Silicon (M1/M2/M3) with MPS:
conda env create -f environment-mps.yml

# 3. Activate the environment
conda activate py_magenta

# 4. Install the package
pip install -e .

That's it! You're ready to use Performance RNN.

Development Installation

If you want to contribute to the project:

# Clone the repository
git clone https://github.com/anatrini/performance_rnn_torch.git
cd performance_rnn_torch

# Create development environment (includes testing, linting, documentation tools)
conda env create -f environment-dev.yml
conda activate py_magenta

# Install the package in development mode
pip install -e ".[dev]"

Note: For GPU development, use environment-cuda.yml or environment-mps.yml instead:

conda env create -f environment-cuda.yml  # or environment-mps.yml
conda activate py_magenta
pip install -e ".[dev]"

Verifying Installation

# The installation test will show the version and detected device
python -c "import performance_rnn_torch as prnn; print(f'Version: {prnn.__version__}'); print(f'Device: {prnn.config.device}')"

Expected output:

Version: 1.0.0
Device: cpu  # or 'cuda:0' for NVIDIA GPU, or 'mps' for Apple Silicon

Quick Start

Complete workflow from dataset to music generation:

# Activate your conda environment
conda activate py_magenta

# 1. Get MIDI files from Maestro dataset (example: Claude Debussy)
python scripts/prepare_data.py --composer "Claude Debussy"

# 2. Preprocess MIDI files into training data
python scripts/preprocess.py --num_workers -1

# 3. Optimize hyperparameters (optional but recommended, ~1-2 hours)
python scripts/optimization_routine.py --n-trials 20

# 4. Train the model with optimized settings
python scripts/train.py --session models/optimization.sess --num-epochs 50

# 5. Generate new music
python scripts/generate.py --session models/optimization.sess --num-samples 3

Skip optimization? Use default hyperparameters instead:

python scripts/train.py --num-epochs 50
python scripts/generate.py

Project Structure

performance_rnn_torch/              # Project root directory
│
├── performance_rnn_torch/          # Python package (source code)
│   ├── core/                       # Core models and data handling
│   │   ├── model.py               # PerformanceRNN neural network
│   │   ├── sequence.py            # MIDI-to-event conversion
│   │   └── data.py                # Dataset class
│   ├── training/                   # Training utilities
│   │   ├── trainer.py             # Training loop and helpers
│   │   └── early_stopping.py      # Early stopping callback
│   ├── utils/                      # Utility functions
│   │   ├── paths.py               # Path management
│   │   ├── logger.py              # Logging setup
│   │   └── helpers.py             # Helper functions
│   ├── config.py                   # Global configuration
│   └── __init__.py                 # Package initialization
│
├── scripts/                        # Command-line scripts
│   ├── prepare_data.py            # Extract MIDI from Maestro dataset
│   ├── preprocess.py              # Convert MIDI to training data
│   ├── train.py                   # Train the model
│   ├── generate.py                # Generate music
│   └── optimization_routine.py    # Hyperparameter tuning
│
├── data/                           # Data directory (gitignored)
│   ├── maestro-v3.0.0/            # Maestro dataset (download separately)
│   ├── midi/                      # Your MIDI files (organized by composer/folder)
│   │   ├── claude_debussy/        # Example: Debussy's pieces
│   │   ├── bach/                  # Example: Bach's pieces
│   │   └── ...
│   ├── processed/                 # Preprocessed training data (mirrors midi/ structure)
│   │   ├── claude_debussy/        # Processed Debussy files
│   │   ├── bach/                  # Processed Bach files
│   │   └── ...
│   └── scripts/                   # Dataset download scripts (.sh)
│
├── models/                         # Trained model checkpoints (gitignored)
├── output/                         # Generated MIDI files (gitignored)
├── logs/                           # Training logs (gitignored)
├── runs/                           # TensorBoard logs (gitignored)
│
├── environment-*.yml               # Conda environment files
├── setup.py                        # Package installation script
├── pyproject.toml                  # Python project metadata
├── requirements.txt                # Python dependencies (pip fallback)
└── README.md                       # This file

Key points:

  • performance_rnn_torch/ (inner) = Python package with source code
  • scripts/ = Command-line tools you run
  • data/ = All your data files (MIDI, preprocessed, datasets)
    • Folder structure is preserved: data/midi/composer/ → data/processed/composer/
  • models/ = Trained models
  • output/ = Generated music
  • .egg-info/ = Build artifact (auto-generated, gitignored)
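
The folder mirroring described above can be pictured with a small path helper. This is only a sketch of the idea, not the code used by scripts/preprocess.py: the function name mirrored_output_dir and the file name arabesque.mid are made up for illustration.

```python
from pathlib import PurePosixPath


def mirrored_output_dir(
    midi_file: str,
    midi_root: str = "data/midi",
    save_dir: str = "data/processed",
) -> PurePosixPath:
    """Return the folder under save_dir that mirrors midi_file's folder under midi_root.

    Hypothetical helper: the real preprocessing script may build its paths differently.
    """
    # Path of the file relative to the MIDI root, e.g. claude_debussy/arabesque.mid
    relative = PurePosixPath(midi_file).relative_to(midi_root)
    # Keep the sub-folder, swap the root
    return PurePosixPath(save_dir) / relative.parent


print(mirrored_output_dir("data/midi/claude_debussy/arabesque.mid"))
```

A file that lives outside midi_root makes relative_to raise ValueError, which is a reasonable way to catch a misconfigured source directory early.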

Usage

Data Preparation

Option A: Use Maestro Dataset (Recommended)

Download Maestro v3.0.0 and extract to data/maestro-v3.0.0/.

# List available composers
python scripts/prepare_data.py --list

# Extract MIDI files for a composer (auto-saves to data/midi/{composer}/)
python scripts/prepare_data.py --composer "Claude Debussy"

Option B: Use Your Own MIDI Files

Place MIDI files in data/midi/ organized by subdirectories:

data/midi/
├── composer_1/
│   └── piece1.mid
└── composer_2/
    └── piece2.mid
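
Before preprocessing, it can help to check that your files sit where the layout above expects them. The sketch below groups MIDI files by subdirectory using only the standard library. The helper name collect_midi_files and the accepted suffixes (.mid, .midi) are assumptions, not part of the package.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: files ending in .mid or .midi count as MIDI files
MIDI_SUFFIXES = {".mid", ".midi"}


def collect_midi_files(midi_root):
    """Map each subdirectory of midi_root (as a POSIX-style string) to its MIDI file names."""
    root = Path(midi_root)
    groups = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in MIDI_SUFFIXES:
            # Folder relative to the root, e.g. "composer_1"
            group = path.parent.relative_to(root).as_posix()
            groups[group].append(path.name)
    return dict(groups)


# Example call (uncomment once data/midi/ exists):
# for folder, files in collect_midi_files("data/midi").items():
#     print(folder, len(files))
```

Files placed directly in the root are reported under the key ".", which makes stray, unsorted files easy to spot.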

Preprocessing

Convert MIDI files to training data (reads from data/midi/, saves to data/processed/):

# Use all CPU cores (recommended)
python scripts/preprocess.py --num_workers -1

# Or use specific number of workers
python scripts/preprocess.py --num_workers 4

Key Parameters:

  • --midi_root: Source directory (default: data/midi/)
  • --save_dir: Output directory (default: data/processed/)
  • --num_workers: Parallel workers (default: 1, use -1 for all cores)
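
A plausible reading of the -1 convention is "use every core the machine reports". The sketch below shows that resolution step in isolation. The function name resolve_num_workers is hypothetical, and the actual script may handle edge cases differently.

```python
import os


def resolve_num_workers(num_workers: int) -> int:
    """Turn a --num_workers value into a concrete worker count (-1 means all cores)."""
    if num_workers == -1:
        # os.cpu_count() can return None on unusual platforms, so fall back to 1
        return os.cpu_count() or 1
    if num_workers < 1:
        raise ValueError(f"num_workers must be -1 or a positive integer, got {num_workers}")
    return num_workers


print(resolve_num_workers(4))   # a fixed worker count is passed through unchanged
print(resolve_num_workers(-1))  # number of cores reported by the OS
```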

Hyperparameter Optimization (Recommended)

Find optimal hyperparameters using Optuna (tests model architecture, batch size, learning rate, etc.):

# Quick optimization (~1-2 hours)
python scripts/optimization_routine.py --n-trials 20

# Thorough optimization (~5-10 hours, best results)
python scripts/optimization_routine.py --n-trials 100

Best parameters are saved to models/optimization.sess for use in training.

Key Parameters:

  • -n, --n-trials: Number of optimization trials (default: 20)
  • -d, --dataset: Preprocessed data path (default: data/processed/)
  • -S, --session: Where to save results (default: models/optimization.sess)
  • -L, --enable-logging: Enable TensorBoard logging for each trial

Training

Train with the optimized hyperparameters, or fall back to the defaults:

# Use optimized settings
python scripts/train.py --session models/optimization.sess --num-epochs 50

# Or use default hyperparameters
python scripts/train.py --num-epochs 50

Monitor training: tensorboard --logdir runs/ (open http://localhost:6006)

Key Parameters:

  • --session, -S: Model checkpoint path (default: models/train.sess)
  • --dataset, -d: Preprocessed data path (default: data/processed/)
  • --batch-size, -b: Batch size (default: 64)
  • --num-epochs, -e: Number of epochs (default: 24)
  • --learning-rate, -l: Learning rate (default: 0.001)
  • --window-size, -w: Sequence length (default: 200)
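
To make the --window-size parameter concrete, the sketch below cuts a long event sequence into fixed-length training windows. It illustrates the general technique only. The stride argument and the name iter_windows are assumptions, and the repository's dataset class may sample windows differently.

```python
from typing import Iterator, List, Sequence


def iter_windows(
    events: Sequence[int],
    window_size: int = 200,
    stride: int = 10,  # hypothetical: step between consecutive window starts
) -> Iterator[List[int]]:
    """Yield every full window of window_size events, advancing by stride each time."""
    if window_size < 1 or stride < 1:
        raise ValueError("window_size and stride must be positive")
    # Sequences shorter than window_size produce no windows at all
    for start in range(0, len(events) - window_size + 1, stride):
        yield list(events[start:start + window_size])


windows = list(iter_windows(list(range(10)), window_size=4, stride=3))
print(windows)  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
```

Under this scheme a larger window gives the model more context per example but yields fewer windows from the same piece.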

Generation

Generate new piano pieces from your trained model:

# Use the optimized model
python scripts/generate.py --session models/optimization.sess --num-samples 3

# Use default model
python scripts/generate.py

Generated MIDI files are saved to output/.

Key Parameters:

  • --session, -S: Trained model path (default: models/train.sess)
  • --output, -O: Output directory (default: output/)
  • --num-samples, -n: Number of pieces (default: 1)
  • --max-len, -l: Sequence length (default: 1000)
  • --temperature, -t: Randomness/creativity (default: 1.0, range: 0.1-2.0)
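
Temperature is commonly applied by dividing the model's output logits by it before the softmax: values below 1.0 sharpen the distribution (safer, more repetitive output) and values above 1.0 flatten it (more surprising output). The sketch below shows that standard formulation in plain Python. It assumes generate.py follows the usual convention, and the logits are invented for the demonstration.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Convert logits to probabilities after scaling them by 1 / temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate events
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

In the printed rows, the probability of the highest-scoring event shrinks as the temperature rises.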

Configuration

Default settings are in performance_rnn_torch/config.py. Key configurations:

  • Model: hidden_dim (512), gru_layers (3), gru_dropout (0.3)
  • Training: batch_size (64), num_epochs (24), window_size (200), learning_rate (0.001)
  • Generation: max_len (1000), temperature (1.0)

Environment Variables (optional path customization):

export PERFORMANCE_RNN_DATA_DIR=/path/to/data
export PERFORMANCE_RNN_MODELS_DIR=/path/to/models
export PERFORMANCE_RNN_OUTPUT_DIR=/path/to/output
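
An override of this kind is typically resolved by reading the variable and falling back to a project-relative default. The sketch below shows the pattern. The helper resolve_dir is hypothetical, and the defaults data, models, and output are taken from the project layout rather than from the package's actual path code.

```python
import os
from pathlib import Path


def resolve_dir(env_var: str, default: str) -> Path:
    """Return the directory named by env_var, or default when the variable is unset."""
    return Path(os.environ.get(env_var, default))


data_dir = resolve_dir("PERFORMANCE_RNN_DATA_DIR", "data")
models_dir = resolve_dir("PERFORMANCE_RNN_MODELS_DIR", "models")
output_dir = resolve_dir("PERFORMANCE_RNN_OUTPUT_DIR", "output")
print(data_dir, models_dir, output_dir)
```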

Development

# Install with development dependencies
pip install -e ".[dev]"

# Code quality
black performance_rnn_torch/ scripts/
flake8 performance_rnn_torch/ scripts/

# Run tests
pytest

# Build docs
cd docs/ && make html

Citation

@misc{anatrini2024performancernn,
  author = {Anatrini, Alessandro},
  title = {Performance RNN in PyTorch},
  year = {2024},
  url = {https://github.com/anatrini/performance_rnn_torch}
}

Original work: Performance RNN: Generating Music with Expressive Timing and Dynamics by Simon & Oore (2017)

License & Contact

MIT License - Alessandro Anatrini - Hochschule für Musik und Theater Hamburg

For issues or contributions: GitHub Issues

