This repository contains the code and materials for a tutorial on Joint Embedding Architectures for Music Self-Supervised Learning presented at the 26th International Society for Music Information Retrieval Conference (ISMIR 2025). The tutorial focuses on contrastive learning approaches for learning music representations from audio data.
This tutorial explores state-of-the-art self-supervised learning techniques specifically designed for music audio data. We cover joint embedding architectures that learn meaningful representations by contrasting different views of the same musical content, enabling powerful music understanding without requiring large amounts of labeled data.
- Contrastive Learning for Music: Learning representations by contrasting positive and negative pairs
- Multi-View Augmentation: Creating different views of the same musical content
- Joint Embedding Architectures: Building encoders that map audio to shared embedding spaces
- Music-Specific Data Processing: Mel spectrograms, audio augmentation, and temporal modeling
- Evaluation and Visualization: Assessing learned representations and understanding what the model learns
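To make the first topic concrete, here is a minimal sketch of an NT-Xent (InfoNCE) contrastive loss of the kind used in SimCLR-style joint embedding training. The function name, temperature value, and batch size are illustrative assumptions, not taken from this repository's `src/utils/losses.py`:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z_a: torch.Tensor, z_b: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """z_a, z_b: (batch, dim) embeddings of two views of the same clips."""
    batch = z_a.shape[0]
    z = F.normalize(torch.cat([z_a, z_b], dim=0), dim=1)   # (2B, D), unit norm
    sim = z @ z.T / temperature                            # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))                      # a sample cannot match itself
    # The positive for sample i is the other view of the same clip: index i + B (or i - B).
    targets = torch.cat([torch.arange(batch) + batch, torch.arange(batch)])
    return F.cross_entropy(sim, targets)

# Example: embeddings of two augmented "views" of a batch of 4 clips.
z1, z2 = torch.randn(4, 128), torch.randn(4, 128)
loss = nt_xent_loss(z1, z2)
```

Intuitively, each clip's two views are pulled together (the positive pair) while all other clips in the batch act as negatives.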
1. Clone the repository:

   ```bash
   git clone https://github.com/your-username/Music-SSL-ISMIR.git
   cd Music-SSL-ISMIR
   ```

2. Run the setup script:

   ```bash
   bash setup.sh
   ```

   This will:

   - Create a conda environment named `ismir_ssl_2025`
   - Install all required dependencies
   - Download the GiantSteps dataset
   - Set up the project structure

3. Activate the environment:

   ```bash
   conda activate ismir_ssl_2025
   ```
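The "multi-view augmentation" topic above can be sketched in a few lines: two views of the same clip are produced by independent random crops and gain perturbations, forming a positive pair for the contrastive loss. This is a torch-only illustration with hypothetical names and parameter ranges; the repository's actual pipeline (in `src/data/`) may differ:

```python
import torch

def make_view(wave: torch.Tensor, crop_len: int) -> torch.Tensor:
    """Return one randomly cropped, gain-perturbed view of `wave`."""
    start = int(torch.randint(0, wave.shape[-1] - crop_len + 1, (1,)))
    crop = wave[..., start:start + crop_len]
    gain_db = torch.empty(1).uniform_(-6.0, 0.0)   # random attenuation in dB
    return crop * (10.0 ** (gain_db / 20.0))       # convert dB to amplitude

wave = torch.randn(1, 4 * 22050)                   # ~4 s of fake audio at 22.05 kHz
view_a = make_view(wave, crop_len=22050)           # two random crops of the same
view_b = make_view(wave, crop_len=22050)           # clip form a positive pair
```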
```
Music-SSL-ISMIR/
├── src/
│   ├── models/
│   │   ├── backbones.py          # MLP and other backbone architectures
│   │   └── training_wrappers.py  # Training loops and model wrappers
│   ├── data/
│   │   ├── dataset.py            # Dataset classes and data loaders
│   │   └── collate.py            # Custom collate functions for batching
│   ├── utils/
│   │   ├── losses.py             # Contrastive loss functions
│   │   ├── viz.py                # Visualization utilities
│   │   └── utils.py              # General utility functions
│   └── train.py                  # Main training script
├── scripts/
│   └── download_giantsteps.py    # Dataset download script
├── data/
│   └── giantsteps/               # Downloaded dataset (created after setup)
├── requirements.txt              # Python dependencies
├── setup.sh                      # Automated setup script
└── README.md                     # This file
```
After completing this tutorial, you will understand the key topics listed above.
We welcome contributions! Please feel free to:
- Report bugs or issues
- Suggest improvements
- Submit pull requests
- Share your results and experiments
If you use this code in your research, please cite:
```bibtex
@misc{music-ssl-ismir2025,
  title={Self-Supervised Learning for Music: An Overview and New Horizons},
  author={Guinot, Julien and Riou, Alain and Pasini, Marco and Kong, Yuexuan and Meseguer-Brocal, Gabriel and Lattner, Stefan},
  year={2025},
  howpublished={Tutorial at ISMIR 2025},
  url={https://github.com/your-username/Music-SSL-ISMIR}
}
```

This project is licensed under the MIT License; see the LICENSE file for details.
- The ISMIR community for fostering music information retrieval research
- The authors of the original contrastive learning papers
- The torchaudio and librosa teams for excellent audio processing libraries
- The GiantSteps dataset creators for providing high-quality music data
Happy Learning! 🎵🎤
For questions or support, please open an issue or contact the tutorial organizers.