This is the implementation of our paper *AI Harmonizer: Expanding Vocal Expression with a Generative Neurosymbolic Music AI System*. It is based on the amazing [RVC-Project/Retrieval-based-Voice-Conversion-WebUI](https://github.com/RVC-Project/Retrieval-based-Voice-Conversion-WebUI) project.
> [!CAUTION]
> By default, this repository uses an Anticipatory Music Transformer (AMT) finetuned on the JSB Chorales dataset; the finetuned model is available at https://huggingface.co/mitmedialab/jsbChorales-1000. As such, it is heavily biased towards baroque music. If you would like to explore other genres, we invite you to finetune AMT on another four-part harmony dataset.
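The finetuned checkpoint can presumably be loaded the same way as the upstream Anticipatory Music Transformer releases, which are standard Hugging Face causal-LM checkpoints; the snippet below is a minimal sketch under that assumption (check the model card if loading fails).

```python
# Sketch: load the finetuned AMT checkpoint as a starting point for further
# finetuning on another four-part harmony dataset. Assumes the Hugging Face
# repo is a standard causal-LM checkpoint, like the upstream AMT releases.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("mitmedialab/jsbChorales-1000")
model.train()  # switch to training mode before running your own finetuning loop
```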
- Make sure that you clone this repository along with its submodules:

  ```bash
  git clone --recurse-submodules https://github.com/mitmedialab/ai-harmonizer-nime2025.git
  ```

- Install voice models following the instructions of the RVC project.

- Run the `run.sh` script:

  ```bash
  ./run.sh
  ```

- In the Gradio interface that opens up, select your voice model, load an audio file, and click "Convert!"
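If you would rather script conversions than click through the browser UI, the running Gradio app can typically also be driven with `gradio_client`. A minimal sketch, assuming the default local address printed by `run.sh`:

```python
# Sketch: connect to the locally running Gradio app from Python.
# The URL is an assumption -- use whatever address run.sh prints on startup.
from gradio_client import Client

client = Client("http://127.0.0.1:7860")
client.view_api()  # lists the app's actual endpoint names and parameters
```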
If you use this project in your research, please cite our paper:
```bibtex
@inproceedings{nime2025_84,
  title = {AI Harmonizer: Expanding Vocal Expression with a Generative Neurosymbolic Music AI System},
  author = {Lancelot Blanchard and Cameron Holt and Joseph Paradiso},
  booktitle = {Proceedings of the International Conference on New Interfaces for Musical Expression},
  address = {Canberra, Australia},
  articleno = {84},
  doi = {10.5281/zenodo.15698966},
  editor = {Doga Cavdir and Florent Berthaut},
  issn = {2220-4806},
  month = {June},
  numpages = {4},
  pages = {578--581},
  track = {Paper},
  url = {http://nime.org/proceedings/2025/nime2025_84.pdf},
  year = {2025}
}
```