Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units
DiscoPhon is a multilingual benchmark evaluating unsupervised phoneme discovery from discrete speech units. Given only 10 hours of speech in an unseen language, models must produce discrete units that map to a predefined phoneme inventory.
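As a rough illustration of what "units that map to a phoneme inventory" can mean, the sketch below assigns each discrete unit to the phoneme it most often co-occurs with and scores the result at the frame level. This is an assumption made for illustration only: it does not use the `discophon` package, the function names `majority_mapping` and `frame_accuracy` are hypothetical, the data is a toy example, and the benchmark's actual metrics may differ (see the tutorials).

```python
from collections import Counter, defaultdict


def majority_mapping(units, phonemes):
    """Map each discrete unit to the phoneme it co-occurs with most often.

    Hypothetical helper for illustration; not part of the discophon package.
    """
    counts = defaultdict(Counter)
    for unit, phoneme in zip(units, phonemes):
        counts[unit][phoneme] += 1
    return {unit: c.most_common(1)[0][0] for unit, c in counts.items()}


def frame_accuracy(units, phonemes, mapping):
    """Fraction of frames whose mapped unit matches the reference phoneme."""
    hits = sum(mapping.get(unit) == phoneme for unit, phoneme in zip(units, phonemes))
    return hits / len(phonemes)


# Toy frame-level alignment: one discrete unit and one reference phoneme per frame.
units = [3, 3, 7, 7, 7, 1, 1, 3]
phonemes = ["a", "a", "t", "t", "d", "i", "i", "a"]

mapping = majority_mapping(units, phonemes)
accuracy = frame_accuracy(units, phonemes, mapping)
print(mapping)
print(accuracy)
```

A many-to-one mapping like this lets several units share a phoneme, so a model with more units than phonemes is not penalized for over-segmenting the inventory.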
- Install this package:
    pip install discophon
- Follow the tutorials to download data, evaluate models, and prepare your submission.
- Check the current leaderboard.
To cite this work:
@misc{poli2026discophon,
  title={{DiscoPhon}: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units},
  author={Maxime Poli and Manel Khentout and Angelo Ortiz Tandazo and Ewan Dunbar and Emmanuel Chemla and Emmanuel Dupoux},
  year={2026},
  eprint={2603.18612},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.18612},
}