GitHub - csalt-research/accented-codebooks-asr

Accented Speech Recognition With Accent-specific Codebooks

Empirical Methods in Natural Language Processing (EMNLP) 2023

About The Repository

This repository hosts the artefacts pertaining to our paper Accented Speech Recognition With Accent-specific Codebooks accepted to the main conference of EMNLP 2023.

The main contributions of our paper are as follows:

🔎 A new accent adaptation technique that uses a set of learnable codebooks and a new beam-search decoding algorithm to achieve significant performance improvement on both seen and unseen accents.

✅ Reproducible splits of the Common Voice dataset for the accented ASR setup, to facilitate fair comparisons across existing and new accent adaptation techniques.

Getting Started

The repository contains two folders:

data 📁 - Contains the train, dev and test splits used for all our experiments, along with the scripts used to generate those splits. More details can be found here.
espnet_code 📁 - Contains code to run our experiments with the ESPnet toolkit. Detailed instructions on how to run our experiments can be found here.

Prerequisites and Installation

ESPnet installation: Follow the instructions here.
Clone the repository containing our code and dataset.

git clone https://github.com/csalt-research/accented-codebooks-asr.git

To run the dataset creation script, also install its dependencies:

pip install -r accented-codebooks-asr/data/requirements.txt

Training

Extract the CSVs from the tar file in the data folder

tar -xvzf accented-codebooks-asr/data/dataset.tar.gz

Copy the files from espnet_code into the ESPnet egs directory for Common Voice

cp -r accented-codebooks-asr/espnet_code/* <espnet_root_folder>/egs/commonvoice/asr1

Enter the path to the directory hosting our splits in run.sh

csvdir=  # Path to the directory hosting all our csvs.

Run the script

./run.sh
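The training steps above can be collected into a small helper script. This is only a sketch under stated assumptions, not part of the repository: the function names `set_csvdir` and `prepare_recipe` are hypothetical, the CSV directory is a location you choose yourself, and run.sh is assumed to contain a line starting with `csvdir=` as shown above. Depending on how the archive is laid out, `csvdir` may need to point at a subdirectory of the extraction target instead.

```shell
#!/usr/bin/env bash

# Point the "csvdir=" line of a run.sh-style script at the given directory.
# Assumption: the path contains no "|", "&" or "\" characters, which are
# special in the sed replacement text.
set_csvdir() {
  local script="$1" dir="$2" tmp
  tmp="$(mktemp)" || return 1
  sed "s|^csvdir=.*|csvdir=${dir}|" "$script" > "$tmp" || return 1
  cat "$tmp" > "$script"   # overwrite in place so run.sh keeps its file mode
  rm -f "$tmp"
}

# Hypothetical driver for the steps above.
#   $1: path to the cloned accented-codebooks-asr repository
#   $2: ESPnet root folder
#   $3: directory that should receive the extracted CSVs
prepare_recipe() {
  local repo="$1" espnet_root="$2" csv_dir="$3"
  local recipe="${espnet_root}/egs/commonvoice/asr1"
  mkdir -p "$csv_dir" &&
    tar -xzf "${repo}/data/dataset.tar.gz" -C "$csv_dir" &&
    cp -r "${repo}/espnet_code/"* "$recipe" &&
    set_csvdir "${recipe}/run.sh" "$csv_dir"
}
```

A possible invocation would be `prepare_recipe ./accented-codebooks-asr <espnet_root_folder> "$PWD/csvs"`, followed by running `./run.sh` from `<espnet_root_folder>/egs/commonvoice/asr1`. The helper deliberately stops before launching training so the edited run.sh can be inspected first.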

Dataset Statistics

The statistics of train, dev and test splits used in our experiments are as follows:

Accent	Train 100h (in hours)	Train (in hours)	Dev (in hours)	Test (in hours)
Australia	6.95	45.36	4.33	0.46
Canada	6.79	41.13	1.16	1.21
England	19.51	119.9	3.22	1.65
Scotland	2.69	16.21	0.23	0.16
US	64.12	400.1	8.32	4.87
Africa	-	-	-	1.71
Hongkong	-	-	-	0.52
India	-	-	-	0.58
Ireland	-	-	-	1.94
Malaysia	-	-	-	0.39
Newzealand	-	-	-	2.11
Philippines	-	-	-	0.90
Singapore	-	-	-	0.64
Wales	-	-	-	0.27
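As a quick sanity check, the per-split totals implied by the table can be recomputed with a few lines of awk. The sketch below assumes that "-" marks accents with no training or dev data (the unseen accents, which appear only in the test split); the function names `stats_table` and `summarize_hours` are hypothetical, and the numbers are copied verbatim from the table.

```shell
#!/usr/bin/env bash

# Table rows: accent, train-100h, train, dev, test (all in hours).
stats_table() {
  cat <<'EOF'
Australia 6.95 45.36 4.33 0.46
Canada 6.79 41.13 1.16 1.21
England 19.51 119.9 3.22 1.65
Scotland 2.69 16.21 0.23 0.16
US 64.12 400.1 8.32 4.87
Africa - - - 1.71
Hongkong - - - 0.52
India - - - 0.58
Ireland - - - 1.94
Malaysia - - - 0.39
Newzealand - - - 2.11
Philippines - - - 0.90
Singapore - - - 0.64
Wales - - - 0.27
EOF
}

# Sum each column, keeping test hours for seen and unseen accents apart.
summarize_hours() {
  awk '
    $3 != "-" { t100 += $2; train += $3; dev += $4; seen += $5; next }
              { unseen += $5 }
    END {
      printf "train100=%.2f train=%.2f dev=%.2f test_seen=%.2f test_unseen=%.2f\n",
             t100, train, dev, seen, unseen
    }'
}

stats_table | summarize_hours
# prints: train100=100.06 train=622.70 dev=17.26 test_seen=8.35 test_unseen=9.06
```

The 100.06-hour total for the first column matches the "Train 100h" label, and the test split divides into roughly 8.35 hours of seen accents and 9.06 hours of unseen accents.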

Roadmap

See the open issues for a list of proposed features (and known issues) relevant to this work. For ESPnet-related features or issues, check out their GitHub repository.

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

If you have suggestions for adding or removing projects, feel free to open an issue to discuss it, or directly create a pull request after you edit the README.md file with necessary changes.
Please open an individual PR for each suggestion.

Creating A Pull Request

Fork the Project
Create your Feature Branch (git checkout -b feature/NewFeature)
Commit your Changes (git commit -m 'Add appropriate commit message'). The correct way to write your commit message can be found here
Push to the Branch (git push origin feature/NewFeature)
Open a Pull Request

Authors

Darshan Prabhu - M.Tech, CSE, IIT Bombay
Preethi Jyothi - Associate Professor, CSE, IIT Bombay
Sriram Ganapathy - Associate Professor, EE, IISc Bangalore
Vinit Unni - Ph.D, CSE, IIT Bombay

Citation

If you use this code for your research, please consider citing our work.

@misc{prabhu2023accented,
      title={Accented Speech Recognition With Accent-specific Codebooks}, 
      author={Darshan Prabhu and Preethi Jyothi and Sriram Ganapathy and Vinit Unni},
      year={2023},
      eprint={2310.15970},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

Distributed under the MIT License. See LICENSE for more information.
