We believe existing ASR training pipelines are too complex, so we designed an ASR training platform that is as lightweight as possible.
With LASR, you can train your ASR model by providing only a wav list and a text list.
All training details can be modified in `config.yaml`.
You can also train your own torch model by wrapping it with the LASR interface.
LASR is based on Python and PyTorch, and we recommend the following configuration (a one-line pip install for the extra packages follows the list):
- Python 3.7+
- PyTorch 1.8.1+
- torchaudio
- editdistance (to evaluate the ASR results; install with pip)
- soundfile (to read raw wav or flac files; install with pip)
- librosa (to read raw wav or flac files; install with pip)
- jiwer (to evaluate the ASR results in Python)
- pytorch-lightning (we recommend using Lightning to train the ASR model, but you can also train it yourself)
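All of these packages are available on PyPI under the names below, so they can be installed in one command; for `torch` and `torchaudio` you may prefer the official instructions from pytorch.org instead:

```bash
pip install torch torchaudio editdistance soundfile librosa jiwer pytorch-lightning
```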
If all recommended configurations are satisfied, LASR can be used directly by adding it to the `PYTHONPATH` environment variable:

```bash
export PYTHONPATH=/path/to/lasrfolder/:$PYTHONPATH
```

Alternatively, install it from source:

```bash
git clone https://github.com/gaochangfeng/lighting-asr.git
cd lighting-asr
python setup.py install
```

Or install it directly with pip:

```bash
pip install git+https://github.com/gaochangfeng/lighting-asr.git
```
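After installation, you can check that the package resolves (this assumes the top-level package is named `lasr`, as in the usage example further below):

```bash
python -c "import lasr; print(lasr.__file__)"
```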
We use Kaldi-style scp files as input; both `wav.scp` and `text` are required (fields are separated by a space).
`wav.scp`:

```
wav_id_1 /path/to/audio1.wav
wav_id_2 /path/to/audio2.wav
```

`text`:

```
wav_id_1 HELLO
wav_id_2 A NICE DAY
```
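For reference, here is a minimal sketch of how such files can be parsed, assuming each line holds an utterance ID followed by its content; this is purely illustrative, since LASR reads these files internally:

```python
def read_scp(path):
    """Parse a Kaldi-style scp file into an {utterance_id: content} dict."""
    entries = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            utt_id, content = line.split(maxsplit=1)
            entries[utt_id] = content
    return entries

wavs = read_scp("wav.scp")  # {"wav_id_1": "/path/to/audio1.wav", ...}
texts = read_scp("text")    # {"wav_id_1": "HELLO", ...}
```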
In LASR, the model, optimizer, criterion, tokenizer, and even the training data are dynamically imported according to `config.yaml`; in other words, you can import any Python API using the format below:
```yaml
model_config:
    name: torch.nn:Linear
    kwargs:
        in_features: 10
        out_features: 20
```
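Under the hood, a `name` of the form `module.path:ClassName` can be resolved with `importlib`; the helper below is a sketch of that mechanism, not LASR's actual loader:

```python
import importlib

def dynamic_import(name, **kwargs):
    """Instantiate a class given a 'module.path:ClassName' string."""
    module_name, class_name = name.split(":")
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**kwargs)

# Equivalent to the model_config example above:
layer = dynamic_import("torch.nn:Linear", in_features=10, out_features=20)
```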
For LASR training, you need to provide `model_config`, `opti_config`, `criterion_config`, `tokenizer_config`, `train_data_config`, and `valid_data_config`. We also provide some APIs for direct use. If you want to use your own API, we suggest using ours as the interface (especially for the tokenizer and model). See the example for details.
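Putting those sections together, a `config.yaml` skeleton might look like the following; the `name`/`kwargs` entries are placeholders for your own choices, not defaults shipped with LASR:

```yaml
model_config:
    name: your.module:YourModel        # any importable class, module:Class form
    kwargs: { ... }
opti_config:
    name: torch.optim:Adam
    kwargs: { ... }
criterion_config:
    name: torch.nn:CTCLoss
    kwargs: { ... }
tokenizer_config:
    name: your.module:YourTokenizer
    kwargs: { ... }
train_data_config:
    name: your.module:YourDataset
    kwargs: { ... }
valid_data_config:
    name: your.module:YourDataset
    kwargs: { ... }
```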
You can use `bin/train_lighting.py` for training.
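A typical invocation would pass the config file, roughly as below; the flag names here are assumptions, so check the script's own help for the real interface:

```bash
python bin/train_lighting.py --config config.yaml  # hypothetical flags; see --help
```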
For decoding, you can define the relevant parameters in `decode.yaml`.
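The keys below are hypothetical examples of what a decoding config typically holds (e.g. a beam width); consult the shipped `decode.yaml` for the real parameter names:

```yaml
# Hypothetical decode.yaml keys, for illustration only.
beam_size: 10
ctc_weight: 0.3
```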
You can use `bin/decode_lighting.py` to evaluate your model on a test dataset. If you only want to recognize a few audio files, you can use `lasr.process.asrprocess.ASRProcess` as follows:
```python
from lasr.process.asrprocess import ASRProcess

train_config = "/path/to/train_config.yaml"
decode_config = "/path/to/decode_config.yaml"
model_path = "/path/to/model.ckpt"

asrpipeline = ASRProcess(
    train_config=train_config,
    decode_config=decode_config,
    model_path=model_path,
)
token, text = asrpipeline("test.wav")
print(token)
print(text)
```
- ASR models trained with AISHELL-1/AISHELL-2/CommonVoice v12.0-zh
- ASR models trained with LibriSpeech/Switchboard/Common Voice