Minimal example on how to use pre-trained ASR models for audio transcriptions on a laptop (without GPUs) #3553

okuchaiev · 2022-01-28T19:25:47Z

okuchaiev
Jan 28, 2022
Collaborator

NeMo is a toolkit for training and fine-tuning conversational AI models. For quick prototyping, however, many of the pre-trained NeMo models can be run directly from NeMo on CPUs. For real production deployment of NeMo ASR models, we recommend NVIDIA Riva.

The steps below work on an (Intel) MacBook without an NVIDIA GPU.
In your terminal, first install Anaconda, then run the following commands to install NeMo and its dependencies:

conda create -n cputest python=3.8
conda activate cputest
pip install "nemo_toolkit[all]==1.6.2"

Get a sample audio file. You can use your own; just make sure it is mono and sampled at 16 kHz.

wget https://dldata-public.s3.us-east-2.amazonaws.com/2086-149220-0033.wav
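
If you use your own recording, it may help to confirm the format before transcribing. The sketch below is not part of NeMo: it relies only on Python's standard `wave` module, and the helper name `check_wav` is made up for illustration. It assumes an uncompressed PCM WAV file; other formats would need a different reader.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return (ok, channels, rate) for a PCM WAV file.

    check_wav is a hypothetical helper, not a NeMo function.
    ok is True only when the file is mono and sampled at 16 kHz
    (or whatever expected values are passed in).
    """
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        rate = wav_file.getframerate()
    ok = channels == expected_channels and rate == expected_rate
    return ok, channels, rate


# Example usage (uncomment once the sample file has been downloaded):
# ok, channels, rate = check_wav("2086-149220-0033.wav")
# print(ok, channels, rate)
```

If `ok` comes back `False`, convert the file to mono, 16 kHz with your audio tool of choice before passing it to the model.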

Start a Python shell and do:

import nemo.collections.asr as nemo_asr
# This will initiate pre-trained model download from NGC
asr_model = nemo_asr.models.EncDecCTCModelBPE.from_pretrained("stt_en_conformer_ctc_large")
transcriptions = asr_model.transcribe(['2086-149220-0033.wav'])
print(transcriptions)
["well i don't wish to see it any more observed phoebe turning away her eyes it is certainly very like the old portrait"]

# Now, let's add punctuation
import nemo.collections.nlp as nemo_nlp
# This will also trigger a model download
punctuation = nemo_nlp.models.PunctuationCapitalizationModel.from_pretrained(model_name='punctuation_en_distilbert')
res = punctuation.add_punctuation_capitalization(queries=transcriptions)
print(res)
["Well, I don't wish to see it any more, observed Phoebe, turning away her eyes. It is certainly very like the old portrait."]

TIP: To see a list of available pre-trained ASR models, run the following in the same Python shell:

nemo_asr.models.EncDecCTCModelBPE.list_available_models()

burgil · 2022-01-29T05:33:16Z

burgil
Jan 29, 2022

This is dope


burgil · 2022-01-29T08:29:31Z

burgil
Jan 29, 2022

I had to comment again, to say thank you for this awesome snippet!

