Fine-tuning whisper-small for Swahili speech-to-text
photo credits: Devonyu
This project fine-tunes the small version of Whisper, a general-purpose speech recognition model created by OpenAI, to convert Swahili audio to text. Whisper is pretrained for automatic speech recognition (ASR) and speech translation on 680k hours of labelled data.
- GitHub: @JM_Rono
- LinkedIn: @John Michael Rono
A Data
B Machine learning
C Deploying
The data consists of about 82K instances of Swahili audio from Mozilla Common Voice. I got the dataset by participating in a Zindi competition. Join the completed competition to get access to the data.
- Check out the notebook: @notebook
The model achieved a word error rate (WER) of 8.365, as logged on wandb.
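
For readers unfamiliar with the metric, here is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, insertions, deletions) between the reference transcript and the model output, divided by the number of reference words. This is not the notebook's code, which may rely on a library instead; the function name `word_error_rate` and the example sentences are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (not from the project notebook). Splits on
    whitespace and applies no text normalisation.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # One substituted word out of three reference words.
    print(word_error_rate("habari ya asubuhi", "habari za asubuhi"))
```

Scores are often reported multiplied by 100; in practice, casing and punctuation are usually normalised before scoring, which this sketch leaves out.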
- Deployed at: https://swahilispeechtotext-htxcv6sjkjcazseovpfrkk.streamlit.app/
- Upload an audio clip of 30 seconds or less
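
Whisper processes audio in 30-second windows, which is the likely reason for the upload limit. If you want to check a clip before uploading, the sketch below reads its duration with Python's standard `wave` module. It assumes an uncompressed WAV file; the helper names and the `MAX_SECONDS` constant are hypothetical and are not part of the deployed app, which may accept other formats.

```python
import wave

# Assumed limit, taken from the upload note above.
MAX_SECONDS = 30.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_within_limit(path: str, max_seconds: float = MAX_SECONDS) -> bool:
    """Return True if the clip is no longer than max_seconds."""
    return wav_duration_seconds(path) <= max_seconds
```

For example, `is_within_limit("clip.wav")` returns `True` for a clip of 30 seconds or less; longer recordings would need to be trimmed first.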


