This REST API is built on top of Mozilla's DeepSpeech and is based on the examples provided by Mozilla. It accepts HTTP requests (GET and POST) as well as WebSocket connections. Transcription over HTTP is appropriate for relatively short audio files, while the WebSocket endpoint can also be used for longer audio recordings.
- Clone the repository to your local machine and change directory to
deepspeech-rest-api
git clone https://github.com/fabricekwizera/deepspeech-rest-api.git
cd deepspeech-rest-api
- Create a virtual environment and activate it (assuming virtualenv is installed on your machine), then install the project locally in editable mode.
virtualenv -p python3 venv
source venv/bin/activate
pip install --editable .
- Download the model and the scorer. For the English model and scorer, use the links below.
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm \
-O deepspeech_model.pbmm
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer \
-O deepspeech_model.scorer
For other languages, place the two files in the current working directory under the names deepspeech_model.pbmm for the model and deepspeech_model.scorer for the scorer.
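Since the model and scorer are expected under fixed names in the current working directory, a quick check before starting the server can catch a failed or misnamed download. The sketch below is only an illustration: the helper `missing_model_files` is hypothetical and not part of the project, and it treats an empty file as missing on the assumption that a zero-byte file indicates an interrupted download.

```python
from pathlib import Path

# File names taken from the instructions above; the helper itself is hypothetical.
REQUIRED_FILES = ("deepspeech_model.pbmm", "deepspeech_model.scorer")


def missing_model_files(directory="."):
    """Return the required file names that are absent or empty in `directory`."""
    base = Path(directory)
    missing = []
    for name in REQUIRED_FILES:
        path = base / name
        # Assumption: a zero-byte file means the download did not complete.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing or empty:", ", ".join(missing))
    else:
        print("Model and scorer found.")
```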
- Migrations are done using Alembic
- Running the server
python3 run.py
Register a new user and request a new JWT token to access the API
curl -X POST \
http://0.0.0.0:8000/users \
-H 'Content-Type: application/json' \
-d '{
"username": "forrestgump",
"email": "[email protected]",
"password": "yourpassword"
}'
API response
{
"message": "User forrestgump is successfully created."
}
To generate a JWT token to access the API
curl -X POST \
http://0.0.0.0:8000/token \
-H 'Content-Type: application/json' \
-d '{
"username": "forrestgump",
"password": "yourpassword"
}'
If both steps are done correctly, you should get a token in the format below
{
"access_token": "JWT_token",
"refresh_token": "Refresh_token"
}
With this JWT_token, you have access to the different endpoints of the API. The Refresh_token is used to refresh the access token when it expires.
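A client can decide when to call the refresh endpoint by reading the expiry time from the access token itself. The sketch below rests on assumptions the text above does not confirm: that the access token is a standard three-part JWT (header.payload.signature) and that its payload carries the standard `exp` claim in Unix seconds. The helper names and the 30-second `leeway` are hypothetical, and the signature is not verified here.

```python
import base64
import json
import time


def auth_header(access_token):
    """Build the Authorization header used in the curl examples."""
    return {"Authorization": "Bearer " + access_token}


def jwt_expiry(access_token):
    """Return the `exp` claim of a JWT, or None if it cannot be read.

    Assumes the standard header.payload.signature layout.
    The signature is NOT verified; only the payload is decoded.
    """
    parts = access_token.split(".")
    if len(parts) != 3:
        return None
    payload_b64 = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    try:
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    except ValueError:
        return None
    exp = payload.get("exp") if isinstance(payload, dict) else None
    return exp if isinstance(exp, (int, float)) else None


def needs_refresh(access_token, now=None, leeway=30):
    """True if the token expires within `leeway` seconds (hypothetical policy)."""
    exp = jwt_expiry(access_token)
    if exp is None:
        return False  # unknown expiry: let the server decide
    current = time.time() if now is None else now
    return exp - current <= leeway
```

If `needs_refresh` returns True, the client would send the refresh request before its next API call; when the expiry cannot be read, the sketch simply falls back to letting the server reject the token.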
To refresh a JWT token
curl -X POST \
http://0.0.0.0:8000/token/refresh \
-H "Content-Type: application/json" \
-H "Authorization: Bearer JWT_token" \
-d '{
"refresh_token": "Refresh_token"
}'
Change directory to audio and use the WAV files provided for testing.
- STT the HTTP way
cURL
curl -X POST \
http://0.0.0.0:8000/api/stt/http \
-H 'Authorization: Bearer JWT_token' \
-F 'audio=@8455-210777-0068.wav' \
-F 'paris=-1000' \
-F 'power=1000' \
-F 'parents=-1000'
Python
import requests

jwt_token = 'JWT_token'  # access token returned by /token
headers = {'Authorization': 'Bearer ' + jwt_token}
# Hot-words and their boosts are sent as ordinary form fields
hot_words = {'paris': -1000, 'power': 1000, 'parents': -1000}
audio_filename = 'audio/8455-210777-0068.wav'
url = 'http://0.0.0.0:8000/api/stt/http'

# Open the file in a with-block so the handle is closed after the upload
with open(audio_filename, 'rb') as audio_file:
    audio = [('audio', audio_file)]
    response = requests.post(url, data=hot_words, files=audio, headers=headers)
print(response.json())
Note the usage of hot-words and their boosts in the request.
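Because each hot-word travels as a plain form field whose value is its boost, it can help to validate the mapping on the client before sending it. The following is a minimal sketch under that reading of the examples; `hot_word_fields` and `curl_form_flags` are hypothetical helpers, and the rules they enforce (non-empty word, finite numeric boost) are assumptions rather than documented server behavior.

```python
import math


def hot_word_fields(hot_words):
    """Turn {word: boost} into string form fields, rejecting bad entries."""
    fields = {}
    for word, boost in hot_words.items():
        word = str(word).strip()
        if not word:
            raise ValueError("hot-word must not be empty")
        # float() raises ValueError for non-numeric strings such as "high".
        if not math.isfinite(float(boost)):
            raise ValueError("boost for %r must be a finite number" % word)
        fields[word] = str(boost)
    return fields


def curl_form_flags(fields):
    """Render the fields as the -F arguments used in the curl example."""
    flags = []
    for word, boost in fields.items():
        flags += ["-F", "%s=%s" % (word, boost)]
    return flags


if __name__ == "__main__":
    fields = hot_word_fields({"paris": -1000, "power": 1000, "parents": -1000})
    print(" ".join(curl_form_flags(fields)))
```

The dictionary returned by `hot_word_fields` can be passed as `data=` to `requests.post` in place of the raw `hot_words` mapping.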
- STT the WebSocket way (simple test)
The curl commands above do not work with WebSockets. To take advantage of this feature, you will have to write a web app that sends requests to ws://0.0.0.0:8000/api/stt/ws.
The command below can be used to check whether the WebSocket is running.
python3 test_websocket.py
In both cases (HTTP and WebSocket), you should get a result in the format below.
{
"message": "experience proves this",
"time": 1.4718825020026998
}
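On the client side, the result can be parsed into a small typed object so that a malformed reply fails early. This sketch assumes only the two fields shown above, `message` and `time`; the `Transcription` class and `parse_transcription` function are hypothetical names, and the unit of `time` is not stated in the source, so the code keeps it as a plain float.

```python
import json
from dataclasses import dataclass


@dataclass
class Transcription:
    message: str  # recognized text
    time: float   # value of the "time" field; unit not stated in the source


def parse_transcription(body):
    """Parse a JSON response body into a Transcription, or raise ValueError."""
    data = json.loads(body)
    if not isinstance(data, dict) or "message" not in data or "time" not in data:
        raise ValueError("unexpected response shape")
    return Transcription(message=str(data["message"]), time=float(data["time"]))


if __name__ == "__main__":
    sample = '{"message": "experience proves this", "time": 1.4718825020026998}'
    result = parse_transcription(sample)
    print(result.message)
```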