DeepSpeech REST API

This REST API is built on top of Mozilla's DeepSpeech and is based on the examples provided by Mozilla. It accepts HTTP requests (GET and POST) as well as WebSocket connections. Transcription over HTTP is appropriate for relatively short audio files, while the WebSocket endpoint can also be used for longer audio recordings.

Project setup

Clone the repository to your local machine and change directory to deepspeech-rest-api

git clone https://github.com/fabricekwizera/deepspeech-rest-api.git
cd deepspeech-rest-api

Create a virtual environment and activate it (assuming that virtualenv is installed on your machine), then install the project locally in editable mode.

virtualenv -p python3 venv
source venv/bin/activate
pip install --editable .

Download the model and the scorer. For the English model and scorer, use the links below.

wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm \
    -O deepspeech_model.pbmm
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.scorer \
    -O deepspeech_model.scorer

For other languages, place the two files in the current working directory under the names deepspeech_model.pbmm (the model) and deepspeech_model.scorer (the scorer).
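Since the file names matter, it can help to verify them before starting the server. The following is a minimal sketch using only the standard library; the function name `missing_model_files` and its `directory` parameter are hypothetical and not part of this project.

```python
from pathlib import Path

# File names taken from the setup instructions above.
REQUIRED_FILES = ("deepspeech_model.pbmm", "deepspeech_model.scorer")


def missing_model_files(directory="."):
    """Return the required file names that are absent or empty in `directory`.

    `directory` is an assumption of this sketch; the README places the
    files in the current working directory.
    """
    base = Path(directory)
    missing = []
    for name in REQUIRED_FILES:
        path = base / name
        # A zero-byte file usually means an interrupted download.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing
```

Calling `missing_model_files()` from the repository root returns an empty list when both files are in place.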

Migrations are done using Alembic

Running the server

python3 run.py

Usage of the API

Register a new user and request a new JWT token to access the API

curl -X POST \
http://0.0.0.0:8000/users \
-H 'Content-Type: application/json' \
-d '{
"username": "forrestgump",
"email": "[email protected]",
"password": "yourpassword"
}'

API response

{
  "message": "User forrestgump is successfully created."
}

To generate a JWT token to access the API

curl -X POST \
http://0.0.0.0:8000/token \
-H 'Content-Type: application/json' \
-d '{
"username": "forrestgump",
"password": "yourpassword"
}'

If both steps are done correctly, you should get a token in the following format

{
    "access_token": "JWT_token",
    "refresh_token": "Refresh_token"
}

With this JWT_token, you can access the different endpoints of the API. The Refresh_token is used to obtain a new access token when the current one expires.
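The same token request can be prepared from Python. The sketch below uses only the standard library and builds the request without sending it; the helper names `build_token_request` and `auth_header` are hypothetical, while the URL and JSON fields are the ones from the curl example.

```python
import json
import urllib.request

BASE_URL = "http://0.0.0.0:8000"  # address used in the curl examples


def build_token_request(username, password, base_url=BASE_URL):
    """Build (but do not send) the POST /token request."""
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/token",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def auth_header(token_response):
    """Turn the parsed JSON body returned by /token into an Authorization header."""
    return {"Authorization": "Bearer " + token_response["access_token"]}


# To actually send the request against a running server:
#   with urllib.request.urlopen(build_token_request("forrestgump", "yourpassword")) as resp:
#       headers = auth_header(json.load(resp))
```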

To refresh a JWT token

curl -X POST \
http://0.0.0.0:8000/token/refresh \
-H "Content-Type: application/json" \
-H "Authorization: Bearer JWT_token" \
-d '{
    "refresh_token": "Refresh_token"
}'
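To decide when a refresh is due, a client can read the expiry time from the access token itself. The sketch below assumes the token carries the standard `exp` claim (Unix seconds), which this README does not confirm; the function names and the 30-second leeway are illustrative choices. It only decodes the payload and does not verify the signature.

```python
import base64
import json
import time


def jwt_expiry(token):
    """Return the `exp` claim from a JWT payload, or None if it is absent.

    The signature is NOT verified; this only reads the payload segment.
    """
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload.get("exp")


def needs_refresh(token, leeway_seconds=30, now=None):
    """True when the token expires within `leeway_seconds` (hypothetical policy)."""
    exp = jwt_expiry(token)
    if exp is None:
        # No expiry information available; let the server decide.
        return False
    current = time.time() if now is None else now
    return exp - current <= leeway_seconds
```

When `needs_refresh(access_token)` returns True, the client would call the refresh endpoint shown above and replace its stored access token.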

Performing STT (Speech-To-Text)

Change directory to audio and use the WAV files provided for testing.
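If you test with your own recordings, it may be worth checking their format first. The released DeepSpeech 0.9.3 English model is trained on 16 kHz, mono, 16-bit audio; whether this API converts other formats is not stated here, so the sketch below only inspects a file. The function names are hypothetical.

```python
import wave


def wav_summary(path):
    """Return (channels, sample_width_bytes, sample_rate_hz, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return wav.getnchannels(), wav.getsampwidth(), rate, frames / rate


def matches_english_model(path, expected_rate=16000):
    """True for mono, 16-bit audio sampled at `expected_rate` Hz.

    The default rate reflects the 0.9.3 English model; other models
    may expect a different rate.
    """
    channels, width, rate, _ = wav_summary(path)
    return channels == 1 and width == 2 and rate == expected_rate
```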

STT the HTTP way

cURL

curl -X POST \
http://0.0.0.0:8000/api/stt/http \
-H 'Authorization: Bearer JWT_token' \
-F 'audio=@8455-210777-0068.wav' \
-F 'paris=-1000' \
-F 'power=1000' \
-F 'parents=-1000'

Python

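import requests

jwt_token = 'JWT_token'
headers = {'Authorization': 'Bearer ' + jwt_token}
# Hot-words and their boosts are sent as ordinary form fields.
hot_words = {'paris': -1000, 'power': 1000, 'parents': -1000}
# Path relative to the repository root; use only the file name if you
# are already inside the audio directory.
audio_filename = 'audio/8455-210777-0068.wav'
url = 'http://0.0.0.0:8000/api/stt/http'

# Open the file in a context manager so it is closed after the upload.
with open(audio_filename, 'rb') as audio_file:
    audio = [('audio', audio_file)]
    response = requests.post(url, data=hot_words, files=audio, headers=headers)
print(response.json())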

Note the hot-words and their boosts in the request: each hot-word is sent as a form field whose value is its boost.

STT the WebSocket way (simple test)

curl is not a practical client for WebSocket connections. To take advantage of this feature, you will have to write a client (for example, a web app) that sends requests to ws://0.0.0.0:8000/api/stt/ws.

The following command can be used to check that the WebSocket endpoint is running.

python3 test_websocket.py

In both cases (HTTP and WebSocket), you should get a result in the following format.

{
  "message": "experience proves this",
  "time": 1.4718825020026998
}
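A client will usually want to pull the transcript out of this response. The following is a small sketch; `summarize_result` is a hypothetical helper, and it assumes that `time` is the processing time in seconds, which the README does not state explicitly.

```python
import json


def summarize_result(body):
    """Format a transcription response (JSON text) as one line of text."""
    result = json.loads(body)
    # `message` holds the transcript; `time` is assumed to be in seconds.
    return f'{result["message"]!r} in {result["time"]:.2f} s'
```

With the `requests` example above, `summarize_result(response.text)` would produce the one-line summary.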
