egovhealthcare/care_scribe_middleware

Care Scribe Middleware

A FastAPI WebSocket service that relays live audio transcription between browser clients and Google Cloud Speech-to-Text v2.

CARE's Django backend cannot hold long-lived WebSocket connections, so this lightweight middleware sits between the frontend and Google's streaming STT gRPC API.

Architecture

┌──────────┐  JWT + audio (WS)  ┌──────────────────┐  gRPC streaming  ┌─────────────┐
│  Client  │ ─────────────────► │  care_scribe_mw  │ ───────────────► │  Google STT │
│ (browser)│ ◄───────────────── │   (FastAPI)      │ ◄─────────────── │   v2 API    │
└──────────┘    transcripts     └──────────────────┘    results       └─────────────┘

Quick Start

1. Configure

cp .env.example .env
# Edit .env — set JWT_SECRET_KEY (must match care_be's DJANGO_SECRET_KEY)
#              set GOOGLE_PROJECT_ID
#              set GOOGLE_APPLICATION_CREDENTIALS (or use ADC)

2. Run locally

pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8090 --reload

WebSocket Protocol

Connect

ws://localhost:8090/ws/transcribe?token=<JWT_ACCESS_TOKEN>

The JWT is verified with HS256 using the same secret key as care_be: the value of care_be's DJANGO_SECRET_KEY, supplied to this service as JWT_SECRET_KEY.
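To make the HS256 check concrete, here is a minimal standard-library sketch of signing and verifying a token with a shared secret. It is not the middleware's actual code, which may well use a JWT library. The function names, the `user_id` claim in the usage note, and the expiry handling are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: str) -> str:
    """Mint a token for local testing (in production care_be issues tokens)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url_encode(signature)}"


def verify_hs256(token: str, secret: str) -> dict:
    """Return the payload if signature and expiry check out, else raise ValueError."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        payload = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(signature_b64)
    except ValueError as exc:  # covers bad base64, bad JSON, wrong segment count
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(payload, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(
        secret.encode(), f"{header_b64}.{payload_b64}".encode(), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    if "exp" in payload and payload["exp"] < time.time():
        raise ValueError("token expired")
    return payload
```

For local testing, `sign_hs256({"user_id": 1, "exp": int(time.time()) + 300}, secret)` produces a token that `verify_hs256` accepts only when both sides use the same secret.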

1. Send config (JSON text frame)

{
    "language": "en-US",
    "model": "long",
    "sample_rate": 16000,
    "interim_results": true
}

All fields are optional and fall back to server defaults.
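The fallback behaviour can be pictured as a merge of the client's frame over a defaults table. The sketch below is an illustration, not the server's code: `resolve_config` is a hypothetical name, ignoring unknown keys is an assumption, and the `interim_results` default of `True` is a guess, since only the other three defaults are documented.

```python
import json

# Defaults taken from the Environment Variables table; the interim_results
# value is an assumption because the README does not state it.
SERVER_DEFAULTS = {
    "language": "en-US",
    "model": "long",
    "sample_rate": 16000,
    "interim_results": True,
}


def resolve_config(raw_frame: str, defaults: dict = SERVER_DEFAULTS) -> dict:
    """Merge a client config frame over the defaults, ignoring unknown keys."""
    client = json.loads(raw_frame) if raw_frame.strip() else {}
    if not isinstance(client, dict):
        raise ValueError("config frame must be a JSON object")
    return {key: client.get(key, default) for key, default in defaults.items()}
```

With this shape, sending `{}` selects every default, and sending `{"language": "hi-IN"}` overrides only the language.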

2. Stream audio (binary frames)

Send raw PCM16 mono audio at the configured sample rate. Recommended chunk size: 100-200 ms (~3200-6400 bytes at 16 kHz).
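The byte counts follow from the format: PCM16 uses 2 bytes per sample and mono has one channel, so 16 kHz audio is 32,000 bytes per second. A small sketch of that arithmetic and of slicing a buffer into frames (the helper names are illustrative):

```python
BYTES_PER_SAMPLE = 2  # PCM16 = 16-bit samples; mono = one channel


def chunk_bytes(sample_rate: int, chunk_ms: int) -> int:
    """Size in bytes of chunk_ms milliseconds of PCM16 mono audio."""
    return sample_rate * BYTES_PER_SAMPLE * chunk_ms // 1000


def iter_chunks(pcm: bytes, sample_rate: int = 16000, chunk_ms: int = 100):
    """Yield successive binary frames; the last one may be shorter."""
    size = chunk_bytes(sample_rate, chunk_ms)
    for start in range(0, len(pcm), size):
        yield pcm[start:start + size]
```

Each yielded chunk would be sent as one binary WebSocket frame.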

3. Receive transcriptions (JSON text frames)

{"type": "ready"}
{"type": "transcript", "text": "hello world", "is_final": false}
{"type": "transcript", "text": "hello world!", "is_final": true, "confidence": 0.95}
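A client typically shows interim text as provisional and commits it once a final result arrives. The sketch below assumes each interim result replaces the previous one and is itself superseded by the next final result, which the README does not state explicitly. `TranscriptBuffer` is a hypothetical helper.

```python
import json


class TranscriptBuffer:
    """Accumulates final segments and tracks the latest interim text."""

    def __init__(self):
        self.finals = []
        self.interim = ""
        self.ready = False

    def feed(self, frame: str) -> None:
        message = json.loads(frame)
        kind = message.get("type")
        if kind == "ready":
            self.ready = True
        elif kind == "transcript":
            if message.get("is_final"):
                # A final result supersedes whatever interim text was shown.
                self.finals.append(message["text"])
                self.interim = ""
            else:
                self.interim = message["text"]

    @property
    def text(self) -> str:
        return " ".join(part for part in [*self.finals, self.interim] if part)
```

Feeding the three frames above in order leaves `text` equal to `hello world!`.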

4. Stop

Send {"type": "stop"} or simply close the WebSocket.
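Putting the four steps together, a test client might look like the sketch below. It uses the third-party `websockets` package (an assumption; any WebSocket client would do) and assumes the server closes the connection after it has handled `stop`, which the README does not confirm. `transcribe_url` and `stream_pcm` are hypothetical names.

```python
import asyncio
import json
from urllib.parse import urlencode


def transcribe_url(base: str, token: str) -> str:
    """Build the connect URL with the JWT as a query parameter."""
    return f"{base}/ws/transcribe?{urlencode({'token': token})}"


async def stream_pcm(url: str, pcm: bytes, chunk_size: int = 3200) -> list:
    """Send config, stream audio, send stop, and collect final transcripts."""
    import websockets  # third-party: pip install websockets

    finals = []
    async with websockets.connect(url) as ws:
        # Step 1: config frame (every field is optional).
        await ws.send(json.dumps({"language": "en-US", "sample_rate": 16000}))

        async def send_audio():
            # Step 2: binary frames, paced at roughly real time (100 ms each).
            for start in range(0, len(pcm), chunk_size):
                await ws.send(pcm[start:start + chunk_size])
                await asyncio.sleep(0.1)
            # Step 4: ask the server to stop.
            await ws.send(json.dumps({"type": "stop"}))

        sender = asyncio.create_task(send_audio())
        # Step 3: read results until the socket closes (assumed to follow stop).
        async for frame in ws:
            message = json.loads(frame)
            if message.get("type") == "transcript" and message.get("is_final"):
                finals.append(message["text"])
        if not sender.done():
            sender.cancel()
        await asyncio.gather(sender, return_exceptions=True)
    return finals
```

A run against a local instance would be `asyncio.run(stream_pcm(transcribe_url("ws://localhost:8090", token), pcm_bytes))`.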

Health Check

GET /health → {"status": "ok"}

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| JWT_SECRET_KEY | yes | | Must match DJANGO_SECRET_KEY in care_be |
| JWT_ALGORITHM | no | HS256 | JWT signing algorithm |
| GOOGLE_PROJECT_ID | yes | | GCP project ID |
| GOOGLE_LOCATION | no | global | GCP region for STT |
| GOOGLE_APPLICATION_CREDENTIALS | situational | | Path to service account JSON (or use ADC) |
| DEFAULT_LANGUAGE | no | en-US | Default recognition language |
| DEFAULT_MODEL | no | long | Default recognition model |
| DEFAULT_SAMPLE_RATE | no | 16000 | Default audio sample rate (Hz) |
| CORS_ALLOWED_ORIGINS | no | ["*"] | Allowed CORS origins |
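As a rough illustration of how these variables could be read and validated at startup, the sketch below loads them into a frozen dataclass. It is not the project's settings module, whose structure is not shown here. `Settings` and `load_settings` are hypothetical names, and parsing CORS_ALLOWED_ORIGINS as a JSON list is an assumption based on its documented default. GOOGLE_APPLICATION_CREDENTIALS is omitted because Google's client libraries read it from the environment themselves.

```python
import json
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    jwt_secret_key: str
    google_project_id: str
    jwt_algorithm: str = "HS256"
    google_location: str = "global"
    default_language: str = "en-US"
    default_model: str = "long"
    default_sample_rate: int = 16000
    cors_allowed_origins: tuple = ("*",)


def load_settings(env: Mapping[str, str]) -> Settings:
    """Build Settings from an environment mapping, failing fast on missing keys."""
    required = ("JWT_SECRET_KEY", "GOOGLE_PROJECT_ID")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required variables: {', '.join(missing)}")
    return Settings(
        jwt_secret_key=env["JWT_SECRET_KEY"],
        google_project_id=env["GOOGLE_PROJECT_ID"],
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        google_location=env.get("GOOGLE_LOCATION", "global"),
        default_language=env.get("DEFAULT_LANGUAGE", "en-US"),
        default_model=env.get("DEFAULT_MODEL", "long"),
        default_sample_rate=int(env.get("DEFAULT_SAMPLE_RATE", "16000")),
        # Assumed to be a JSON-encoded list, matching the documented default.
        cors_allowed_origins=tuple(json.loads(env.get("CORS_ALLOWED_ORIGINS", '["*"]'))),
    )
```

Calling `load_settings(os.environ)` after `import os` would apply it to the real environment.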

About

A middleware server that supports live WebSocket connections to Scribe.
