Care Scribe Middleware

A FastAPI WebSocket service that relays live audio transcription between browser clients and Google Cloud Speech-to-Text v2.

CARE's Django backend cannot hold long-lived WebSocket connections, so this lightweight middleware sits between the frontend and Google's streaming STT gRPC API.

Architecture

┌──────────┐  JWT + audio (WS)  ┌──────────────────┐  gRPC streaming  ┌─────────────┐
│  Client  │ ─────────────────► │  care_scribe_mw  │ ───────────────► │  Google STT │
│ (browser)│ ◄───────────────── │   (FastAPI)      │ ◄─────────────── │   v2 API    │
└──────────┘    transcripts     └──────────────────┘    results       └─────────────┘

Quick Start

1. Configure

cp .env.example .env
# Edit .env — set JWT_SECRET_KEY (must match care_be's DJANGO_SECRET_KEY)
#              set GOOGLE_PROJECT_ID
#              set GOOGLE_APPLICATION_CREDENTIALS (or use ADC)

2. Run locally

pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8090 --reload

WebSocket Protocol

Connect

ws://localhost:8090/ws/transcribe?token=<JWT_ACCESS_TOKEN>

The JWT signature is verified with HS256, using the same secret key as care_be (its DJANGO_SECRET_KEY).
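To make the verification step concrete, here is a minimal stdlib-only sketch of what checking an HS256 token involves. It is not the service's actual code, which presumably relies on a JWT library; the function names, the `user_id` claim used in the usage note, and the expiry handling are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: str) -> str:
    """Build a token for local testing only (hypothetical helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url_encode(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    sig = hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256)
    return f"{signing_input}.{_b64url_encode(sig.digest())}"


def verify_hs256(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Return the claims if signature and expiry check out, else raise ValueError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(payload_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError as exc:  # covers bad base64, bad JSON, wrong segment count
        raise ValueError("malformed token") from exc
    if not isinstance(header, dict) or not isinstance(claims, dict):
        raise ValueError("malformed token")
    if header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if exp is not None and current >= exp:
        raise ValueError("token expired")
    return claims
```

For example, `verify_hs256(sign_hs256({"user_id": 7}, secret), secret)` returns the claims, while the same token checked against a different secret raises `ValueError`.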

1. Send config (JSON text frame)

{
    "language": "en-US",
    "model": "long",
    "sample_rate": 16000,
    "interim_results": true
}

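All fields are optional. Any field left out falls back to the server default (see DEFAULT_LANGUAGE, DEFAULT_MODEL, and DEFAULT_SAMPLE_RATE under Environment Variables).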
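One way the fallback to defaults could be handled on the server side is sketched below. `SessionConfig` and `parse_config` are hypothetical names, the default for `interim_results` is an assumption (the source does not state one), and ignoring unknown keys is a design choice of this sketch rather than documented behaviour.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SessionConfig:
    # Defaults mirror the documented DEFAULT_* values.
    language: str = "en-US"
    model: str = "long"
    sample_rate: int = 16000
    interim_results: bool = True  # assumed default, not stated in the source


def parse_config(frame: str) -> SessionConfig:
    """Merge a client config frame over the defaults."""
    data = json.loads(frame) if frame.strip() else {}
    if not isinstance(data, dict):
        raise ValueError("config frame must be a JSON object")
    known = {f.name for f in fields(SessionConfig)}
    # Unknown keys are dropped so older servers tolerate newer clients.
    return SessionConfig(**{k: v for k, v in data.items() if k in known})
```

With this sketch, `parse_config('{"language": "hi-IN"}')` keeps `model`, `sample_rate`, and `interim_results` at their defaults.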

2. Stream audio (binary frames)

Send raw PCM16 mono audio at the configured sample rate. Recommended chunk size: 100-200 ms of audio (about 3200-6400 bytes at 16 kHz, since PCM16 mono uses 2 bytes per sample).
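The chunk-size arithmetic can be checked with a short sketch. The helper names here are hypothetical; the only facts taken from the text are PCM16 mono (2 bytes per sample) and the 100-200 ms recommendation.

```python
from typing import Iterator

BYTES_PER_SAMPLE = 2  # PCM16 mono: one 16-bit sample per frame


def chunk_size_bytes(sample_rate: int, chunk_ms: int) -> int:
    """Bytes needed to hold chunk_ms milliseconds of PCM16 mono audio."""
    return sample_rate * chunk_ms // 1000 * BYTES_PER_SAMPLE


def iter_chunks(pcm: bytes, sample_rate: int = 16000, chunk_ms: int = 100) -> Iterator[bytes]:
    """Split a PCM buffer into frames suitable for sending as binary messages."""
    size = chunk_size_bytes(sample_rate, chunk_ms)
    if size <= 0:
        raise ValueError("sample_rate and chunk_ms must yield a positive chunk size")
    for start in range(0, len(pcm), size):
        # The final chunk may be shorter than the others.
        yield pcm[start:start + size]
```

At 16 kHz, `chunk_size_bytes(16000, 100)` gives 3200 and `chunk_size_bytes(16000, 200)` gives 6400, matching the range quoted above.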

3. Receive transcriptions (JSON text frames)

{"type": "ready"}
{"type": "transcript", "text": "hello world", "is_final": false}
{"type": "transcript", "text": "hello world!", "is_final": true, "confidence": 0.95}
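A client might decode these frames along the following lines. This is a sketch: `Transcript`, `parse_server_frame`, and `collect_final` are hypothetical names, and treating `confidence` as optional is inferred from the examples above, where only the final result carries it.

```python
import json
from dataclasses import dataclass
from typing import Iterable, Optional, Union


@dataclass
class Transcript:
    text: str
    is_final: bool
    confidence: Optional[float] = None  # absent on the interim example above


def parse_server_frame(frame: str) -> Union[str, Transcript, None]:
    """Return "ready", a Transcript, or None for message types not shown above."""
    msg = json.loads(frame)
    kind = msg.get("type") if isinstance(msg, dict) else None
    if kind == "ready":
        return "ready"
    if kind == "transcript":
        return Transcript(
            text=msg["text"],
            is_final=bool(msg.get("is_final", False)),
            confidence=msg.get("confidence"),
        )
    return None


def collect_final(frames: Iterable[str]) -> str:
    """Join only the final transcripts, skipping interim results."""
    parsed = (parse_server_frame(f) for f in frames)
    return " ".join(p.text for p in parsed if isinstance(p, Transcript) and p.is_final)
```

Interim results are useful for live display, while only results with `is_final` set to true would normally be kept as the transcript of record.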

4. Stop

Send {"type": "stop"} or simply close the WebSocket.

Health Check

GET /health → {"status": "ok"}

Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `JWT_SECRET_KEY` | yes | - | Must match `DJANGO_SECRET_KEY` in care_be |
| `JWT_ALGORITHM` | no | `HS256` | JWT signing algorithm |
| `GOOGLE_PROJECT_ID` | yes | - | GCP project ID |
| `GOOGLE_LOCATION` | no | `global` | GCP region for STT |
| `GOOGLE_APPLICATION_CREDENTIALS` | situational | - | Path to service account JSON (or use ADC) |
| `DEFAULT_LANGUAGE` | no | `en-US` | Default recognition language |
| `DEFAULT_MODEL` | no | `long` | Default recognition model |
| `DEFAULT_SAMPLE_RATE` | no | `16000` | Default audio sample rate (Hz) |
| `CORS_ALLOWED_ORIGINS` | no | `["*"]` | Allowed CORS origins |
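The variables above could be loaded roughly as follows. This is a stdlib-only sketch, not the service's actual settings module: `Settings` and `load_settings` are hypothetical names, and parsing `CORS_ALLOWED_ORIGINS` as a JSON list is an assumption based on the `["*"]` default.

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


@dataclass(frozen=True)
class Settings:
    jwt_secret_key: str
    google_project_id: str
    jwt_algorithm: str = "HS256"
    google_location: str = "global"
    default_language: str = "en-US"
    default_model: str = "long"
    default_sample_rate: int = 16000
    cors_allowed_origins: Tuple[str, ...] = ("*",)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from a mapping (os.environ by default), failing fast on gaps."""
    env = os.environ if env is None else env
    missing = [name for name in ("JWT_SECRET_KEY", "GOOGLE_PROJECT_ID") if not env.get(name)]
    if missing:
        raise RuntimeError("missing required variables: " + ", ".join(missing))
    return Settings(
        jwt_secret_key=env["JWT_SECRET_KEY"],
        google_project_id=env["GOOGLE_PROJECT_ID"],
        jwt_algorithm=env.get("JWT_ALGORITHM", "HS256"),
        google_location=env.get("GOOGLE_LOCATION", "global"),
        default_language=env.get("DEFAULT_LANGUAGE", "en-US"),
        default_model=env.get("DEFAULT_MODEL", "long"),
        default_sample_rate=int(env.get("DEFAULT_SAMPLE_RATE", "16000")),
        # Assumed to be a JSON array of origin strings.
        cors_allowed_origins=tuple(json.loads(env.get("CORS_ALLOWED_ORIGINS", '["*"]'))),
    )
```

`GOOGLE_APPLICATION_CREDENTIALS` is left out on purpose: Google's client libraries read it directly from the environment, so the application itself does not need to parse it.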
