 - Audio encoding options (via `DEEPGRAM_ENCODING` env var or `encoding` URL parameter):
   - **linear16** (default): Sends decoded PCM audio at 24kHz, 16-bit, mono. Uses more CPU for Opus decoding but universally compatible.
   - **opus**: Sends raw Opus frames at 48kHz. More efficient (skips decoding step), lower CPU usage, native Opus support.
+  - **ogg-opus**: Sends containerized Ogg-Opus audio (e.g., from Voximplant). Deepgram auto-detects encoding from the container header - no `encoding` or `sample_rate` params are sent to Deepgram.
 - Returns both interim and final transcriptions
 - Supports KeepAlive, Finalize, and CloseStream control messages
 - Authentication via Sec-WebSocket-Protocol header
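
The three encoding options above imply different audio query parameters on the Deepgram connection. The following is a minimal sketch of that mapping, not the project's actual code: the function name `deepgramAudioParams` and the members of the `AudioEncoding` union are assumptions for illustration, while the `encoding` and `sample_rate` parameter names and the sample rates come from the list above.

```typescript
// Hypothetical union of the three documented encoding options.
type AudioEncoding = "linear16" | "opus" | "ogg-opus";

// Hypothetical helper: builds the audio-related query parameters for each mode.
function deepgramAudioParams(encoding: AudioEncoding): URLSearchParams {
  const params = new URLSearchParams();
  switch (encoding) {
    case "linear16":
      // Decoded PCM at 24 kHz, 16-bit, mono.
      params.set("encoding", "linear16");
      params.set("sample_rate", "24000");
      break;
    case "opus":
      // Raw Opus frames at 48 kHz.
      params.set("encoding", "opus");
      params.set("sample_rate", "48000");
      break;
    case "ogg-opus":
      // Containerized audio: neither parameter is sent, so Deepgram
      // can auto-detect the format from the container header.
      break;
  }
  return params;
}
```

For `ogg-opus` the returned parameter set is empty, which matches the note that no `encoding` or `sample_rate` values are sent in that mode.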
-All backends receive **24 kHz, 16-bit, mono PCM audio** encoded as base64 strings.
+By default, backends receive **24 kHz, 16-bit, mono PCM audio** encoded as base64 strings.
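
To make the default payload format concrete, here is a hedged sketch of how a backend might turn one base64 string into 16-bit samples. The helper names `decodePcm16` and `durationMs` are hypothetical, and little-endian byte order is an assumption: the text above states the sample rate, bit depth, and channel count but not the endianness.

```typescript
const SAMPLE_RATE_HZ = 24000; // 24 kHz mono, per the description above

// Hypothetical helper: base64 string -> signed 16-bit samples.
// Assumes little-endian PCM, which the source does not state explicitly.
function decodePcm16(base64Audio: string): Int16Array {
  const binary = atob(base64Audio);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  const view = new DataView(bytes.buffer);
  // Two bytes per sample; a trailing odd byte is ignored.
  const samples = new Int16Array(Math.floor(bytes.length / 2));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getInt16(i * 2, true);
  }
  return samples;
}

// Duration of a mono chunk in milliseconds.
function durationMs(samples: Int16Array): number {
  return (samples.length / SAMPLE_RATE_HZ) * 1000;
}
```

At this rate, 24,000 samples correspond to one second of audio.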

 The opus-transcriber-proxy handles:
 1. Receiving Opus-encoded packets from clients
-2. Decoding to PCM
-3. Sending PCM to the transcription backend
+2. Decoding to PCM (unless backend opts out)
+3. Sending audio to the transcription backend
+
+**Raw Audio Mode:** Backends can implement `wantsRawOpus(encoding?: AudioEncoding): boolean` to receive raw audio instead of decoded PCM. This is useful for backends like Deepgram that natively support Opus or Ogg-Opus formats.
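
The opt-out described above could look roughly like the sketch below. Only the `wantsRawOpus(encoding?: AudioEncoding): boolean` signature comes from the text; the `TranscriptionBackend` interface, the `sendAudio` method, the `OpusDecoder` type, the `forwardPacket` function, and the members of `AudioEncoding` are hypothetical names chosen for illustration. The base64 step for PCM payloads is left out to keep the routing decision visible.

```typescript
// Hypothetical union; the real AudioEncoding type may differ.
type AudioEncoding = "opus" | "ogg-opus";

// Hypothetical backend shape. Only wantsRawOpus is taken from the text.
interface TranscriptionBackend {
  sendAudio(payload: Uint8Array): void;
  // Optional: return true to receive raw audio instead of decoded PCM.
  wantsRawOpus?(encoding?: AudioEncoding): boolean;
}

// Stand-in for whatever Opus decoder the proxy uses.
type OpusDecoder = (packet: Uint8Array) => Uint8Array;

// Steps 2 and 3: decode unless the backend opts out, then forward.
function forwardPacket(
  backend: TranscriptionBackend,
  packet: Uint8Array,
  decodeOpus: OpusDecoder,
  encoding?: AudioEncoding,
): void {
  // Backends that do not implement the method get decoded PCM.
  const raw = backend.wantsRawOpus?.(encoding) ?? false;
  backend.sendAudio(raw ? packet : decodeOpus(packet));
}
```

Making the method optional keeps existing backends unchanged: a backend that never defines `wantsRawOpus` continues to receive PCM.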
+
+**URL Parameter:** Clients can specify the audio encoding format via the `encoding` URL parameter:
+- `encoding=opus` (default): Raw Opus frames at 48kHz
+- `encoding=ogg-opus`: Containerized Ogg-Opus audio (e.g., from Voximplant)
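
As an illustration of the URL parameter, the sketch below reads `encoding` from a request URL and falls back to `opus` when it is absent. The function name `parseEncoding`, the example URL, and the choice to reject values other than the two listed here are all assumptions; the real proxy may accept other values or handle unknown ones differently.

```typescript
// The two values documented for the URL parameter.
type ClientEncoding = "opus" | "ogg-opus";

// Hypothetical helper: resolve the client's encoding from the request URL.
function parseEncoding(requestUrl: string): ClientEncoding {
  const value = new URL(requestUrl).searchParams.get("encoding");
  // A missing parameter means the documented default.
  if (value === null || value === "opus") return "opus";
  if (value === "ogg-opus") return "ogg-opus";
  // Assumption: anything else is treated as an error in this sketch.
  throw new Error(`Unsupported encoding: ${value}`);
}

// Example with a made-up host and path.
const chosen = parseEncoding("wss://proxy.example/transcribe?encoding=ogg-opus");
console.log(chosen);
```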