# plivo-stream-sdk-node

A Node.js SDK for handling Plivo real-time media streaming over WebSocket. Built on top of the `ws` package's `WebSocketServer`.
## Installation

```bash
npm install plivo-stream-sdk-node
# or
bun install plivo-stream-sdk-node
```

## Quick Start

```typescript
import express from 'express';
import PlivoWebSocketServer from 'plivo-stream-sdk-node';
import type { StartEvent, MediaEvent, DTMFEvent } from 'plivo-stream-sdk-node';
import * as Plivo from 'plivo';
const app = express();
const PORT = 8000;
// Plivo webhook endpoint - returns XML to initiate streaming
app.get('/stream', (req, res) => {
  const streamUrl = `wss://${req.get('host')}/stream`;

  const plivoResponse = new (Plivo as any).Response();
  plivoResponse.addSpeak('Hello world!');

  const params = {
    contentType: 'audio/x-mulaw;rate=8000',
    keepCallAlive: true,
    bidirectional: true,
  };
  plivoResponse.addStream(streamUrl, params);

  // Serialize once so the headers describe the exact XML being sent
  const xml = plivoResponse.toXML();
  res.header('Content-Length', Buffer.byteLength(xml).toString());
  res.header('Connection', 'keep-alive');
  res.header('Keep-Alive', 'timeout=60');
  res.type('application/xml');
  res.send(xml);
});
// Start HTTP server
const server = app.listen(PORT, () => {
  console.log(`Server listening on http://localhost:${PORT}`);
});
// Create PlivoWebSocketServer attached to your HTTP server
const plivoServer = new PlivoWebSocketServer({ server, path: '/stream' });
plivoServer
  .onConnection(async (ws, req) => {
    console.log('New WebSocket connection');
    // Initialize per-connection resources here (e.g., speech-to-text clients)
  })
  .onStart((event: StartEvent, ws) => {
    console.log('Stream started:', event.start.streamId);
    console.log('Call ID:', event.start.callId);
    console.log('Media format:', event.start.mediaFormat);
  })
  .onMedia((event: MediaEvent, ws) => {
    // Get raw audio buffer from the event
    const audioBuffer = event.getRawMedia();
    // Process audio (e.g., send to speech-to-text service)
  })
  .onDtmf((event: DTMFEvent, ws) => {
    console.log('DTMF digit:', event.dtmf.digit);
    // Example: clear audio queue on * press
    if (event.dtmf.digit === '*') {
      plivoServer.clearAudio(ws);
    }
  })
  .onPlayedStream((event) => {
    console.log('Stream played:', event.name);
  })
  .onClearedAudio((event) => {
    console.log('Audio cleared:', event.streamId);
  })
  .onError((error, ws) => {
    console.error('Stream error:', error.message);
  })
  .onClose((ws) => {
    console.log('Connection closed');
  })
  .start(); // Must call .start() to begin accepting connections
```

## API Reference

### `PlivoWebSocketServer`

Extends `WebSocketServer` from the `ws` package.
```typescript
new PlivoWebSocketServer(options: ServerOptions, callback?: () => void)
```

`options` takes the standard `ws` `ServerOptions`. Common options:
- `server`: HTTP/HTTPS server to attach to
- `path`: URL path for WebSocket connections (e.g., `'/stream'`)
- `port`: port to listen on when not attaching to an existing server (see the sketch below)
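If you are not attaching to an existing HTTP server, a minimal standalone setup might look like this (a sketch; the port and path are arbitrary, and the import is the same as in the Quick Start):

```typescript
// Listen on a dedicated port instead of sharing an HTTP server
const standaloneServer = new PlivoWebSocketServer({ port: 8080, path: '/stream' });

standaloneServer
  .onStart((event) => console.log('Stream started:', event.start.streamId))
  .start();
```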
### `start()`

Start accepting WebSocket connections. Must be called after registering all event handlers.
### `close()`

Close the WebSocket server.
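For example, a small shutdown hook (a sketch, assuming the `plivoServer` and HTTP `server` instances from the Quick Start):

```typescript
// Stop accepting new streams and shut down cleanly on Ctrl+C
process.on('SIGINT', () => {
  plivoServer.close();
  server.close(() => process.exit(0));
});
```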
### Event Handlers

All handlers return `this` for chaining. Multiple handlers can be registered per event (see the sketch after the table).
| Method | Callback Signature | Description |
|---|---|---|
| `onConnection` | `(ws, request) => void \| Promise<void>` | New connection established. Async callbacks are awaited before processing messages. |
| `onStart` | `(event: StartEvent, ws) => void` | Stream initialization with call metadata |
| `onMedia` | `(event: MediaEvent, ws) => void` | Incoming audio chunk |
| `onDtmf` | `(event: DTMFEvent, ws) => void` | DTMF digit received |
| `onPlayedStream` | `(event: PlayedStreamEvent, ws) => void` | Audio playback confirmation |
| `onClearedAudio` | `(event: ClearedAudioEvent, ws) => void` | Audio queue cleared confirmation |
| `onError` | `(error: Error, ws) => void` | Error occurred |
| `onClose` | `(ws) => void` | Connection closed |
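Because every registration returns `this` and handlers stack, logging and processing can live in separate callbacks. A minimal sketch (the `transcriber` client is hypothetical):

```typescript
// Both onMedia callbacks run for every incoming audio chunk
plivoServer
  .onMedia((event) => {
    console.log(`chunk ${event.media.chunk} on track ${event.media.track}`);
  })
  .onMedia((event) => {
    transcriber.send(event.getRawMedia()); // hypothetical speech-to-text client
  })
  .start();
```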
### `playAudio()`

Send audio to a specific connection.

```typescript
// payload can be Buffer, Uint8Array, or ArrayBuffer
plivoServer.playAudio(ws, 'audio/x-mulaw', 8000, audioBuffer);
```

### `checkpoint()`

Send a checkpoint event to track audio playback progress.
```typescript
plivoServer.checkpoint(ws, 'greeting-complete');
```

### `clearAudio()`

Clear all queued audio for a connection.
```typescript
plivoServer.clearAudio(ws);
```

### Connection Metadata

| Method | Return Type | Description |
|---|---|---|
| `getStreamId(ws)` | `string \| undefined` | Stream ID for the connection |
| `getAccountId(ws)` | `string \| undefined` | Plivo account ID |
| `getCallId(ws)` | `string \| undefined` | Call ID |
| `getHeaders(ws)` | `string \| undefined` | Extra headers from the start event |
| `isActive(ws)` | `boolean` | Whether the connection is open |
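Putting these together: the sketch below greets the caller, drops a checkpoint after the greeting, and logs playback confirmations. It assumes `greetingAudio` is a μ-law 8000 Hz `Buffer` prepared elsewhere, and that the `playedStream` event's `name` reports the checkpoint name.

```typescript
plivoServer
  .onStart((event, ws) => {
    console.log(`Call ${plivoServer.getCallId(ws)} opened stream ${plivoServer.getStreamId(ws)}`);
    if (plivoServer.isActive(ws)) {
      // Queue the greeting, then mark its end so playback can be tracked
      plivoServer.playAudio(ws, 'audio/x-mulaw', 8000, greetingAudio); // greetingAudio: hypothetical buffer
      plivoServer.checkpoint(ws, 'greeting-complete');
    }
  })
  .onPlayedStream((event) => {
    console.log('Playback reached:', event.name);
  });
```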
## Event Types

### StartEvent

```typescript
{
  event: 'start';
  sequenceNumber: number;
  start: {
    callId: string;   // UUID
    streamId: string; // UUID
    accountId: string;
    tracks: string[];
    mediaFormat: {
      encoding: string;
      sampleRate: number;
    };
  };
  extra_headers: string;
}
```
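The negotiated media format is available as soon as the stream starts, which is useful for configuring downstream audio consumers. A sketch, assuming the `plivoServer` instance from the Quick Start:

```typescript
plivoServer.onStart((event: StartEvent, ws) => {
  const { encoding, sampleRate } = event.start.mediaFormat;
  // e.g. use these values to configure a speech-to-text session
  console.log(`Audio arrives as ${encoding} @ ${sampleRate} Hz on tracks: ${event.start.tracks.join(', ')}`);
});
```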
### MediaEvent

```typescript
{
  event: 'media';
  sequenceNumber: number;
  streamId: string;
  media: {
    track: string;
    timestamp: string;
    chunk: number;
    payload: string; // base64 encoded audio
  };
  extra_headers: string;
  getRawMedia(): Buffer; // Helper to decode payload
}
```
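`getRawMedia()` decodes the base64 `payload` for you; decoding it manually should yield the same bytes:

```typescript
plivoServer.onMedia((event: MediaEvent) => {
  const viaHelper = event.getRawMedia();
  const manual = Buffer.from(event.media.payload, 'base64'); // same audio bytes
  console.log(event.media.track, event.media.chunk, viaHelper.length, manual.length);
});
```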
### DTMFEvent

```typescript
{
  event: 'dtmf';
  sequenceNumber: number;
  streamId: string;
  dtmf: {
    track: string;
    digit: string;
    timestamp: string;
  };
  extra_headers: string;
}
```
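A handler can branch on `dtmf.digit` to build simple keypad interactions; a sketch (the `menuAudio` buffer is hypothetical):

```typescript
plivoServer.onDtmf((event: DTMFEvent, ws) => {
  switch (event.dtmf.digit) {
    case '1':
      plivoServer.playAudio(ws, 'audio/x-mulaw', 8000, menuAudio); // hypothetical prompt buffer
      break;
    case '*':
      plivoServer.clearAudio(ws); // interrupt anything still queued
      break;
  }
});
```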
### PlayedStreamEvent

```typescript
{
  event: 'playedStream';
  sequenceNumber: number;
  streamId: string;
  name: string;
}
```

### ClearedAudioEvent

```typescript
{
  event: 'clearedAudio';
  sequenceNumber: number;
  streamId: string;
}
```

### Exported Types

```typescript
import type {
  StartEvent,
  MediaEvent,
  DTMFEvent,
  PlayedStreamEvent,
  ClearedAudioEvent,
  PlayAudioEvent,
  CheckpointEvent,
  ClearAudioEvent,
  IncomingEventEnum,
  OutgoingEventEnum,
} from 'plivo-stream-sdk-node';
```

## Example

The `examples/express-streaming` directory contains a complete voice AI example using:
- Deepgram - Real-time speech-to-text
- OpenAI - Chat completion for responses
- ElevenLabs - Text-to-speech
### Prerequisites

- Node.js 18+ or Bun
- A Plivo account with streaming enabled
- API keys for Deepgram, OpenAI, and ElevenLabs
- A way to expose your local server (e.g., ngrok)
### Setup

1. Navigate to the example directory:

   ```bash
   cd examples/express-streaming
   ```

2. Install dependencies:

   ```bash
   npm install
   # or
   bun install
   ```

3. Create a `.env` file:

   ```
   PORT=8000

   # Deepgram (https://console.deepgram.com)
   DEEPGRAM_API_KEY=your_deepgram_api_key
   DEEPGRAM_MODEL=nova-2

   # OpenAI (https://platform.openai.com)
   OPENAI_API_KEY=your_openai_api_key
   OPENAI_MODEL=gpt-4o-mini

   # ElevenLabs (https://elevenlabs.io)
   ELEVENLABS_API_KEY=your_elevenlabs_api_key
   ELEVENLABS_VOICE_ID=your_voice_id
   ELEVENLABS_MODEL_ID=eleven_turbo_v2
   ```

4. Start the server:

   ```bash
   npx ts-node server.ts
   # or with bun
   bun run server.ts
   ```

5. Expose your server using ngrok:

   ```bash
   ngrok http 8000
   ```

6. Configure Plivo:
   - Go to your Plivo console
   - Set your application's Answer URL to `https://your-ngrok-url.ngrok.io/stream`

7. Make a call to your Plivo number and start talking!
### How it works

1. When a call comes in, Plivo hits the `/stream` endpoint
2. The XML response initiates a bidirectional WebSocket stream
3. Audio from the caller is sent to Deepgram for transcription
4. Transcriptions are sent to OpenAI for a response
5. OpenAI's response is converted to speech via ElevenLabs
6. The audio is streamed back to the caller in real time
7. Press `*` to clear the audio queue (interrupt the AI)
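In terms of the SDK, that pipeline maps onto the event handlers roughly as follows. This is a simplified sketch: `transcribe`, `complete`, and `synthesize` are hypothetical wrappers around the Deepgram, OpenAI, and ElevenLabs clients, and the real example streams audio to Deepgram continuously rather than per chunk.

```typescript
plivoServer
  .onMedia(async (event, ws) => {
    const text = await transcribe(event.getRawMedia()); // speech-to-text (hypothetical wrapper)
    if (!text) return;
    const reply = await complete(text);                 // LLM response (hypothetical wrapper)
    const audio = await synthesize(reply);              // text-to-speech, mu-law @ 8000 Hz (hypothetical wrapper)
    plivoServer.playAudio(ws, 'audio/x-mulaw', 8000, audio);
  })
  .onDtmf((event, ws) => {
    // Pressing * interrupts the AI by clearing queued audio
    if (event.dtmf.digit === '*') plivoServer.clearAudio(ws);
  });
```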
## License

MIT