Agent Voice Response - Ultravox Speech-to-Speech Integration

This repository integrates Agent Voice Response (AVR) with Ultravox's real-time speech-to-speech API. The application sends the caller's audio to Ultravox and streams the model's spoken, context-aware response back in real time.

Prerequisites

To set up and run this project, you will need:

Node.js and npm installed
An Ultravox API key with access to the real-time API
WebSocket support in your environment

Setup

1. Clone the Repository

git clone https://github.com/agentvoiceresponse/avr-sts-ultravox.git
cd avr-sts-ultravox

2. Install Dependencies

npm install

3. Configure Environment Variables

Create a .env file in the root of the project to store your API keys and configuration. You will need to add the following variables:

ULTRAVOX_API_KEY=your_ultravox_api_key
ULTRAVOX_AGENT_ID=your_ultravox_agent_id
PORT=6031

Replace your_ultravox_api_key and your_ultravox_agent_id with your actual Ultravox API key and agent ID.
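As an illustration of how these variables might be read at startup, here is a minimal sketch. The helper name loadConfig is hypothetical and not taken from the project's code; it assumes the values are already present in process.env (however the project loads the .env file) and that the two Ultravox variables are mandatory while PORT falls back to 6031.

```javascript
// Hypothetical helper (not from the project): validates the three variables
// named in this README and returns them as a plain object.
function loadConfig(env = process.env) {
  const required = ["ULTRAVOX_API_KEY", "ULTRAVOX_AGENT_ID"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  // PORT is optional; fall back to the default listed in this README.
  const port = Number.parseInt(env.PORT || "6031", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT value: ${env.PORT}`);
  }

  return {
    apiKey: env.ULTRAVOX_API_KEY,
    agentId: env.ULTRAVOX_AGENT_ID,
    port,
  };
}
```

Failing fast on a missing key at startup tends to be easier to diagnose than a rejected WebSocket connection later on.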

4. Running the Application

Start the application by running the following command:

node index.js

The server listens on the port set by the PORT environment variable (default: 6031).

How It Works

The Agent Voice Response system integrates with Ultravox's Real-time Speech-to-Speech API to provide intelligent audio-based responses to user queries. The server receives audio input from users, forwards it to Ultravox's API, and then returns the model's response as audio in real-time using WebSocket communication.
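The two directions of this flow can be sketched as follows. This is only an outline under stated assumptions, not the project's actual code: bridgeAudio is a hypothetical name, and the ultravox object is a hypothetical wrapper around the WebSocket that exposes send(chunk) and emits an "audio" event. The demo uses in-memory stand-ins rather than a real call or network connection.

```javascript
const { EventEmitter } = require("node:events");

// Wires caller audio to the model and model audio back to the caller.
// clientIn:  emits "data" with audio chunks from the caller
// clientOut: has write(chunk) for audio going back to the caller
// ultravox:  HYPOTHETICAL WebSocket wrapper with send(chunk), emitting "audio"
// toModelRate / toClientRate: sample-rate converters applied per chunk
function bridgeAudio({ clientIn, clientOut, ultravox, toModelRate, toClientRate }) {
  clientIn.on("data", (chunk) => ultravox.send(toModelRate(chunk)));
  ultravox.on("audio", (chunk) => clientOut.write(toClientRate(chunk)));
}

// Demo with in-memory stand-ins.
const clientIn = new EventEmitter();
const ultravox = new EventEmitter();
const sentToModel = [];
const sentToClient = [];
ultravox.send = (chunk) => sentToModel.push(chunk);
const clientOut = { write: (chunk) => sentToClient.push(chunk) };
const passthrough = (chunk) => chunk; // stands in for real resampling

bridgeAudio({
  clientIn,
  clientOut,
  ultravox,
  toModelRate: passthrough,
  toClientRate: passthrough,
});

clientIn.emit("data", Buffer.from([1, 2])); // caller speaks
ultravox.emit("audio", Buffer.from([3, 4])); // model replies
```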

Key Components

Express.js Server: Handles incoming audio streams from clients
WebSocket Communication: Manages real-time communication with Ultravox's API
Audio Processing: Converts audio between 8kHz and 48kHz
Real-time Streaming: Processes and streams audio data in real-time

Audio Processing

The application includes two main audio processing functions:

Upsampling (8kHz to 48kHz):
- Converts client audio from 8kHz to 48kHz using linear interpolation
- Required for Ultravox's API, which expects 48kHz input
Downsampling (48kHz to 8kHz):
- Converts Ultravox's 48kHz output back to 8kHz
- Ensures compatibility with client audio systems (Asterisk AudioSocket Module)
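A minimal sketch of both conversions follows. It assumes 16-bit signed little-endian mono PCM and an integer rate ratio (6 for 8kHz and 48kHz); the function names are illustrative. The upsampler uses linear interpolation as described above. The README does not say how downsampling is done, so plain decimation (keeping every sixth sample) is used here as an assumption.

```javascript
// Upsample 16-bit little-endian mono PCM by an integer factor,
// filling the gaps between samples by linear interpolation.
function upsample(input, factor) {
  const inSamples = Math.floor(input.length / 2);
  const output = Buffer.alloc(inSamples * factor * 2);
  for (let i = 0; i < inSamples; i++) {
    const current = input.readInt16LE(i * 2);
    // The last sample has no successor in this chunk, so it is held flat.
    const next = i + 1 < inSamples ? input.readInt16LE((i + 1) * 2) : current;
    for (let k = 0; k < factor; k++) {
      const value = Math.round(current + ((next - current) * k) / factor);
      output.writeInt16LE(value, (i * factor + k) * 2);
    }
  }
  return output;
}

// Downsample by keeping every `factor`-th sample (decimation, no filtering).
function downsample(input, factor) {
  const outSamples = Math.floor(input.length / 2 / factor);
  const output = Buffer.alloc(outSamples * 2);
  for (let i = 0; i < outSamples; i++) {
    output.writeInt16LE(input.readInt16LE(i * factor * 2), i * 2);
  }
  return output;
}

// Example: two 8kHz samples become twelve 48kHz samples and back again.
const pcm8k = Buffer.alloc(4);
pcm8k.writeInt16LE(0, 0);
pcm8k.writeInt16LE(600, 2);
const pcm48k = upsample(pcm8k, 6);
const roundTrip = downsample(pcm48k, 6);
```

Decimation without a low-pass filter can introduce aliasing when the 48kHz signal contains energy above 4kHz, so a production implementation may want to filter first.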

API Endpoints

POST `/speech-to-speech-stream`

This endpoint accepts an audio stream and returns a streamed audio response generated by Ultravox.
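For manual testing, a client could call the endpoint along these lines. The README does not specify the request headers or body encoding, so the host, port, and Content-Type below are assumptions, and postAudioStream is a hypothetical helper name.

```javascript
const http = require("node:http");

// Streams `audioStream` (any readable stream of audio bytes) to the endpoint
// and resolves with the response, which is itself a readable stream.
// ASSUMPTIONS: localhost, port 6031, and the Content-Type are illustrative only.
function postAudioStream(audioStream, { host = "localhost", port = 6031 } = {}) {
  return new Promise((resolve, reject) => {
    const req = http.request(
      {
        host,
        port,
        method: "POST",
        path: "/speech-to-speech-stream",
        headers: { "Content-Type": "application/octet-stream" },
      },
      resolve // called with the http.IncomingMessage once headers arrive
    );
    req.on("error", reject);
    audioStream.pipe(req); // ends the request when the input stream ends
  });
}
```

The returned response can be piped to a file or an audio sink, for example `response.pipe(fs.createWriteStream("reply.raw"))`, where the file name is arbitrary.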

Customizing the Application

Environment Variables

You can customize the application behavior using the following environment variables:

ULTRAVOX_API_KEY: Your Ultravox API key (required)
ULTRAVOX_AGENT_ID: Your Ultravox Agent ID (required)
PORT: The port on which the server will listen (default: 6031)

Error Handling

The application handles errors in the following areas:

WebSocket connection issues
Audio processing errors
Ultravox API errors
Stream processing errors

All errors are logged to the console and appropriate error messages are returned to the client.
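The log-and-report pattern described above might look like the sketch below. The function handleStreamError is hypothetical, the JSON shape of the error body is an assumption, and res is assumed to behave like an Express response object.

```javascript
// Logs the error to the console and reports it to the client when possible.
// `source` names the failing area, e.g. "WebSocket" or "Audio processing".
function handleStreamError(source, err, res) {
  console.error(`[${source}]`, err.message);

  if (res.headersSent) {
    // Audio is already streaming, so the status code can no longer change;
    // the only option left is to end the response.
    res.end();
    return;
  }

  // Assumed error body shape; the real application may use a different one.
  res.status(500).json({ error: `${source} error`, message: err.message });
}
```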

Support & Community

GitHub: https://github.com/agentvoiceresponse - Report issues, contribute code.
Discord: https://discord.gg/DFTU69Hg74 - Join the community discussion.
Docker Hub: https://hub.docker.com/u/agentvoiceresponse - Find Docker images.
Wiki: https://wiki.agentvoiceresponse.com/en/home - Project documentation and guides.

Support AVR

AVR is free and open source. If you find it valuable, consider supporting its development.

License

MIT License - see the LICENSE file for details.
