13 May 05:20

iamprashant

Immutable

v2.3.0

5d14f77

v2.3.0 - Bring Your Own LLM Infrastructure Latest

Latest

What's Changed

This release ships three major updates to Rapida.

Custom LLM

You can now bring your own LLM into Rapida with the new Custom LLM provider.

Supported API compatibility:

OpenAI Chat Completions (/v1/chat/completions)
OpenAI Responses (/v1/responses)
Anthropic Messages (/v1/messages)
Google Gemini (generateContent)
OpenAI Compatible (Ollama, vLLM, LM Studio, TGI)

You can point Rapida at your own base URL, pass optional headers, and keep the same workflow inside the product while changing the model layer underneath.

Related PR: #108

Ambient Audio

We also added ambient audio so calls do not feel silent or broken in production.

Adds background presence to live calls
Useful for receptionist, support, concierge, and outbound flows
Helps phone and web deployments feel more natural

Related PR: #113

Assistant Authentication

You can now authenticate a session before the agent starts.

Configure an HTTP authentication endpoint for inbound or outbound sessions
Pass headers, request body, timeout, and condition rules
Control fail behavior with Block or Do nothing
Verify sessions before initialization instead of letting every call go straight to the agent

This makes it easier to add your own policy, routing, verification, or access control step before Rapida starts the conversation.

Related PR: #116

Why This Matters

Bring your own model infrastructure into Rapida
Improve the live call experience without extra media plumbing
Add a verification layer before the assistant starts a session

Breaking Changes

None.

Upgrade Guide

Self-hosted

git pull origin main
docker compose up -d --build

Rapida Cloud

No action required.

New Contributors

@eschmidbauer made their first contribution in #108

Full Changelog: v2.2.0...v2.3.0

Contributors

eschmidbauer

Assets 3

28 Apr 10:30

iamprashant

Immutable

v2.2.0

d23874a

v2.2.0-beta — Inbound Voice AI Infrastructure

This release focuses on three infrastructure changes that make inbound voice AI production-ready.

Breaking change: transfer_to replaces to in transfer tool calls. Update your configs before upgrading.

Distributed SIP Registration

The registration flow is now based on explicit ownership and reconciliation:

Each DID is claimed through Redis-backed ownership
One instance becomes the active owner for that DID
The owner performs SIP registration and keeps it active
If that owner disappears, another instance can claim the DID and restore service

This is a key step toward making inbound voice AI reliable in multi-instance deployments.

Multi-Server Ownership and Failover

The registration pipeline now separates ownership, registration, and active-state handling more cleanly:

Clearer ownership semantics per DID
Safer failover across replicas
Fewer single-node assumptions
More predictable behavior in horizontally scaled deployments

Multi-Target SIP Transfer Failover

SIP transfer now supports multiple ordered transfer_to targets:

Try target 1
If target 1 fails or does not answer, try target 2
Continue until a target connects or the list is exhausted

Defined Post-Transfer Behavior

Transfer now has explicit post-transfer behavior for SIP:

end_call
resume_ai

Engineer-Facing Changelog

SIP Registration

Added distributed SIP registration manager
Added Redis-backed DID ownership flow
Added multi-server registration pipeline
Separated claim ownership, register, and mark-active stages
Improved reconcile behavior for active and removed registrations
Improved ownership release behavior for failover and cleanup

SIP Transfer

Added support for ordered multi-target transfer via transfer_to
Added per-target transfer retry/failover flow
Added explicit post-transfer behavior with end_call and resume_ai
Improved transfer handling in SIP pipeline and streamer path
Aligned transfer behavior for channels where multi-target transfer does not apply

Telephony and Session Behavior

Improved metadata alignment across telephony paths
Improved transfer-related session state handling
Improved observer/event behavior around transfer lifecycle
Reduced undefined behavior after operator hangup

Tooling and API Cleanup

Renamed transfer argument from to to transfer_to
Improved transfer tool call handling
Improved downstream transfer orchestration consistency

Stability and Tests

Added and updated tests around transfer behavior
Improved dispatch-related test coverage
Fixed call-flow edge cases around tool calls and injected messages
Refined dispatch behavior for transfer and tool-call flows

Full Changelog: v2.1.0...v2.2.0

Assets 3

01 Apr 05:59

iamprashant

Immutable

v2.1.0

ac19987

v2.1.0 — Built-In Observability

Rapida v2.1.0 — Built-In Observability. Richer Than Most Managed Platforms.

Rapida is the only open-source voice AI platform where you self-host the entire stack, see per-stage latency on every call, swap any provider via config, and own your data completely.

No more external media servers. No more fragmented systems stitched together with glue code. Just engineering.

Per-Stage Telemetry

Every call now tracks granular, per-stage latency across the entire voice pipeline. No external tooling required.

STT latency — time from audio frame to transcript token
LLM time-to-first-token (TTFT) — inference latency per turn
TTS time-to-first-byte (TTFB) — synthesis latency per utterance
Duration metrics — end-to-end call stage durations with drill-down
Configurable telemetry providers — CRUD APIs to plug your own telemetry exporters per assistant
Dashboard visualization — all metrics visible in the Rapida UI, per call and aggregated

Measure your own pipeline. Identify bottlenecks. Optimize with data.

Pipeline Architecture Rewrite

The executor layer has been refactored into a streaming pipeline architecture.

LLM executor abstraction — clean separation between AgentKit, model-based, and WebSocket LLM backends
Executor-to-pipeline refactoring — the dispatch loop now routes through a unified pipeline instead of discrete executors
Pipeline optimization — reduced allocation overhead and improved streaming throughput
Input normalizer — structured input preprocessing before LLM inference

JSON-Driven Provider Configuration

Adding a new STT, TTS, or LLM provider no longer requires deep codebase knowledge.

Provider configs defined declaratively in JSON
Eliminates boilerplate when integrating new providers
Validated and tested with the existing provider matrix

Inline Noise Reduction

Integrated noise reduction into the audio input pipeline
Denoising runs inline before VAD, improving speech detection accuracy in noisy environments
New DenoiseAudioPacket and DenoisedAudioPacket packet types in the dispatch system

UX Overhaul

Simplified assistant creation — fewer steps, better defaults, streamlined flow
Model settings modal — configure LLM parameters without leaving the assistant view
Simplified deployment workflow — get to production faster
Agent workplace management — manage multiple agents from a single workspace
Analysis UX — updated create-analysis flow with better visualization
System variable suggestions — autocomplete for reserved prompt variables
Argument suggestions — inline suggestions for tool/function arguments

Bug Fixes

Google STT timeout handling
Credential dropdown in telemetry provider configuration
Knowledge tool only loads when the feature is enabled
Whitespace preservation after sentence boundaries for TTS
Missing VAD configuration parameters
Gemini LLM parameter mapping
First-time startup onboarding flow
Notification settings layout
Source indicator design alignment

Testing

142 test files changed across backend and UI
Unit tests for all critical path components
Provider config test coverage
Language fallback tests for STT
Model pipeline integration tests

Developer Experience

Skills Framework

New skills for AI-assisted development on the Rapida codebase:

Provider integration (LLM, STT, TTS, telephony, VAD)
Telemetry integration
Noise reduction integration
End-of-speech integration
System understanding and local setup

Each skill includes validation scripts, templates, and examples.

Hook Orchestration

Pre/post-implementation hooks for automated test validation
Changed-file test runners
Post-tool hints for test coverage gaps

Breaking Changes

None. Backwards-compatible with v2.0.2.

Upgrade

# Self-hosted (Docker Compose)
git pull origin main
docker compose pull
docker compose up -d

# Fresh install
git clone https://github.com/rapidaai/voice-ai.git
cd voice-ai
cp .env.example .env
docker compose up -d

What's Next

Lower latency and higher concurrency in the agent runtime
Local model deployment for on-prem and air-gapped environments
Extended telemetry: custom dashboards, alerting, export to Datadog/Grafana
Improved documentation at doc.rapida.ai

Full Changelog: v2.0.2...v2.1.0

Star the repo: https://github.com/rapidaai/voice-ai
Docs: https://doc.rapida.ai

Assets 3

17 Mar 09:34

iamprashant

Immutable

v2.0.2

46d1e54

v2.0.2 — Smarter Listening, Better Testing

Rapida now hears better and knows when to stop listening. This release introduces pluggable Voice Activity Detection and End-of-Speech engines, a comprehensive provider test suite, and key infrastructure upgrades.

Highlights

Pluggable VAD & End-of-Speech — Your agent now has ears that actually know when you're done talking.

Engine	Type	How it works
LiveKit EOS	End-of-Speech	ONNX-based turn detection with chat-aware inference
Pipecat EOS	End-of-Speech	Mel-spectrogram analysis for precise speech boundary detection
Silence-based EOS	End-of-Speech	Configurable silence threshold fallback
TEN VAD	Voice Activity	Lightweight real-time voice activity detection
FireRed VAD	Voice Activity	ONNX-based VAD with fbank feature extraction

All models are bundled and downloaded at build time — zero runtime fetching.

Audio Heartbeat — A new keepalive mechanism prevents premature end-of-speech triggers during natural pauses, making conversations feel more human.

Testing & Reliability

Full STT/TTS test coverage — Integration and unit tests across all providers: Google, Deepgram, ElevenLabs, Cartesia, AssemblyAI, Azure, Sarvam, Rime, Speechmatics
Google STT auto-reconnect — Automatically recovers from "Stream timed out" errors during long calls
Stream fixes for static packet dispatch and ElevenLabs TTS

Infrastructure

Go 1.25.8 across all services and Docker base images
CI pipeline updated for new Go version
Knowledge/telemetry enabled in dev config by default

Web Widget & Deployment

Idle timeout backoff configuration for web plugin deployments
Fixed ideal_timeout → idle_timeout typo across entities (migrations 000009, 000010)
Production deployment testing and fixes

UI Polish

Consistent card list design across all listing pages
Config form multi-input select fix
Datepicker styling alignment
Integration bridge updates for document-api
New VAD/EOS configuration panels with sensible defaults

SDKs & Examples

Updated SDKs (Python, React, React Widget) and examples (Go, Node.js, Python, React) to latest versions.

Community

Join us on Discord
Book a meeting with the team

Upgrade Guide

git pull origin main
docker compose down
docker compose up -d --build

Migrations 000009 and 000010 for assistant-api run automatically on startup.

Full diff: v2.0.1-pre...v2.0.2

Assets 3

17 Mar 09:31

iamprashant

Immutable

v2.0.2-pre

46d1e54

v2.0.2-pre — VAD & End-of-Speech Engines, STT/TTS Test Suite, Go 1.25.8 Pre-release

Pre-release

What's Changed in v2.0.2-pre

Voice Activity Detection (VAD) & End-of-Speech Engines

The voice pipeline now supports pluggable VAD and end-of-speech (EOS) detection, giving you fine-grained control over when the agent starts and stops listening.

New EOS Engines

LiveKit EOS — ONNX-based turn detection with custom tokenizer and chat template inference (livekit/turn_detector.go)
Pipecat EOS — Mel-spectrogram-based end-of-speech detection with platform-specific ONNX inference (pipecat/mel_spectrogram.go)
Silence-based EOS — Configurable silence threshold fallback (silence_based/silence_based_end_of_speech.go)

New VAD Providers

TEN VAD — lightweight voice activity detector
FireRed VAD — ONNX-based VAD with fbank feature extraction and postprocessor

All VAD/EOS ONNX models are now bundled in the repo and downloaded at Docker build time — no runtime model fetching required.

48df33c0 1ef73aec 03332e79 41980364 f9c53e5a b047e755

Audio Heartbeat

Added an audio heartbeat mechanism to keep the speech pipeline active and optimize end-of-speech trigger timing, preventing premature cutoffs.

03332e79 feat: audio heartbeat to optimize end of speech trigger

UI Configuration

New UI panels to configure VAD provider settings (FireRed, Silero, TEN) and EOS provider settings (LiveKit EOS) with sensible defaults.

31c2d51d 31538388

Comprehensive STT/TTS Test Suite

Added integration and unit tests across all STT, TTS, and integration service providers: Google, Deepgram, ElevenLabs, Cartesia, AssemblyAI, Azure, Sarvam, Rime, Speechmatics. Includes shared test utilities for audio fixtures, credential loading, and metric collection.

0d96809c feat: added integration and unit test for all the stt, tts and integration service
3328f404 (from v2.0.1-pre) testing and refactoring stt and tts integration

Google STT Auto-Reconnect

Google STT streams now automatically reconnect when hitting the "Stream timed out after receiving no more client requests" error, preventing silent STT failures during long calls.

ca9e1b8d feat: reconnect google stt for stream timeout

Infrastructure & Build

Go 1.25.8

Bumped Go across all services and base Docker images.

949288ad 3b591ec0

CI

Updated CI workflow to align with new Go version and enabled knowledge/telemetry in dev config.

3b591ec0 chore: bump Go to 1.25.8, fix formatting, and enable knowledge/telemetry in dev

Web Widget & Deployment

Added idle timeout backoff configuration on web plugin deployments (migration 000009)
Fixed typo: renamed ideal_timeout → idle_timeout across entities (migration 000010)
Web widget deployment production testing and fixes
a7b9707a 095b9400

UI Improvements

Card list design made consistent across all listing pages (assistants, knowledge base, integrations, credentials)
Config form multi-input select component fix
Datepicker styling fixes (flatpickr CSS alignment)
Integration bridge updated for document-api
81983940 b42ef01b 2ef61448 4df96dc1

SDKs & Examples

Updated SDKs (Python, React, React Widget) and examples (Go, Node.js, Python, React) to latest versions.

bd7152a6 feat: updated sdks and examples

Bug Fixes

55cb24b4 fix: stream fixes for static packet (ElevenLabs TTS, dispatch behavior)
46d1e541 fix: gofmt formatting across all callers and transformers
095b9400 refactor: typo fix on deployment entity, cleanup web-widget unused vars

Community

Added Discord and Cal.com booking badges to README
936160f5

Upgrade Guide

Self-hosted:

git pull origin main
docker compose down
docker compose up -d --build

Note: This release includes database migrations 000009 and 000010 for assistant-api. They will run automatically on startup.

Rapida Cloud: No action required — already deployed.

Full diff: v2.0.1-pre...v2.0.2-pre

Assets 3

09 Mar 03:24

iamprashant

Immutable

v2.0.1-pre

b23b7dd

v2.0.1-pre — Redesigned Dashboard, New Voice Engine, Delta Packets & External Telemetry Pre-release

Pre-release

New Features

Rime TTS Integration

Added Rime as the 15th TTS provider. Configure via tts.provider: rime in your assistant config.

69f453f5 feat: added rime implementation

External Telemetry & Metrics

Push call telemetry and performance metrics to your own observability stack (Prometheus, Datadog, etc.).

fe5899f8 feat: metrics and telemetry
48a542ff feat: pushing telemetry and metrics to external system

Docker Profiles

Deploy with or without the Knowledge Base module. OpenSearch is now fully optional.

# Without Knowledge Base
docker compose up -d

# With Knowledge Base
docker compose --profile knowledge up -d

a691152f feat: add docker profiles for with/without knowledge base deployment
b89cc01b feat: auto-configure env per profile using compose override file
d27ce98d fix: make OpenSearch config safely optional for non-knowledge deployments
89b37c60 feat: removing knowledge for local deployment
57614f7f feat: optional dependencies as document-api

Delta Packet Dispatching

New delta packet type for more efficient real-time audio transmission over the priority-based dispatcher.

ebf090ff feat: added delta packet
fa5ce201 feat: fixes for packet dispatching

Consistent LLM Streaming

Unified streaming behavior across all 11+ LLM providers. No more provider-specific quirks in the voice pipeline.

eff2494b feat: consistent streaming behaviour from all the llm

AgentKit Improvements

Streamlined AgentKit implementation with improved test coverage.

6443646c feat: streamline agentkit implementation
8a9eab05 feat: added test for agentkit and model
b23b7dd7 feat: added change for agentkit test and ui fixes

Debugger Updates

Richer metrics in debugger UI beyond charts. WebTalk now supports the debugger.

911ea238 feat: update ui component for debugger
a7752724 fix: design for debugger and telemetry to show more metrics than chart
8b321a55 feat: aligned webtalk to support debugger

UI Changes

IBM Carbon Design System Migration

Full migration of the dashboard UI to IBM Carbon Design System v11. Affects all pages — assistant config, debugger, telemetry, and core workflows.

14a578c7 feat: migration to IBM Carbon design pattern
86a0d6bd feat: refactor design to IBM carbon design philosophy
01ccb972 feat: refactor design to IBM carbon design philosophy
5df065e1 feat: added change to align with IBM carbon design

Performance

Docker Build Optimization

Switched to rapidaai/rapida-* base images. Removed unnecessary exposed ports. Pinned linux/amd64 for consistent local builds.

a42e98e6 feat: docker build optimization with rapidaai/rapida-* base images
908fe0f7 feat: optimizing build time
7074f973 feat: optimizing build time
9a332f36 fix: simplifying building process
40cd927c fix: pin linux/amd64 platform for local builds and workflow pushes
cd2def0a feat: removed exposing ports which is not required

Audio Pipeline

Simplified audio/text stream switching. Updated default resampler. Consistent 60ms duration threshold across all input.

2a0c507d feat: improving audio/text switch
4beeacaa feat: simplified switching from text stream to audio stream
c24bf5a9 feat: change in default resampler
418cd7fd fix: added consistent duration and threshold 60ms for all the input

Bug Fixes

Security

5105a99e fix: Vulnerability #1: GO-2026-4337

Recording

cd640a36 fix: sync recording as close to user listening
3d223998 fix: sync recording as close to user listening
088cf103 fix: audio recorder pacing for tts
1a3ffbce fix: serving recording from local storage

Voice Pipeline

d332939a feat: handling conversation error at end client
6bcb3c41 feat: timeout fixes after the complete audio is played
80fed27e fix: updated the callback for packet

STT/TTS

3328f404 feat: testing and refactoring stt and tts integration
9b74bdb5 feat: added few more stt and tts
69bb1ee0 feat: test fixes for silero and model

Infrastructure

c6a77e16 feat: opensearch docker fix
e4440f37 fix: minor fix in nginx and proto updated
28b20e89 fix: increase time to support IE, safari
1f144d87 feat: updated dependencies for document api

Documentation

d03f4a53 feat: add platform architecture diagram to README and docs
c776f104 feat: added architecture design
4da07be0 feat: added docs reference
5631cdce ref: update submodule doc reference

Upgrade Guide

Self-hosted:

git pull origin main
docker compose up -d --build

If you were previously running with Knowledge Base and want to use the new profiles:

# Stop existing
docker compose down

# Start with explicit profile
docker compose --profile knowledge up -d --build

Rapida Cloud: No action required — already deployed.

Full diff: v2.0.0...main

Assets 3

24 Feb 04:13

iamprashant

Immutable

v2.0.0

b68a1bf

v2.0.0 — Telephony Reliability, SIP, WebRTC & Asterisk

What's Changed

Telephony: Rebuilt from the Ground Up

Unified channel architecture shared across Twilio, Vonage, Exotel, Asterisk, and SIP
Interruptions, end-of-call signals, and transfer/hangup events handled consistently across all providers
New `call_contexts` table persists call state — async provider callbacks resolve correctly even after call ends
Channel UUIDs propagate end-to-end for reliable transfer and hangup operations

New: SIP Integration

Full native SIP stack with RTP handling, SDP negotiation, port allocator, and session management.

New: Asterisk / AudioSocket

Native integration with Asterisk via AudioSocket and WebSocket. Inbound and outbound call flows tested.

New: WebRTC Channel

Browser-based voice with Opus codec support and gRPC signalling, sharing the same hardened base as telephony.

Audio Pipeline: Deterministic Framing

Exact 20 ms output frames with zero per-frame heap allocations
Atomic interruption — `ClearOutputBuffer` drains buffers and signals output writer instantly
Per-speaker recordings split into `assistant_recording_url` + `user_recording_url`

LLM Text Aggregator

Sentence-boundary aggregation between LLM stream and TTS — reduces first-word latency with configurable delimiters and clean context-switch flush.

Test Coverage

31 `BaseStreamer` unit tests
Full telephony provider test suite (Twilio, Vonage, Exotel)
Transformer tests for AssemblyAI, Azure, Cartesia, Deepgram, ElevenLabs, Google, Resemble, Sarvam
LLM text aggregator: 972 lines of unit tests + 381 lines of benchmarks

Bug Fixes

Google TTS stale response fix for outputs > 5 sentences
AgentKit executor stability fixes
First-token response time now tracked in LLM telemetry
MCP tool support for agent tool invocations

Breaking Changes / Migrations

Migration	Change
`000005`	New `call_contexts` table required
`000006`	`recording_url` split into `assistant_recording_url` + `user_recording_url`

Assets 3

26 Jan 04:17

iamprashant

Immutable

v0.1.3

51d1cc4

Rapida v0.1.3

New Features

Model Context Protocol (MCP) & Remote Agent Execution

WebSocket-Based LLM Executor — Enable real-time, low-latency communication with language models via WebSocket integration for streaming responses
Remote Executor and AgentKit (gRPC) — Run agents and models remotely with improved deployment flexibility and scalability

MCP Tool Implementation

New tools added to expand integration capabilities with external services and APIs

Improvements

Frontend & Dependency Updates

React Dependency Upgrades — Updated to the latest React dependencies for improved security and performance
Cleaner Logging — Removal of unnecessary logs for a more focused development experience
ESLint Fixes — Addressed outstanding lint errors to maintain codebase hygiene

CI/CD and Quality-of-Life Enhancements

Optimized Build Pipeline — CI updated to skip CGO-dependent packages and make Trivy scans non-blocking for faster, more reliable builds
Go Linting Improvements — Comprehensive auto-formatting and convention enforcement using golangci-lint, standardized to Go 1.25 in Docker and CI
Dependency Security — Packages updated and audit processes improved for enhanced security postures

Stability & Refactoring

Multiple under-the-hood improvements to enhance reliability and maintainability

Upgrade Considerations

No breaking changes — Applications using existing features remain fully compatible
Validation recommended — Applications utilizing new LLM execution paths or remote deployment features should be tested
Reinstall dependencies — Developers should update dependencies with npm install and go mod download

Assets 3

19 Jan 05:40

iamprashant

Immutable

v0.1.2

f8a9e9d

Rapida v0.1.2

For Product Managers

New Features & Capabilities

Session Management Controls - Max Session Duration, Idle Timeout, Timeout Message, Timeout Backoff
Provider-Specific SSML Normalizers - Intelligent text normalizers per TTS provider for natural-sounding voice output across Azure, Google, and other providers
Google STT Model Validation - All Google Speech-to-Text models tested with optimized default confidence threshold of 0.5
Improved Turn Detection - Optimized conversation turn detection for natural human-AI voice interactions

New Provider Support

Sarvam AI - Text-to-speech and speech-to-text (Indian language specialist)
AssemblyAI - Speech-to-text provider with comprehensive language support
Cartesia - Speech-to-text model support
Azure Foundry & Vertex AI - Expanded text model options for LLM interactions

Telephony Improvements

Unified Call Handling - Merged inbound and outbound call logic (Exotel)
Intelligent Timeout Backoff - Better call experience with adaptive timeouts

Dashboard & UI

V3 Dashboard - New experience with telephony visibility and STT validation
Sentence Tokenizer for Debugger - Enhanced conversation analysis
UI Message Sequencing - Improved message flow visualization

For Developers

New Features

maxSessionDuration - Maximum allowed duration for a conversation session (in seconds). Enforces hard limit on conversation length to manage resources and costs
idealTimeout - Idle timeout duration (in seconds). If no user input is detected within this period, the system prompts the user
idealTimeoutMessage - Custom message displayed/spoken when idle timeout is triggered (e.g., "Are you still there?")
idealTimeoutBackoff - Backoff interval (in seconds) after showing the timeout message before taking further action. Provides a grace period for user response

Backend Changes (Go)

Model Executor - Fixed race conditions in concurrent execution
Tool System - Refactored tool call creation, editing, and result handling
End of Speech Detection - New system with configurable providers
Config Validation - Added comprehensive config tests

Frontend Changes (React/TypeScript)

Tool Components - Unified components with shared hooks and types
Provider Configs - New JSON configs for STT/TTS models
Sidebar Context - New context for sidebar state management

Performance Enhancements

Text Conversations - No longer initializes audio transformer (performance improvement)
Multi-Message UI - Fixed alignment for multiple messages per ID

Dependencies & Security

React SDK submodule updated
Node packages updated (yarn.lock)
Dependabot security patches applied
Added CodeQL analysis
Fixed OAuth2 authentication flows

Summary

This release introduces comprehensive session management controls, expands provider support with new TTS/STT integrations, improves telephony handling, and delivers significant backend optimizations and performance improvements.

Assets 3

07 Jan 07:25

iamprashant

Immutable

v0.1.1

e783d95

Rapida v0.1.1

New Integrations
Telephony

Exotel integration
Inbound and outbound call support
Streaming audio pipeline wired into Rapida orchestrator

Call lifecycle events mapped cleanly to agent state
Speech-to-Text

Sarvam STT integration
Streaming transcription support
Partial and final transcript handling

Improved latency consistency under load

Bug Fixes & Stability Improvements
Fixed audio stream desync issues during long-running calls
Resolved intermittent end-of-speech detection edge cases
Improved error handling when STT or TTS streams restart
Fixed state leaks on abrupt call termination
Reduced noisy logs during high-frequency audio ingestion

Reliability & Internal Improvements

Safer handling of external provider timeouts
Better retries and backoff for integration failures
Clearer failure signals surfaced to the orchestrator
Minor performance optimizations in streaming pipeline

Full Changelog: v1.0.0...v0.1.1

Assets 3

Uh oh!

Releases: rapidaai/voice-ai

v2.3.0 - Bring Your Own LLM Infrastructure

What's Changed

Custom LLM

Ambient Audio

Assistant Authentication

Why This Matters

Breaking Changes

Upgrade Guide

Self-hosted

Rapida Cloud

New Contributors

Contributors

Uh oh!

v2.2.0-beta — Inbound Voice AI Infrastructure

Distributed SIP Registration

Multi-Server Ownership and Failover

Multi-Target SIP Transfer Failover

Defined Post-Transfer Behavior

Engineer-Facing Changelog

SIP Registration

SIP Transfer

Telephony and Session Behavior

Tooling and API Cleanup

Stability and Tests

Uh oh!

v2.1.0 — Built-In Observability

Rapida v2.1.0 — Built-In Observability. Richer Than Most Managed Platforms.

Per-Stage Telemetry

Pipeline Architecture Rewrite

JSON-Driven Provider Configuration

Inline Noise Reduction

UX Overhaul

Bug Fixes

Testing

Developer Experience

Skills Framework

Hook Orchestration

Breaking Changes

Upgrade

What's Next

Uh oh!

v2.0.2 — Smarter Listening, Better Testing

v2.0.2 — Smarter Listening, Better Testing

Highlights

Testing & Reliability

Infrastructure

Web Widget & Deployment

UI Polish

SDKs & Examples

Community

Upgrade Guide

Uh oh!

v2.0.2-pre — VAD & End-of-Speech Engines, STT/TTS Test Suite, Go 1.25.8

What's Changed in v2.0.2-pre

Voice Activity Detection (VAD) & End-of-Speech Engines

New EOS Engines

New VAD Providers

Audio Heartbeat

UI Configuration

Comprehensive STT/TTS Test Suite

Google STT Auto-Reconnect

Infrastructure & Build

Go 1.25.8

CI

Web Widget & Deployment

UI Improvements

SDKs & Examples

Bug Fixes

Community

Upgrade Guide

Uh oh!

v2.0.1-pre — Redesigned Dashboard, New Voice Engine, Delta Packets & External Telemetry

New Features

Rime TTS Integration

External Telemetry & Metrics

Docker Profiles

Delta Packet Dispatching

Consistent LLM Streaming