Releases: rapidaai/voice-ai

v2.3.0 - Bring Your Own LLM Infrastructure

13 May 05:20
5d14f77

What's Changed

This release ships three major updates to Rapida.

Custom LLM

You can now bring your own LLM into Rapida with the new Custom LLM provider.

Supported API compatibility:

  • OpenAI Chat Completions (/v1/chat/completions)
  • OpenAI Responses (/v1/responses)
  • Anthropic Messages (/v1/messages)
  • Google Gemini (generateContent)
  • OpenAI Compatible (Ollama, vLLM, LM Studio, TGI)

You can point Rapida at your own base URL, pass optional headers, and keep the same workflow inside the product while changing the model layer underneath.
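As a rough illustration of the base-URL-plus-headers idea, the sketch below joins a user-supplied base URL with the endpoint paths listed above. The function name, the dictionary keys, and the default Content-Type header are assumptions made for this example, not Rapida's configuration schema. Gemini is left out because the notes name only the generateContent method, not a full path.

```python
# Endpoint paths are taken from the compatibility list above.
# The keys ("openai_chat", ...) are hypothetical labels for this sketch.
API_PATHS = {
    "openai_chat": "/v1/chat/completions",
    "openai_responses": "/v1/responses",
    "anthropic_messages": "/v1/messages",
}


def build_request(base_url, api, headers=None):
    """Return (url, headers) for a custom LLM endpoint.

    base_url: where your own model server listens (for example a vLLM host).
    api:      one of the keys in API_PATHS.
    headers:  optional extra headers such as an Authorization token.
    """
    if api not in API_PATHS:
        raise ValueError(f"unsupported API flavor: {api}")
    # Strip a trailing slash so the joined URL never contains "//v1".
    url = base_url.rstrip("/") + API_PATHS[api]
    merged = {"Content-Type": "application/json"}
    merged.update(headers or {})
    return url, merged
```

Keeping the path table separate from the base URL mirrors the point made above: the model layer can change while the rest of the workflow stays the same.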

Related PR: #108

Ambient Audio

We also added ambient audio so calls do not feel silent or broken in production.

  • Adds background presence to live calls
  • Useful for receptionist, support, concierge, and outbound flows
  • Helps phone and web deployments feel more natural

Related PR: #113

Assistant Authentication

You can now authenticate a session before the agent starts.

  • Configure an HTTP authentication endpoint for inbound or outbound sessions
  • Pass headers, request body, timeout, and condition rules
  • Control fail behavior with Block or Do nothing
  • Verify sessions before initialization instead of letting every call go straight to the agent

This makes it easier to add your own policy, routing, verification, or access control step before Rapida starts the conversation.
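The gate described above can be pictured as a small decision function. This is a sketch under stated assumptions: the names are hypothetical, a 2xx status is assumed to mean "verified", a timeout is treated as a failed check, and "Do nothing" is assumed to mean the session continues even when the check fails. None of these details are confirmed by the release notes.

```python
from enum import Enum


class FailBehavior(Enum):
    BLOCK = "Block"
    DO_NOTHING = "Do nothing"


def should_start_session(check, fail_behavior):
    """Decide whether the agent may start.

    check: zero-argument callable that performs the HTTP call and returns the
           status code; it may raise TimeoutError when the timeout elapses.
    """
    try:
        ok = 200 <= check() < 300
    except TimeoutError:
        ok = False  # assumption: a timed-out check counts as a failure
    if ok:
        return True
    # Assumption: "Do nothing" lets the session continue despite a failed check.
    return fail_behavior is FailBehavior.DO_NOTHING
```

Passing the HTTP call in as a callable keeps the policy testable without a network.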

Related PR: #116

Why This Matters

  • Bring your own model infrastructure into Rapida
  • Improve the live call experience without extra media plumbing
  • Add a verification layer before the assistant starts a session

Breaking Changes

None.

Upgrade Guide

Self-hosted

git pull origin main
docker compose up -d --build

Rapida Cloud

No action required.

Full Changelog: v2.2.0...v2.3.0

v2.2.0-beta — Inbound Voice AI Infrastructure

28 Apr 10:30
d23874a

This release focuses on three infrastructure changes that make inbound voice AI production-ready.


Breaking change: `transfer_to` replaces `to` in transfer tool calls. Update your configs before upgrading.
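If your transfer tool-call arguments are stored as JSON-like dictionaries, a one-off rename along these lines may help. The helper name and the dictionary shape are assumptions for illustration; check your own config format before relying on it.

```python
def migrate_transfer_args(args):
    """Return a copy of a transfer argument dict that uses transfer_to.

    The value is carried over unchanged; only the key is renamed.
    An existing transfer_to entry is never overwritten.
    """
    migrated = dict(args)  # work on a copy so the caller's dict is untouched
    if "to" in migrated and "transfer_to" not in migrated:
        migrated["transfer_to"] = migrated.pop("to")
    return migrated
```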


Distributed SIP Registration

The registration flow is now based on explicit ownership and reconciliation:

  • Each DID is claimed through Redis-backed ownership
  • One instance becomes the active owner for that DID
  • The owner performs SIP registration and keeps it active
  • If that owner disappears, another instance can claim the DID and restore service

This is a key step toward making inbound voice AI reliable in multi-instance deployments.
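The claim-and-failover flow above resembles a lease. The sketch below models it with an in-memory dictionary standing in for Redis; the class, the TTL-based expiry, and the method names are assumptions, since the notes do not say which Redis primitives or timeouts the registration manager uses.

```python
class OwnershipStore:
    """In-memory stand-in for a Redis-style lease (set-if-absent with expiry)."""

    def __init__(self):
        self._leases = {}  # DID -> (owner instance id, lease expiry time)

    def claim(self, did, instance, now, ttl):
        """Try to claim or renew ownership of a DID; return True on success."""
        current = self._leases.get(did)
        free = current is None or current[1] <= now
        if free or current[0] == instance:
            self._leases[did] = (instance, now + ttl)
            return True
        return False

    def owner(self, did, now):
        """Return the active owner of a DID, or None if the lease has expired."""
        current = self._leases.get(did)
        if current is None or current[1] <= now:
            return None
        return current[0]
```

In this model the owner keeps its registration alive by calling claim again before the lease runs out; if it disappears, the lease expires and another instance's claim succeeds.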

Multi-Server Ownership and Failover

The registration pipeline now separates ownership, registration, and active-state handling more cleanly:

  • Clearer ownership semantics per DID
  • Safer failover across replicas
  • Fewer single-node assumptions
  • More predictable behavior in horizontally scaled deployments

Multi-Target SIP Transfer Failover

SIP transfer now supports multiple ordered transfer_to targets:

  • Try target 1
  • If target 1 fails or does not answer, try target 2
  • Continue until a target connects or the list is exhausted
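The ordered fallback above amounts to a simple loop. This is a minimal sketch, assuming a hypothetical dial callable that reports whether a target connected; it is not the SIP pipeline's actual code.

```python
def transfer_with_failover(targets, dial):
    """Try each target in order; return the first that connects, else None.

    dial: callable taking a target and returning True when the call connects.
          A False result or an exception counts as a failed attempt.
    """
    for target in targets:
        try:
            if dial(target):
                return target
        except Exception:
            continue  # treat errors like "no answer" and move to the next target
    return None  # list exhausted without a successful connection
```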

Defined Post-Transfer Behavior

Transfer now has explicit post-transfer behavior for SIP:

  • end_call
  • resume_ai
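One way to picture the two options is a dispatch on the configured value once the transferred leg is over. The function and callback names are hypothetical, and exactly when Rapida applies the behavior is an assumption here.

```python
def after_transfer(behavior, end_call, resume_ai):
    """Run the configured post-transfer action and return its result.

    behavior:  "end_call" or "resume_ai", as named in the release notes.
    end_call:  zero-argument callable that hangs up the call.
    resume_ai: zero-argument callable that hands the caller back to the agent.
    """
    actions = {"end_call": end_call, "resume_ai": resume_ai}
    if behavior not in actions:
        raise ValueError(f"unknown post-transfer behavior: {behavior}")
    return actions[behavior]()
```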

Engineer-Facing Changelog

SIP Registration

  • Added distributed SIP registration manager
  • Added Redis-backed DID ownership flow
  • Added multi-server registration pipeline
  • Separated claim ownership, register, and mark-active stages
  • Improved reconcile behavior for active and removed registrations
  • Improved ownership release behavior for failover and cleanup

SIP Transfer

  • Added support for ordered multi-target transfer via transfer_to
  • Added per-target transfer retry/failover flow
  • Added explicit post-transfer behavior with end_call and resume_ai
  • Improved transfer handling in SIP pipeline and streamer path
  • Aligned transfer behavior for channels where multi-target transfer does not apply

Telephony and Session Behavior

  • Improved metadata alignment across telephony paths
  • Improved transfer-related session state handling
  • Improved observer/event behavior around transfer lifecycle
  • Reduced undefined behavior after operator hangup

Tooling and API Cleanup

  • Renamed transfer argument from `to` to `transfer_to`
  • Improved transfer tool call handling
  • Improved downstream transfer orchestration consistency

Stability and Tests

  • Added and updated tests around transfer behavior
  • Improved dispatch-related test coverage
  • Fixed call-flow edge cases around tool calls and injected messages
  • Refined dispatch behavior for transfer and tool-call flows

Full Changelog: v2.1.0...v2.2.0

v2.1.0 — Built-In Observability

01 Apr 05:59
ac19987

Rapida v2.1.0 — Built-In Observability. Richer Than Most Managed Platforms.

Rapida is the only open-source voice AI platform where you self-host the entire stack, see per-stage latency on every call, swap any provider via config, and own your data completely.

No more external media servers. No more fragmented systems stitched together with glue code. Just engineering.


Per-Stage Telemetry

Every call now tracks granular, per-stage latency across the entire voice pipeline. No external tooling required.

  • STT latency — time from audio frame to transcript token
  • LLM time-to-first-token (TTFT) — inference latency per turn
  • TTS time-to-first-byte (TTFB) — synthesis latency per utterance
  • Duration metrics — end-to-end call stage durations with drill-down
  • Configurable telemetry providers — CRUD APIs to plug your own telemetry exporters per assistant
  • Dashboard visualization — all metrics visible in the Rapida UI, per call and aggregated

Measure your own pipeline. Identify bottlenecks. Optimize with data.
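To make a metric such as LLM time-to-first-token concrete, here is a small sketch that times the gap between starting to read a token stream and receiving its first item. The function name and the injectable clock are assumptions for the example; Rapida's own instrumentation is not shown in these notes.

```python
import time


def measure_ttft(stream, clock=time.monotonic):
    """Consume a token stream; return (seconds to first token, tokens).

    The first element is None when the stream yields nothing.
    """
    start = clock()
    ttft = None
    tokens = []
    for token in stream:
        if ttft is None:
            ttft = clock() - start  # recorded once, on the first token only
        tokens.append(token)
    return ttft, tokens
```

The same pattern applies to TTS time-to-first-byte: swap the token stream for an audio chunk stream.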


Pipeline Architecture Rewrite

The executor layer has been refactored into a streaming pipeline architecture.

  • LLM executor abstraction — clean separation between AgentKit, model-based, and WebSocket LLM backends
  • Executor-to-pipeline refactoring — the dispatch loop now routes through a unified pipeline instead of discrete executors
  • Pipeline optimization — reduced allocation overhead and improved streaming throughput
  • Input normalizer — structured input preprocessing before LLM inference

JSON-Driven Provider Configuration

Adding a new STT, TTS, or LLM provider no longer requires deep codebase knowledge.

  • Provider configs defined declaratively in JSON
  • Eliminates boilerplate when integrating new providers
  • Validated and tested with the existing provider matrix
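A declarative provider definition might look something like the JSON below, loaded and checked before use. The schema (name, kind, parameters) is invented for this sketch and should not be read as Rapida's actual provider format.

```python
import json

# Hypothetical schema: these keys are assumptions, not Rapida's format.
REQUIRED_KEYS = {"name", "kind", "parameters"}
KINDS = {"stt", "tts", "llm"}

EXAMPLE = """
{
  "name": "example-tts",
  "kind": "tts",
  "parameters": [
    {"key": "voice", "type": "string", "required": true},
    {"key": "speed", "type": "number", "default": 1.0}
  ]
}
"""


def load_provider_config(text):
    """Parse a provider definition and reject obviously incomplete ones."""
    config = json.loads(text)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if config["kind"] not in KINDS:
        raise ValueError(f"unknown provider kind: {config['kind']}")
    return config
```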

Inline Noise Reduction

  • Integrated noise reduction into the audio input pipeline
  • Denoising runs inline before VAD, improving speech detection accuracy in noisy environments
  • New DenoiseAudioPacket and DenoisedAudioPacket packet types in the dispatch system
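The ordering matters because VAD sees whatever the denoiser leaves behind. The toy example below uses a crude noise gate and an energy threshold purely as stand-ins; real denoisers and VAD models are far more sophisticated, and all names and thresholds here are assumptions.

```python
def noise_gate(samples, floor=500):
    """Toy denoiser: zero out samples whose magnitude is below the floor."""
    return [s if abs(s) >= floor else 0 for s in samples]


def is_speech(samples, energy_threshold=100_000):
    """Toy VAD: mean squared amplitude above a threshold counts as speech."""
    if not samples:
        return False
    return sum(s * s for s in samples) / len(samples) > energy_threshold


def process_frame(samples):
    """Denoise first, then run VAD on the cleaned frame."""
    return is_speech(noise_gate(samples))
```

With these toy thresholds, a frame of low-level noise trips the VAD when checked raw but is correctly ignored once it has passed through the gate.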

UX Overhaul

  • Simplified assistant creation — fewer steps, better defaults, streamlined flow
  • Model settings modal — configure LLM parameters without leaving the assistant view
  • Simplified deployment workflow — get to production faster
  • Agent workplace management — manage multiple agents from a single workspace
  • Analysis UX — updated create-analysis flow with better visualization
  • System variable suggestions — autocomplete for reserved prompt variables
  • Argument suggestions — inline suggestions for tool/function arguments

Bug Fixes

  • Google STT timeout handling
  • Credential dropdown in telemetry provider configuration
  • Knowledge tool only loads when the feature is enabled
  • Whitespace preservation after sentence boundaries for TTS
  • Missing VAD configuration parameters
  • Gemini LLM parameter mapping
  • First-time startup onboarding flow
  • Notification settings layout
  • Source indicator design alignment

Testing

  • 142 test files changed across backend and UI
  • Unit tests for all critical path components
  • Provider config test coverage
  • Language fallback tests for STT
  • Model pipeline integration tests

Developer Experience

Skills Framework

New skills for AI-assisted development on the Rapida codebase:

  • Provider integration (LLM, STT, TTS, telephony, VAD)
  • Telemetry integration
  • Noise reduction integration
  • End-of-speech integration
  • System understanding and local setup

Each skill includes validation scripts, templates, and examples.

Hook Orchestration

  • Pre/post-implementation hooks for automated test validation
  • Changed-file test runners
  • Post-tool hints for test coverage gaps

Breaking Changes

None. Backwards-compatible with v2.0.2.


Upgrade

# Self-hosted (Docker Compose)
git pull origin main
docker compose pull
docker compose up -d

# Fresh install
git clone https://github.com/rapidaai/voice-ai.git
cd voice-ai
cp .env.example .env
docker compose up -d

What's Next

  • Lower latency and higher concurrency in the agent runtime
  • Local model deployment for on-prem and air-gapped environments
  • Extended telemetry: custom dashboards, alerting, export to Datadog/Grafana
  • Improved documentation at doc.rapida.ai

Full Changelog: v2.0.2...v2.1.0

Star the repo: https://github.com/rapidaai/voice-ai
Docs: https://doc.rapida.ai

v2.0.2 — Smarter Listening, Better Testing

17 Mar 09:34

Rapida now hears better and knows when to stop listening. This release introduces pluggable Voice Activity Detection and End-of-Speech engines, a comprehensive provider test suite, and key infrastructure upgrades.


Highlights

Pluggable VAD & End-of-Speech — Your agent now has ears that actually know when you're done talking.

| Engine | Type | How it works |
|---|---|---|
| LiveKit EOS | End-of-Speech | ONNX-based turn detection with chat-aware inference |
| Pipecat EOS | End-of-Speech | Mel-spectrogram analysis for precise speech boundary detection |
| Silence-based EOS | End-of-Speech | Configurable silence threshold fallback |
| TEN VAD | Voice Activity | Lightweight real-time voice activity detection |
| FireRed VAD | Voice Activity | ONNX-based VAD with fbank feature extraction |

All models are bundled and downloaded at build time — zero runtime fetching.

Audio Heartbeat — A new keepalive mechanism prevents premature end-of-speech triggers during natural pauses, making conversations feel more human.


Testing & Reliability

  • Full STT/TTS test coverage — Integration and unit tests across all providers: Google, Deepgram, ElevenLabs, Cartesia, AssemblyAI, Azure, Sarvam, Rime, Speechmatics
  • Google STT auto-reconnect — Automatically recovers from "Stream timed out" errors during long calls
  • Stream fixes for static packet dispatch and ElevenLabs TTS

Infrastructure

  • Go 1.25.8 across all services and Docker base images
  • CI pipeline updated for new Go version
  • Knowledge/telemetry enabled in dev config by default

Web Widget & Deployment

  • Idle timeout backoff configuration for web plugin deployments
  • Fixed ideal_timeout → idle_timeout typo across entities (migrations 000009, 000010)
  • Production deployment testing and fixes

UI Polish

  • Consistent card list design across all listing pages
  • Config form multi-input select fix
  • Datepicker styling alignment
  • Integration bridge updates for document-api
  • New VAD/EOS configuration panels with sensible defaults

SDKs & Examples

Updated SDKs (Python, React, React Widget) and examples (Go, Node.js, Python, React) to latest versions.


Upgrade Guide

git pull origin main
docker compose down
docker compose up -d --build

Migrations 000009 and 000010 for assistant-api run automatically on startup.

Full diff: v2.0.1-pre...v2.0.2

v2.0.2-pre — VAD & End-of-Speech Engines, STT/TTS Test Suite, Go 1.25.8

17 Mar 09:31

What's Changed in v2.0.2-pre

Voice Activity Detection (VAD) & End-of-Speech Engines

The voice pipeline now supports pluggable VAD and end-of-speech (EOS) detection, giving you fine-grained control over when the agent starts and stops listening.

New EOS Engines

  • LiveKit EOS — ONNX-based turn detection with custom tokenizer and chat template inference (livekit/turn_detector.go)
  • Pipecat EOS — Mel-spectrogram-based end-of-speech detection with platform-specific ONNX inference (pipecat/mel_spectrogram.go)
  • Silence-based EOS — Configurable silence threshold fallback (silence_based/silence_based_end_of_speech.go)
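Of the three engines, the silence-based fallback is the easiest to picture. The sketch below fires once the gap since the last speech frame reaches a configurable threshold; the class name, the millisecond timestamps, and the 700 ms default are assumptions, not values taken from silence_based_end_of_speech.go.

```python
class SilenceEndOfSpeech:
    """Report end of speech after a configurable stretch of silence."""

    def __init__(self, silence_threshold_ms=700):
        self.silence_threshold_ms = silence_threshold_ms
        self._last_speech_ms = None  # None until the user has spoken

    def on_frame(self, timestamp_ms, speech):
        """Feed one frame; return True once when end of speech is detected."""
        if speech:
            self._last_speech_ms = timestamp_ms
            return False
        if self._last_speech_ms is None:
            return False  # nothing has been said yet
        if timestamp_ms - self._last_speech_ms >= self.silence_threshold_ms:
            self._last_speech_ms = None  # reset for the next utterance
            return True
        return False
```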

New VAD Providers

  • TEN VAD — lightweight voice activity detector
  • FireRed VAD — ONNX-based VAD with fbank feature extraction and postprocessor

All VAD/EOS ONNX models are now bundled in the repo and downloaded at Docker build time — no runtime model fetching required.

  • 48df33c0 1ef73aec 03332e79 41980364 f9c53e5a b047e755

Audio Heartbeat

Added an audio heartbeat mechanism to keep the speech pipeline active and optimize end-of-speech trigger timing, preventing premature cutoffs.

  • 03332e79 feat: audio heartbeat to optimize end of speech trigger

UI Configuration

New UI panels to configure VAD provider settings (FireRed, Silero, TEN) and EOS provider settings (LiveKit EOS) with sensible defaults.

  • 31c2d51d 31538388

Comprehensive STT/TTS Test Suite

Added integration and unit tests across all STT, TTS, and integration service providers: Google, Deepgram, ElevenLabs, Cartesia, AssemblyAI, Azure, Sarvam, Rime, Speechmatics. Includes shared test utilities for audio fixtures, credential loading, and metric collection.

  • 0d96809c feat: added integration and unit test for all the stt, tts and integration service
  • 3328f404 (from v2.0.1-pre) testing and refactoring stt and tts integration

Google STT Auto-Reconnect

Google STT streams now automatically reconnect when hitting the "Stream timed out after receiving no more client requests" error, preventing silent STT failures during long calls.
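The recovery pattern can be sketched as a loop that reopens the stream when the timeout message appears. This is an illustration only: open_stream is a hypothetical factory, RuntimeError stands in for whatever exception the real Google client raises, and the reconnect limit is an assumption.

```python
TIMEOUT_MARKER = "Stream timed out"


def run_with_reconnect(open_stream, max_reconnects=3):
    """Collect results from a stream, reopening it after timeout errors.

    open_stream: zero-argument callable returning an iterable of results.
    Errors that are not stream timeouts, or timeouts beyond the limit,
    are re-raised to the caller.
    """
    results = []
    reconnects = 0
    while True:
        try:
            for item in open_stream():
                results.append(item)
            return results  # stream finished normally
        except RuntimeError as err:
            if TIMEOUT_MARKER not in str(err) or reconnects >= max_reconnects:
                raise
            reconnects += 1  # open a fresh stream and keep earlier results
```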

  • ca9e1b8d feat: reconnect google stt for stream timeout

Infrastructure & Build

Go 1.25.8

Bumped Go across all services and base Docker images.

  • 949288ad 3b591ec0

CI

Updated CI workflow to align with new Go version and enabled knowledge/telemetry in dev config.

  • 3b591ec0 chore: bump Go to 1.25.8, fix formatting, and enable knowledge/telemetry in dev

Web Widget & Deployment

  • Added idle timeout backoff configuration on web plugin deployments (migration 000009)
  • Fixed typo: renamed ideal_timeout → idle_timeout across entities (migration 000010)
  • Web widget deployment production testing and fixes
  • a7b9707a 095b9400

UI Improvements

  • Card list design made consistent across all listing pages (assistants, knowledge base, integrations, credentials)
  • Config form multi-input select component fix
  • Datepicker styling fixes (flatpickr CSS alignment)
  • Integration bridge updated for document-api
  • 81983940 b42ef01b 2ef61448 4df96dc1

SDKs & Examples

Updated SDKs (Python, React, React Widget) and examples (Go, Node.js, Python, React) to latest versions.

  • bd7152a6 feat: updated sdks and examples

Bug Fixes

  • 55cb24b4 fix: stream fixes for static packet (ElevenLabs TTS, dispatch behavior)
  • 46d1e541 fix: gofmt formatting across all callers and transformers
  • 095b9400 refactor: typo fix on deployment entity, cleanup web-widget unused vars

Community

  • Added Discord and Cal.com booking badges to README
  • 936160f5

Upgrade Guide

Self-hosted:

git pull origin main
docker compose down
docker compose up -d --build

Note: This release includes database migrations 000009 and 000010 for assistant-api. They will run automatically on startup.

Rapida Cloud: No action required — already deployed.


Full diff: v2.0.1-pre...v2.0.2-pre

v2.0.1-pre — Redesigned Dashboard, New Voice Engine, Delta Packets & External Telemetry

09 Mar 03:24

New Features

Rime TTS Integration

Added Rime as the 15th TTS provider. Configure via tts.provider: rime in your assistant config.

  • 69f453f5 feat: added rime implementation

External Telemetry & Metrics

Push call telemetry and performance metrics to your own observability stack (Prometheus, Datadog, etc.).

  • fe5899f8 feat: metrics and telemetry
  • 48a542ff feat: pushing telemetry and metrics to external system

Docker Profiles

Deploy with or without the Knowledge Base module. OpenSearch is now fully optional.

# Without Knowledge Base
docker compose up -d

# With Knowledge Base
docker compose --profile knowledge up -d

  • a691152f feat: add docker profiles for with/without knowledge base deployment
  • b89cc01b feat: auto-configure env per profile using compose override file
  • d27ce98d fix: make OpenSearch config safely optional for non-knowledge deployments
  • 89b37c60 feat: removing knowledge for local deployment
  • 57614f7f feat: optional dependencies as document-api

Delta Packet Dispatching

New delta packet type for more efficient real-time audio transmission over the priority-based dispatcher.

  • ebf090ff feat: added delta packet
  • fa5ce201 feat: fixes for packet dispatching

Consistent LLM Streaming

Unified streaming behavior across all 11+ LLM providers. No more provider-specific quirks in the voice pipeline.

  • eff2494b feat: consistent streaming behaviour from all the llm

AgentKit Improvements

Streamlined AgentKit implementation with improved test coverage.

  • 6443646c feat: streamline agentkit implementation
  • 8a9eab05 feat: added test for agentkit and model
  • b23b7dd7 feat: added change for agentkit test and ui fixes

Debugger Updates

Richer metrics in debugger UI beyond charts. WebTalk now supports the debugger.

  • 911ea238 feat: update ui component for debugger
  • a7752724 fix: design for debugger and telemetry to show more metrics than chart
  • 8b321a55 feat: aligned webtalk to support debugger

UI Changes

IBM Carbon Design System Migration

Full migration of the dashboard UI to IBM Carbon Design System v11. Affects all pages — assistant config, debugger, telemetry, and core workflows.

  • 14a578c7 feat: migration to IBM Carbon design pattern
  • 86a0d6bd feat: refactor design to IBM carbon design philosophy
  • 01ccb972 feat: refactor design to IBM carbon design philosophy
  • 5df065e1 feat: added change to align with IBM carbon design

Performance

Docker Build Optimization

Switched to rapidaai/rapida-* base images. Removed unnecessary exposed ports. Pinned linux/amd64 for consistent local builds.

  • a42e98e6 feat: docker build optimization with rapidaai/rapida-* base images
  • 908fe0f7 feat: optimizing build time
  • 7074f973 feat: optimizing build time
  • 9a332f36 fix: simplifying building process
  • 40cd927c fix: pin linux/amd64 platform for local builds and workflow pushes
  • cd2def0a feat: removed exposing ports which is not required

Audio Pipeline

Simplified audio/text stream switching. Updated default resampler. Consistent 60ms duration threshold across all input.

  • 2a0c507d feat: improving audio/text switch
  • 4beeacaa feat: simplified switching from text stream to audio stream
  • c24bf5a9 feat: change in default resampler
  • 418cd7fd fix: added consistent duration and threshold 60ms for all the input

Bug Fixes

Security

  • 5105a99e fix: Vulnerability #1: GO-2026-4337

Recording

  • cd640a36 fix: sync recording as close to user listening
  • 3d223998 fix: sync recording as close to user listening
  • 088cf103 fix: audio recorder pacing for tts
  • 1a3ffbce fix: serving recording from local storage

Voice Pipeline

  • d332939a feat: handling conversation error at end client
  • 6bcb3c41 feat: timeout fixes after the complete audio is played
  • 80fed27e fix: updated the callback for packet

STT/TTS

  • 3328f404 feat: testing and refactoring stt and tts integration
  • 9b74bdb5 feat: added few more stt and tts
  • 69bb1ee0 feat: test fixes for silero and model

Infrastructure

  • c6a77e16 feat: opensearch docker fix
  • e4440f37 fix: minor fix in nginx and proto updated
  • 28b20e89 fix: increase time to support IE, safari
  • 1f144d87 feat: updated dependencies for document api

Documentation

  • d03f4a53 feat: add platform architecture diagram to README and docs
  • c776f104 feat: added architecture design
  • 4da07be0 feat: added docs reference
  • 5631cdce ref: update submodule doc reference

Upgrade Guide

Self-hosted:

git pull origin main
docker compose up -d --build

If you were previously running with Knowledge Base and want to use the new profiles:

# Stop existing
docker compose down

# Start with explicit profile
docker compose --profile knowledge up -d --build

Rapida Cloud: No action required — already deployed.


Full diff: v2.0.0...main

v2.0.0 — Telephony Reliability, SIP, WebRTC & Asterisk

24 Feb 04:13

What's Changed

Telephony: Rebuilt from the Ground Up

  • Unified channel architecture shared across Twilio, Vonage, Exotel, Asterisk, and SIP
  • Interruptions, end-of-call signals, and transfer/hangup events handled consistently across all providers
  • New `call_contexts` table persists call state — async provider callbacks resolve correctly even after call ends
  • Channel UUIDs propagate end-to-end for reliable transfer and hangup operations

New: SIP Integration

Full native SIP stack with RTP handling, SDP negotiation, port allocator, and session management.

New: Asterisk / AudioSocket

Native integration with Asterisk via AudioSocket and WebSocket. Inbound and outbound call flows tested.

New: WebRTC Channel

Browser-based voice with Opus codec support and gRPC signalling, sharing the same hardened base as telephony.

Audio Pipeline: Deterministic Framing

  • Exact 20 ms output frames with zero per-frame heap allocations
  • Atomic interruption — `ClearOutputBuffer` drains buffers and signals output writer instantly
  • Per-speaker recordings split into `assistant_recording_url` + `user_recording_url`
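For the fixed 20 ms framing mentioned above, the arithmetic is sample rate × 0.02 s × bytes per sample × channels, which is 320 bytes for 8 kHz 16-bit mono audio. The sketch below shows that slicing in Python; the class name and default audio format are assumptions, and it does not reproduce the zero-allocation property of the real implementation.

```python
class Framer:
    """Cut a PCM byte stream into fixed-duration frames (20 ms by default)."""

    def __init__(self, sample_rate=8000, bytes_per_sample=2, channels=1, frame_ms=20):
        samples_per_frame = sample_rate * frame_ms // 1000
        self.frame_bytes = samples_per_frame * bytes_per_sample * channels
        self._buffer = bytearray()

    def push(self, chunk):
        """Add raw audio; return every complete frame now available."""
        self._buffer.extend(chunk)
        frames = []
        while len(self._buffer) >= self.frame_bytes:
            frames.append(bytes(self._buffer[: self.frame_bytes]))
            del self._buffer[: self.frame_bytes]  # keep only the remainder
        return frames
```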

LLM Text Aggregator

Sentence-boundary aggregation between LLM stream and TTS — reduces first-word latency with configurable delimiters and clean context-switch flush.

Test Coverage

  • 31 `BaseStreamer` unit tests
  • Full telephony provider test suite (Twilio, Vonage, Exotel)
  • Transformer tests for AssemblyAI, Azure, Cartesia, Deepgram, ElevenLabs, Google, Resemble, Sarvam
  • LLM text aggregator: 972 lines of unit tests + 381 lines of benchmarks

Bug Fixes

  • Google TTS stale response fix for outputs > 5 sentences
  • AgentKit executor stability fixes
  • First-token response time now tracked in LLM telemetry
  • MCP tool support for agent tool invocations

Breaking Changes / Migrations

| Migration | Change |
|---|---|
| `000005` | New `call_contexts` table required |
| `000006` | `recording_url` split into `assistant_recording_url` + `user_recording_url` |

Rapida v0.1.3

26 Jan 04:17

New Features

Model Context Protocol (MCP) & Remote Agent Execution

  • WebSocket-Based LLM Executor — Enable real-time, low-latency communication with language models via WebSocket integration for streaming responses
  • Remote Executor and AgentKit (gRPC) — Run agents and models remotely with improved deployment flexibility and scalability

MCP Tool Implementation

  • New tools added to expand integration capabilities with external services and APIs

Improvements

Frontend & Dependency Updates

  • React Dependency Upgrades — Updated to the latest React dependencies for improved security and performance
  • Cleaner Logging — Removal of unnecessary logs for a more focused development experience
  • ESLint Fixes — Addressed outstanding lint errors to maintain codebase hygiene

CI/CD and Quality-of-Life Enhancements

  • Optimized Build Pipeline — CI updated to skip CGO-dependent packages and make Trivy scans non-blocking for faster, more reliable builds
  • Go Linting Improvements — Comprehensive auto-formatting and convention enforcement using golangci-lint, standardized to Go 1.25 in Docker and CI
  • Dependency Security — Packages updated and audit processes improved for a stronger security posture

Stability & Refactoring

  • Multiple under-the-hood improvements to enhance reliability and maintainability

Upgrade Considerations

  • No breaking changes — Applications using existing features remain fully compatible
  • Validation recommended — Applications utilizing new LLM execution paths or remote deployment features should be tested
  • Reinstall dependencies — Developers should update dependencies with npm install and go mod download

Rapida v0.1.2

19 Jan 05:40

For Product Managers

New Features & Capabilities

  • Session Management Controls - Max Session Duration, Idle Timeout, Timeout Message, Timeout Backoff
  • Provider-Specific SSML Normalizers - Intelligent text normalizers per TTS provider for natural-sounding voice output across Azure, Google, and other providers
  • Google STT Model Validation - All Google Speech-to-Text models tested with optimized default confidence threshold of 0.5
  • Improved Turn Detection - Optimized conversation turn detection for natural human-AI voice interactions

New Provider Support

  • Sarvam AI - Text-to-speech and speech-to-text (Indian language specialist)
  • AssemblyAI - Speech-to-text provider with comprehensive language support
  • Cartesia - Speech-to-text model support
  • Azure Foundry & Vertex AI - Expanded text model options for LLM interactions

Telephony Improvements

  • Unified Call Handling - Merged inbound and outbound call logic (Exotel)
  • Intelligent Timeout Backoff - Better call experience with adaptive timeouts

Dashboard & UI

  • V3 Dashboard - New experience with telephony visibility and STT validation
  • Sentence Tokenizer for Debugger - Enhanced conversation analysis
  • UI Message Sequencing - Improved message flow visualization

For Developers

New Features

  • maxSessionDuration - Maximum allowed duration for a conversation session (in seconds). Enforces hard limit on conversation length to manage resources and costs
  • idealTimeout - Idle timeout duration (in seconds). If no user input is detected within this period, the system prompts the user
  • idealTimeoutMessage - Custom message displayed/spoken when idle timeout is triggered (e.g., "Are you still there?")
  • idealTimeoutBackoff - Backoff interval (in seconds) after showing the timeout message before taking further action. Provides a grace period for user response
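How the four settings interact can be sketched as a small decision function. The field names come from the list above; everything else is an assumption, including the function name, the action labels, and the choice to measure the backoff from the moment the idle timeout elapsed. "escalate" is a placeholder for the unspecified "further action".

```python
from dataclasses import dataclass


@dataclass
class SessionLimits:
    maxSessionDuration: float   # seconds, hard cap on the whole session
    idealTimeout: float         # seconds of silence before prompting the user
    idealTimeoutMessage: str    # what to say when the idle timeout fires
    idealTimeoutBackoff: float  # seconds of grace after the prompt


def next_action(limits, session_age, idle_for, prompted):
    """Return "end", "prompt", "escalate", or "continue".

    session_age: seconds since the session started.
    idle_for:    seconds since the last user input.
    prompted:    True if the timeout message was already delivered.
    """
    if session_age >= limits.maxSessionDuration:
        return "end"
    if not prompted and idle_for >= limits.idealTimeout:
        return "prompt"
    if prompted and idle_for >= limits.idealTimeout + limits.idealTimeoutBackoff:
        return "escalate"
    return "continue"
```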

Backend Changes (Go)

  • Model Executor - Fixed race conditions in concurrent execution
  • Tool System - Refactored tool call creation, editing, and result handling
  • End of Speech Detection - New system with configurable providers
  • Config Validation - Added comprehensive config tests

Frontend Changes (React/TypeScript)

  • Tool Components - Unified components with shared hooks and types
  • Provider Configs - New JSON configs for STT/TTS models
  • Sidebar Context - New context for sidebar state management

Performance Enhancements

  • Text Conversations - No longer initializes audio transformer (performance improvement)
  • Multi-Message UI - Fixed alignment for multiple messages per ID

Dependencies & Security

  • React SDK submodule updated
  • Node packages updated (yarn.lock)
  • Dependabot security patches applied
  • Added CodeQL analysis
  • Fixed OAuth2 authentication flows

Summary

This release introduces comprehensive session management controls, expands provider support with new TTS/STT integrations, improves telephony handling, and delivers significant backend optimizations and performance improvements.

Rapida v0.1.1

07 Jan 07:25

New Integrations

Telephony

  • Exotel integration
  • Inbound and outbound call support
  • Streaming audio pipeline wired into Rapida orchestrator
  • Call lifecycle events mapped cleanly to agent state

Speech-to-Text

  • Sarvam STT integration
  • Streaming transcription support
  • Partial and final transcript handling
  • Improved latency consistency under load

Bug Fixes & Stability Improvements

  • Fixed audio stream desync issues during long-running calls
  • Resolved intermittent end-of-speech detection edge cases
  • Improved error handling when STT or TTS streams restart
  • Fixed state leaks on abrupt call termination
  • Reduced noisy logs during high-frequency audio ingestion

Reliability & Internal Improvements

  • Safer handling of external provider timeouts
  • Better retries and backoff for integration failures
  • Clearer failure signals surfaced to the orchestrator
  • Minor performance optimizations in streaming pipeline

Full Changelog: v1.0.0...v0.1.1