Releases: rapidaai/voice-ai
v2.3.0 - Bring Your Own LLM Infrastructure
What's Changed
This release ships three major updates to Rapida.
Custom LLM
You can now bring your own LLM into Rapida with the new Custom LLM provider.
Supported API compatibility:
- OpenAI Chat Completions (
/v1/chat/completions) - OpenAI Responses (
/v1/responses) - Anthropic Messages (
/v1/messages) - Google Gemini (
generateContent) - OpenAI Compatible (
Ollama,vLLM,LM Studio,TGI)
You can point Rapida at your own base URL, pass optional headers, and keep the same workflow inside the product while changing the model layer underneath.
Related PR: #108
Ambient Audio
We also added ambient audio so calls do not feel silent or broken in production.
- Adds background presence to live calls
- Useful for receptionist, support, concierge, and outbound flows
- Helps phone and web deployments feel more natural
Related PR: #113
Assistant Authentication
You can now authenticate a session before the agent starts.
- Configure an HTTP authentication endpoint for inbound or outbound sessions
- Pass headers, request body, timeout, and condition rules
- Control fail behavior with
BlockorDo nothing - Verify sessions before initialization instead of letting every call go straight to the agent
This makes it easier to add your own policy, routing, verification, or access control step before Rapida starts the conversation.
Related PR: #116
Why This Matters
- Bring your own model infrastructure into Rapida
- Improve the live call experience without extra media plumbing
- Add a verification layer before the assistant starts a session
Breaking Changes
None.
Upgrade Guide
Self-hosted
git pull origin main
docker compose up -d --buildRapida Cloud
No action required.
New Contributors
- @eschmidbauer made their first contribution in #108
Full Changelog: v2.2.0...v2.3.0
v2.2.0-beta — Inbound Voice AI Infrastructure
This release focuses on three infrastructure changes that make inbound voice AI production-ready.
Breaking change:
transfer_toreplacestoin transfer tool calls. Update your configs before upgrading.
Distributed SIP Registration
The registration flow is now based on explicit ownership and reconciliation:
- Each DID is claimed through Redis-backed ownership
- One instance becomes the active owner for that DID
- The owner performs SIP registration and keeps it active
- If that owner disappears, another instance can claim the DID and restore service
This is a key step toward making inbound voice AI reliable in multi-instance deployments.
Multi-Server Ownership and Failover
The registration pipeline now separates ownership, registration, and active-state handling more cleanly:
- Clearer ownership semantics per DID
- Safer failover across replicas
- Fewer single-node assumptions
- More predictable behavior in horizontally scaled deployments
Multi-Target SIP Transfer Failover
SIP transfer now supports multiple ordered transfer_to targets:
- Try target 1
- If target 1 fails or does not answer, try target 2
- Continue until a target connects or the list is exhausted
Defined Post-Transfer Behavior
Transfer now has explicit post-transfer behavior for SIP:
end_callresume_ai
Engineer-Facing Changelog
SIP Registration
- Added distributed SIP registration manager
- Added Redis-backed DID ownership flow
- Added multi-server registration pipeline
- Separated claim ownership, register, and mark-active stages
- Improved reconcile behavior for active and removed registrations
- Improved ownership release behavior for failover and cleanup
SIP Transfer
- Added support for ordered multi-target transfer via
transfer_to - Added per-target transfer retry/failover flow
- Added explicit post-transfer behavior with
end_callandresume_ai - Improved transfer handling in SIP pipeline and streamer path
- Aligned transfer behavior for channels where multi-target transfer does not apply
Telephony and Session Behavior
- Improved metadata alignment across telephony paths
- Improved transfer-related session state handling
- Improved observer/event behavior around transfer lifecycle
- Reduced undefined behavior after operator hangup
Tooling and API Cleanup
- Renamed transfer argument from
tototransfer_to - Improved transfer tool call handling
- Improved downstream transfer orchestration consistency
Stability and Tests
- Added and updated tests around transfer behavior
- Improved dispatch-related test coverage
- Fixed call-flow edge cases around tool calls and injected messages
- Refined dispatch behavior for transfer and tool-call flows
Full Changelog: v2.1.0...v2.2.0
v2.1.0 — Built-In Observability
Rapida v2.1.0 — Built-In Observability. Richer Than Most Managed Platforms.
Rapida is the only open-source voice AI platform where you self-host the entire stack, see per-stage latency on every call, swap any provider via config, and own your data completely.
No more external media servers. No more fragmented systems stitched together with glue code. Just engineering.
Per-Stage Telemetry
Every call now tracks granular, per-stage latency across the entire voice pipeline. No external tooling required.
- STT latency — time from audio frame to transcript token
- LLM time-to-first-token (TTFT) — inference latency per turn
- TTS time-to-first-byte (TTFB) — synthesis latency per utterance
- Duration metrics — end-to-end call stage durations with drill-down
- Configurable telemetry providers — CRUD APIs to plug your own telemetry exporters per assistant
- Dashboard visualization — all metrics visible in the Rapida UI, per call and aggregated
Measure your own pipeline. Identify bottlenecks. Optimize with data.
Pipeline Architecture Rewrite
The executor layer has been refactored into a streaming pipeline architecture.
- LLM executor abstraction — clean separation between AgentKit, model-based, and WebSocket LLM backends
- Executor-to-pipeline refactoring — the dispatch loop now routes through a unified pipeline instead of discrete executors
- Pipeline optimization — reduced allocation overhead and improved streaming throughput
- Input normalizer — structured input preprocessing before LLM inference
JSON-Driven Provider Configuration
Adding a new STT, TTS, or LLM provider no longer requires deep codebase knowledge.
- Provider configs defined declaratively in JSON
- Eliminates boilerplate when integrating new providers
- Validated and tested with the existing provider matrix
Inline Noise Reduction
- Integrated noise reduction into the audio input pipeline
- Denoising runs inline before VAD, improving speech detection accuracy in noisy environments
- New
DenoiseAudioPacketandDenoisedAudioPacketpacket types in the dispatch system
UX Overhaul
- Simplified assistant creation — fewer steps, better defaults, streamlined flow
- Model settings modal — configure LLM parameters without leaving the assistant view
- Simplified deployment workflow — get to production faster
- Agent workplace management — manage multiple agents from a single workspace
- Analysis UX — updated create-analysis flow with better visualization
- System variable suggestions — autocomplete for reserved prompt variables
- Argument suggestions — inline suggestions for tool/function arguments
Bug Fixes
- Google STT timeout handling
- Credential dropdown in telemetry provider configuration
- Knowledge tool only loads when the feature is enabled
- Whitespace preservation after sentence boundaries for TTS
- Missing VAD configuration parameters
- Gemini LLM parameter mapping
- First-time startup onboarding flow
- Notification settings layout
- Source indicator design alignment
Testing
- 142 test files changed across backend and UI
- Unit tests for all critical path components
- Provider config test coverage
- Language fallback tests for STT
- Model pipeline integration tests
Developer Experience
Skills Framework
New skills for AI-assisted development on the Rapida codebase:
- Provider integration (LLM, STT, TTS, telephony, VAD)
- Telemetry integration
- Noise reduction integration
- End-of-speech integration
- System understanding and local setup
Each skill includes validation scripts, templates, and examples.
Hook Orchestration
- Pre/post-implementation hooks for automated test validation
- Changed-file test runners
- Post-tool hints for test coverage gaps
Breaking Changes
None. Backwards-compatible with v2.0.2.
Upgrade
# Self-hosted (Docker Compose)
git pull origin main
docker compose pull
docker compose up -d
# Fresh install
git clone https://github.com/rapidaai/voice-ai.git
cd voice-ai
cp .env.example .env
docker compose up -dWhat's Next
- Lower latency and higher concurrency in the agent runtime
- Local model deployment for on-prem and air-gapped environments
- Extended telemetry: custom dashboards, alerting, export to Datadog/Grafana
- Improved documentation at doc.rapida.ai
Full Changelog: v2.0.2...v2.1.0
Star the repo: https://github.com/rapidaai/voice-ai
Docs: https://doc.rapida.ai
v2.0.2 — Smarter Listening, Better Testing
v2.0.2 — Smarter Listening, Better Testing
Rapida now hears better and knows when to stop listening. This release introduces pluggable Voice Activity Detection and End-of-Speech engines, a comprehensive provider test suite, and key infrastructure upgrades.
Highlights
Pluggable VAD & End-of-Speech — Your agent now has ears that actually know when you're done talking.
| Engine | Type | How it works |
|---|---|---|
| LiveKit EOS | End-of-Speech | ONNX-based turn detection with chat-aware inference |
| Pipecat EOS | End-of-Speech | Mel-spectrogram analysis for precise speech boundary detection |
| Silence-based EOS | End-of-Speech | Configurable silence threshold fallback |
| TEN VAD | Voice Activity | Lightweight real-time voice activity detection |
| FireRed VAD | Voice Activity | ONNX-based VAD with fbank feature extraction |
All models are bundled and downloaded at build time — zero runtime fetching.
Audio Heartbeat — A new keepalive mechanism prevents premature end-of-speech triggers during natural pauses, making conversations feel more human.
Testing & Reliability
- Full STT/TTS test coverage — Integration and unit tests across all providers: Google, Deepgram, ElevenLabs, Cartesia, AssemblyAI, Azure, Sarvam, Rime, Speechmatics
- Google STT auto-reconnect — Automatically recovers from "Stream timed out" errors during long calls
- Stream fixes for static packet dispatch and ElevenLabs TTS
Infrastructure
- Go 1.25.8 across all services and Docker base images
- CI pipeline updated for new Go version
- Knowledge/telemetry enabled in dev config by default
Web Widget & Deployment
- Idle timeout backoff configuration for web plugin deployments
- Fixed
ideal_timeout→idle_timeouttypo across entities (migrations000009,000010) - Production deployment testing and fixes
UI Polish
- Consistent card list design across all listing pages
- Config form multi-input select fix
- Datepicker styling alignment
- Integration bridge updates for document-api
- New VAD/EOS configuration panels with sensible defaults
SDKs & Examples
Updated SDKs (Python, React, React Widget) and examples (Go, Node.js, Python, React) to latest versions.
Community
- Join us on Discord
- Book a meeting with the team
Upgrade Guide
git pull origin main
docker compose down
docker compose up -d --buildMigrations
000009and000010for assistant-api run automatically on startup.
Full diff: v2.0.1-pre...v2.0.2
v2.0.2-pre — VAD & End-of-Speech Engines, STT/TTS Test Suite, Go 1.25.8
What's Changed in v2.0.2-pre
Voice Activity Detection (VAD) & End-of-Speech Engines
The voice pipeline now supports pluggable VAD and end-of-speech (EOS) detection, giving you fine-grained control over when the agent starts and stops listening.
New EOS Engines
- LiveKit EOS — ONNX-based turn detection with custom tokenizer and chat template inference (
livekit/turn_detector.go) - Pipecat EOS — Mel-spectrogram-based end-of-speech detection with platform-specific ONNX inference (
pipecat/mel_spectrogram.go) - Silence-based EOS — Configurable silence threshold fallback (
silence_based/silence_based_end_of_speech.go)
New VAD Providers
- TEN VAD — lightweight voice activity detector
- FireRed VAD — ONNX-based VAD with fbank feature extraction and postprocessor
All VAD/EOS ONNX models are now bundled in the repo and downloaded at Docker build time — no runtime model fetching required.
48df33c01ef73aec03332e7941980364f9c53e5ab047e755
Audio Heartbeat
Added an audio heartbeat mechanism to keep the speech pipeline active and optimize end-of-speech trigger timing, preventing premature cutoffs.
03332e79feat: audio heartbeat to optimize end of speech trigger
UI Configuration
New UI panels to configure VAD provider settings (FireRed, Silero, TEN) and EOS provider settings (LiveKit EOS) with sensible defaults.
31c2d51d31538388
Comprehensive STT/TTS Test Suite
Added integration and unit tests across all STT, TTS, and integration service providers: Google, Deepgram, ElevenLabs, Cartesia, AssemblyAI, Azure, Sarvam, Rime, Speechmatics. Includes shared test utilities for audio fixtures, credential loading, and metric collection.
0d96809cfeat: added integration and unit test for all the stt, tts and integration service3328f404(from v2.0.1-pre) testing and refactoring stt and tts integration
Google STT Auto-Reconnect
Google STT streams now automatically reconnect when hitting the "Stream timed out after receiving no more client requests" error, preventing silent STT failures during long calls.
ca9e1b8dfeat: reconnect google stt for stream timeout
Infrastructure & Build
Go 1.25.8
Bumped Go across all services and base Docker images.
949288ad3b591ec0
CI
Updated CI workflow to align with new Go version and enabled knowledge/telemetry in dev config.
3b591ec0chore: bump Go to 1.25.8, fix formatting, and enable knowledge/telemetry in dev
Web Widget & Deployment
- Added idle timeout backoff configuration on web plugin deployments (migration
000009) - Fixed typo: renamed
ideal_timeout→idle_timeoutacross entities (migration000010) - Web widget deployment production testing and fixes
a7b9707a095b9400
UI Improvements
- Card list design made consistent across all listing pages (assistants, knowledge base, integrations, credentials)
- Config form multi-input select component fix
- Datepicker styling fixes (flatpickr CSS alignment)
- Integration bridge updated for document-api
81983940b42ef01b2ef614484df96dc1
SDKs & Examples
Updated SDKs (Python, React, React Widget) and examples (Go, Node.js, Python, React) to latest versions.
bd7152a6feat: updated sdks and examples
Bug Fixes
55cb24b4fix: stream fixes for static packet (ElevenLabs TTS, dispatch behavior)46d1e541fix: gofmt formatting across all callers and transformers095b9400refactor: typo fix on deployment entity, cleanup web-widget unused vars
Community
- Added Discord and Cal.com booking badges to README
936160f5
Upgrade Guide
Self-hosted:
git pull origin main
docker compose down
docker compose up -d --buildNote: This release includes database migrations
000009and000010for assistant-api. They will run automatically on startup.
Rapida Cloud: No action required — already deployed.
Full diff: v2.0.1-pre...v2.0.2-pre
v2.0.1-pre — Redesigned Dashboard, New Voice Engine, Delta Packets & External Telemetry
New Features
Rime TTS Integration
Added Rime as the 15th TTS provider. Configure via tts.provider: rime in your assistant config.
69f453f5feat: added rime implementation
External Telemetry & Metrics
Push call telemetry and performance metrics to your own observability stack (Prometheus, Datadog, etc.).
fe5899f8feat: metrics and telemetry48a542fffeat: pushing telemetry and metrics to external system
Docker Profiles
Deploy with or without the Knowledge Base module. OpenSearch is now fully optional.
# Without Knowledge Base
docker compose up -d
# With Knowledge Base
docker compose --profile knowledge up -da691152ffeat: add docker profiles for with/without knowledge base deploymentb89cc01bfeat: auto-configure env per profile using compose override filed27ce98dfix: make OpenSearch config safely optional for non-knowledge deployments89b37c60feat: removing knowledge for local deployment57614f7ffeat: optional dependencies as document-api
Delta Packet Dispatching
New delta packet type for more efficient real-time audio transmission over the priority-based dispatcher.
ebf090fffeat: added delta packetfa5ce201feat: fixes for packet dispatching
Consistent LLM Streaming
Unified streaming behavior across all 11+ LLM providers. No more provider-specific quirks in the voice pipeline.
eff2494bfeat: consistent streaming behaviour from all the llm
AgentKit Improvements
Streamlined AgentKit implementation with improved test coverage.
6443646cfeat: streamline agentkit implementation8a9eab05feat: added test for agentkit and modelb23b7dd7feat: added change for agentkit test and ui fixes
Debugger Updates
Richer metrics in debugger UI beyond charts. WebTalk now supports the debugger.
911ea238feat: update ui component for debuggera7752724fix: design for debugger and telemetry to show more metrics than chart8b321a55feat: aligned webtalk to support debugger
UI Changes
IBM Carbon Design System Migration
Full migration of the dashboard UI to IBM Carbon Design System v11. Affects all pages — assistant config, debugger, telemetry, and core workflows.
14a578c7feat: migration to IBM Carbon design pattern86a0d6bdfeat: refactor design to IBM carbon design philosophy01ccb972feat: refactor design to IBM carbon design philosophy5df065e1feat: added change to align with IBM carbon design
Performance
Docker Build Optimization
Switched to rapidaai/rapida-* base images. Removed unnecessary exposed ports. Pinned linux/amd64 for consistent local builds.
a42e98e6feat: docker build optimization with rapidaai/rapida-* base images908fe0f7feat: optimizing build time7074f973feat: optimizing build time9a332f36fix: simplifying building process40cd927cfix: pin linux/amd64 platform for local builds and workflow pushescd2def0afeat: removed exposing ports which is not required
Audio Pipeline
Simplified audio/text stream switching. Updated default resampler. Consistent 60ms duration threshold across all input.
2a0c507dfeat: improving audio/text switch4beeacaafeat: simplified switching from text stream to audio streamc24bf5a9feat: change in default resampler418cd7fdfix: added consistent duration and threshold 60ms for all the input
Bug Fixes
Security
5105a99efix: Vulnerability #1: GO-2026-4337
Recording
cd640a36fix: sync recording as close to user listening3d223998fix: sync recording as close to user listening088cf103fix: audio recorder pacing for tts1a3ffbcefix: serving recording from local storage
Voice Pipeline
d332939afeat: handling conversation error at end client6bcb3c41feat: timeout fixes after the complete audio is played80fed27efix: updated the callback for packet
STT/TTS
3328f404feat: testing and refactoring stt and tts integration9b74bdb5feat: added few more stt and tts69bb1ee0feat: test fixes for silero and model
Infrastructure
c6a77e16feat: opensearch docker fixe4440f37fix: minor fix in nginx and proto updated28b20e89fix: increase time to support IE, safari1f144d87feat: updated dependencies for document api
Documentation
d03f4a53feat: add platform architecture diagram to README and docsc776f104feat: added architecture design4da07be0feat: added docs reference5631cdceref: update submodule doc reference
Upgrade Guide
Self-hosted:
git pull origin main
docker compose up -d --buildIf you were previously running with Knowledge Base and want to use the new profiles:
# Stop existing
docker compose down
# Start with explicit profile
docker compose --profile knowledge up -d --buildRapida Cloud: No action required — already deployed.
Full diff: v2.0.0...main
v2.0.0 — Telephony Reliability, SIP, WebRTC & Asterisk
What's Changed
Telephony: Rebuilt from the Ground Up
- Unified channel architecture shared across Twilio, Vonage, Exotel, Asterisk, and SIP
- Interruptions, end-of-call signals, and transfer/hangup events handled consistently across all providers
- New `call_contexts` table persists call state — async provider callbacks resolve correctly even after call ends
- Channel UUIDs propagate end-to-end for reliable transfer and hangup operations
New: SIP Integration
Full native SIP stack with RTP handling, SDP negotiation, port allocator, and session management.
New: Asterisk / AudioSocket
Native integration with Asterisk via AudioSocket and WebSocket. Inbound and outbound call flows tested.
New: WebRTC Channel
Browser-based voice with Opus codec support and gRPC signalling, sharing the same hardened base as telephony.
Audio Pipeline: Deterministic Framing
- Exact 20 ms output frames with zero per-frame heap allocations
- Atomic interruption — `ClearOutputBuffer` drains buffers and signals output writer instantly
- Per-speaker recordings split into `assistant_recording_url` + `user_recording_url`
LLM Text Aggregator
Sentence-boundary aggregation between LLM stream and TTS — reduces first-word latency with configurable delimiters and clean context-switch flush.
Test Coverage
- 31 `BaseStreamer` unit tests
- Full telephony provider test suite (Twilio, Vonage, Exotel)
- Transformer tests for AssemblyAI, Azure, Cartesia, Deepgram, ElevenLabs, Google, Resemble, Sarvam
- LLM text aggregator: 972 lines of unit tests + 381 lines of benchmarks
Bug Fixes
- Google TTS stale response fix for outputs > 5 sentences
- AgentKit executor stability fixes
- First-token response time now tracked in LLM telemetry
- MCP tool support for agent tool invocations
Breaking Changes / Migrations
| Migration | Change |
|---|---|
| `000005` | New `call_contexts` table required |
| `000006` | `recording_url` split into `assistant_recording_url` + `user_recording_url` |
Rapida v0.1.3
New Features
Model Context Protocol (MCP) & Remote Agent Execution
- WebSocket-Based LLM Executor — Enable real-time, low-latency communication with language models via WebSocket integration for streaming responses
- Remote Executor and AgentKit (gRPC) — Run agents and models remotely with improved deployment flexibility and scalability
MCP Tool Implementation
- New tools added to expand integration capabilities with external services and APIs
Improvements
Frontend & Dependency Updates
- React Dependency Upgrades — Updated to the latest React dependencies for improved security and performance
- Cleaner Logging — Removal of unnecessary logs for a more focused development experience
- ESLint Fixes — Addressed outstanding lint errors to maintain codebase hygiene
CI/CD and Quality-of-Life Enhancements
- Optimized Build Pipeline — CI updated to skip CGO-dependent packages and make Trivy scans non-blocking for faster, more reliable builds
- Go Linting Improvements — Comprehensive auto-formatting and convention enforcement using golangci-lint, standardized to Go 1.25 in Docker and CI
- Dependency Security — Packages updated and audit processes improved for enhanced security postures
Stability & Refactoring
- Multiple under-the-hood improvements to enhance reliability and maintainability
Upgrade Considerations
- No breaking changes — Applications using existing features remain fully compatible
- Validation recommended — Applications utilizing new LLM execution paths or remote deployment features should be tested
- Reinstall dependencies — Developers should update dependencies with
npm installandgo mod download
Rapida v0.1.2
For Product Managers
New Features & Capabilities
- Session Management Controls - Max Session Duration, Idle Timeout, Timeout Message, Timeout Backoff
- Provider-Specific SSML Normalizers - Intelligent text normalizers per TTS provider for natural-sounding voice output across Azure, Google, and other providers
- Google STT Model Validation - All Google Speech-to-Text models tested with optimized default confidence threshold of 0.5
- Improved Turn Detection - Optimized conversation turn detection for natural human-AI voice interactions
New Provider Support
- Sarvam AI - Text-to-speech and speech-to-text (Indian language specialist)
- AssemblyAI - Speech-to-text provider with comprehensive language support
- Cartesia - Speech-to-text model support
- Azure Foundry & Vertex AI - Expanded text model options for LLM interactions
Telephony Improvements
- Unified Call Handling - Merged inbound and outbound call logic (Exotel)
- Intelligent Timeout Backoff - Better call experience with adaptive timeouts
Dashboard & UI
- V3 Dashboard - New experience with telephony visibility and STT validation
- Sentence Tokenizer for Debugger - Enhanced conversation analysis
- UI Message Sequencing - Improved message flow visualization
For Developers
New Features
- maxSessionDuration - Maximum allowed duration for a conversation session (in seconds). Enforces hard limit on conversation length to manage resources and costs
- idealTimeout - Idle timeout duration (in seconds). If no user input is detected within this period, the system prompts the user
- idealTimeoutMessage - Custom message displayed/spoken when idle timeout is triggered (e.g., "Are you still there?")
- idealTimeoutBackoff - Backoff interval (in seconds) after showing the timeout message before taking further action. Provides a grace period for user response
Backend Changes (Go)
- Model Executor - Fixed race conditions in concurrent execution
- Tool System - Refactored tool call creation, editing, and result handling
- End of Speech Detection - New system with configurable providers
- Config Validation - Added comprehensive config tests
Frontend Changes (React/TypeScript)
- Tool Components - Unified components with shared hooks and types
- Provider Configs - New JSON configs for STT/TTS models
- Sidebar Context - New context for sidebar state management
Performance Enhancements
- Text Conversations - No longer initializes audio transformer (performance improvement)
- Multi-Message UI - Fixed alignment for multiple messages per ID
Dependencies & Security
- React SDK submodule updated
- Node packages updated (yarn.lock)
- Dependabot security patches applied
- Added CodeQL analysis
- Fixed OAuth2 authentication flows
Summary
This release introduces comprehensive session management controls, expands provider support with new TTS/STT integrations, improves telephony handling, and delivers significant backend optimizations and performance improvements.
Rapida v0.1.1
New Integrations
Telephony
- Exotel integration
- Inbound and outbound call support
- Streaming audio pipeline wired into Rapida orchestrator
Call lifecycle events mapped cleanly to agent state
Speech-to-Text
- Sarvam STT integration
- Streaming transcription support
- Partial and final transcript handling
Improved latency consistency under load
- Bug Fixes & Stability Improvements
- Fixed audio stream desync issues during long-running calls
- Resolved intermittent end-of-speech detection edge cases
- Improved error handling when STT or TTS streams restart
- Fixed state leaks on abrupt call termination
- Reduced noisy logs during high-frequency audio ingestion
Reliability & Internal Improvements
- Safer handling of external provider timeouts
- Better retries and backoff for integration failures
- Clearer failure signals surfaced to the orchestrator
- Minor performance optimizations in streaming pipeline
Full Changelog: v1.0.0...v0.1.1