PaWeRe/siva

SIVA - Self-Improving Voice Agent Framework

SIVA is a self-improving voice agent framework that learns from expert feedback. It extracts important concepts from conversations and validates its predictions against a local vector database of similar cases, which supplies few-shot examples for prompting and a basis for confidence assessment. Public knowledge sources can optionally be consulted as well. SIVA builds on Sierra's tau2-bench architecture for agent simulation and evaluation.

[Figure: SIVA Demo - 10x Speed]

🚀 Quick Start

Requirements: Python 3.8+, uv package manager, browser with microphone access

  1. Set Up Environment:

    # Using uv (automatically handles virtual environment)
    uv run python --version
  2. Configure API Keys (.env file):

    OPENAI_API_KEY=sk-your-key-here
    CARTESIA_API_KEY=your-cartesia-key-here
    DOMAIN_API_KEY=your-domain-specific-key-here  # For domain-specific evidence sources
    
  3. Launch the Voice Agent:

    uv run python run_voice_app.py

    This opens the voice client at http://localhost:3000/voice_client.html and the dashboard at http://localhost:8000/dashboard
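
Before launching, it can help to confirm that the keys from step 2 are actually set. The snippet below is a minimal sketch and not part of SIVA: `missing_keys` is a hypothetical helper, and treating all three keys as required is an assumption (the project's own validation lives in config/settings.py).

```python
import os

# Key names taken from the .env example above; treating all three as
# required is an assumption made for this sketch.
REQUIRED_KEYS = ("OPENAI_API_KEY", "CARTESIA_API_KEY", "DOMAIN_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required keys that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [key for key in required if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing keys: " + ", ".join(missing))
    else:
        print("All API keys are set.")
```

The helper only inspects the process environment, so values that exist solely in the .env file need to be exported or loaded before it will see them.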

🧪 CLI Simulation & Testing

SIVA includes a comprehensive CLI for running agent simulations and testing different scenarios using the tau2-bench framework.

Basic Simulation Commands

Run a single patient intake simulation:

uv run python -m siva.cli run --domain patient_intake --agent llm_agent --user user_simulator --num-tasks 1 --max-steps 50

Run multiple tasks for comprehensive testing:

uv run python -m siva.cli run --domain patient_intake --agent llm_agent --user user_simulator --num-tasks 3 --max-steps 50
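
When the same simulation needs to be launched repeatedly with different task counts, the command line can be assembled in a script. This is a sketch only: `build_run_command` is a hypothetical helper, and it uses just the flags shown in the two commands above.

```python
import shlex


def build_run_command(num_tasks=1, max_steps=50, domain="patient_intake",
                      agent="llm_agent", user="user_simulator"):
    """Assemble the `siva.cli run` invocation as an argument list."""
    return [
        "uv", "run", "python", "-m", "siva.cli", "run",
        "--domain", domain,
        "--agent", agent,
        "--user", user,
        "--num-tasks", str(num_tasks),
        "--max-steps", str(max_steps),
    ]


if __name__ == "__main__":
    # Print the three-task command; hand the list to subprocess.run(...)
    # to execute it from a script instead.
    print(shlex.join(build_run_command(num_tasks=3)))
```

Keeping the command as a list avoids shell quoting issues if it is later passed to `subprocess.run`.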

Test different agent types: TBD.

Simulation Results

After running simulations, view results with:

uv run python -m siva.cli view

Example Output:

╭───────────────────────── Simulation Overview ──────────────────────────╮
│ Task ID: patient_intake_PI001                                          │
│ Trial: 0                                                               │
│ Duration: 14.20s                                                       │
│ Termination Reason: TerminationReason.AGENT_STOP                       │
│ Agent Cost: $0.0218                                                    │
│ User Cost: $0.0021                                                     │
│ Reward: ✅ 1.0000 (ACTION: 1.0)                                        │
│                                                                        │
│ Action Checks:                                                         │
│ - 0: verify_fullname ✅ 1.0                                            │
│ - 1: verify_birthday ✅ 1.0                                            │
│ - 2: list_prescriptions ✅ 1.0                                         │
│ - 3: list_allergies ✅ 1.0                                             │
│ - 4: list_conditions ✅ 1.0                                            │
│ - 5: list_visit_reasons ✅ 1.0                                         │
│ - 6: determine_routing ✅ 1.0                                          │
╰────────────────────────────────────────────────────────────────────────╯

Available Domains & Agents

  • Domains: patient_intake, patient_intake-workflow
  • Agents: llm_agent, llm_agent_solo, llm_agent_gt
  • Users: user_simulator, dummy_user

Note: SIVA uses the tau2-bench approach with LLM-based user simulators that generate responses dynamically based on task instructions, eliminating the need for hardcoded response logic.

🚀 Next.js Dashboard & Modern Backend

Overview

SIVA now includes a modern Next.js dashboard that provides a web interface for running simulations, viewing results, and monitoring the learning system. This is built on top of the tau2-bench framework.

Running the Dashboard

1. Start the tau2-bench Backend:

# Start the new tau2-bench based backend
uv run python main_tau2.py

The backend will be available at http://localhost:8000

2. Start the Next.js Dashboard:

cd frontend/nextjs
npm install   # first run only: installs the dependencies listed in package.json
npm run dev

The dashboard will be available at http://localhost:3000

3. Access the Dashboard:

  • Open http://localhost:3000 in your browser
  • Use the dashboard to run simulations, view results, and monitor performance

Dashboard Features

  • Overview: Performance metrics and recent simulations
  • Simulations: Run new simulations and view results
  • Learning: Monitor learning system status and improvements
  • Real-time Updates: Background simulation processing with status updates

API Endpoints

The new backend exposes the following REST endpoints:

  • GET /api/health - System health check
  • GET /api/domains - Available domains and agents
  • POST /api/simulations/run - Start new simulations
  • GET /api/simulations/status/{id} - Check simulation progress
  • GET /api/learning/summary - Learning system status
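
A small client for the GET endpoints might look like the sketch below. It is illustrative only: `endpoint`, `status_url`, and `get_json` are hypothetical helpers, the assumption that responses are JSON is not confirmed here, and the request body for POST /api/simulations/run is not documented in this README, so it is left out.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address given earlier in this section


def endpoint(path, base_url=BASE_URL):
    """Join the backend base URL and an API path."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def status_url(simulation_id, base_url=BASE_URL):
    """URL for polling the progress of one simulation."""
    return endpoint(f"/api/simulations/status/{simulation_id}", base_url)


def get_json(path, base_url=BASE_URL, timeout=5):
    """GET an endpoint and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(endpoint(path, base_url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires main_tau2.py to be running locally):
#     get_json("/api/health")
```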

Migration Note

⚠️ Temporary Setup: During migration, SIVA runs with two backends:

  • Legacy Backend (main.py) - Original voice agent functionality
  • Modern Backend (main_tau2.py) - New tau2-bench based system

The goal is to consolidate onto the tau2-bench backend alone once migration is complete.

🎯 Use Cases & Applications

🏥 Medical: Automated patient intake

Current Implementation: Patient intake and triage with extraction of "clinical pearls" (de-identified clinical decisions and the reasoning behind them)

Key Features:

  • Voice-driven patient intake with symptom analysis
  • 5-category routing system (Emergency, Urgent, Routine, Self-Care, Information)
    • 🚨 Emergency: Life-threatening conditions (chest pain, stroke signs, difficulty breathing)
    • ⚡ Urgent: Serious but not immediately life-threatening (high fever, severe pain)
    • 📅 Routine: Ongoing or non-urgent issues (mild symptoms, follow-ups, preventive care)
    • 🏠 Self-Care: Minor issues manageable at home (mild cold, minor headache)
    • ℹ️ Information: Questions about medication, prevention, or general health advice
  • Detection of key clinical decisions and reasoning ("clinical pearls") from expert corrections and conversation transcripts
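
The five routing categories can be modeled as an enumeration so that free-form labels are normalized in one place. This is a sketch under assumptions: `Route` and `parse_route` are hypothetical names, not SIVA's actual data model.

```python
from enum import Enum


class Route(Enum):
    """The five routing categories, in the order listed above."""
    EMERGENCY = "Emergency"
    URGENT = "Urgent"
    ROUTINE = "Routine"
    SELF_CARE = "Self-Care"
    INFORMATION = "Information"


def parse_route(label):
    """Map a free-form label such as 'self care' to a Route, or raise ValueError."""
    normalized = label.strip().lower().replace("-", " ").replace("_", " ")
    for route in Route:
        if route.value.lower().replace("-", " ") == normalized:
            return route
    raise ValueError(f"Unknown route: {label!r}")
```

Rejecting unknown labels with an error, rather than silently defaulting to a category, is a deliberate choice here because a wrong default could misroute a patient.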

Value Proposition: Captures unwritten clinical wisdom from physician conversations with zero overhead.

🔄 Self-Improvement Process

[Figure: SIVA Framework Architecture]

🔧 System Architecture

Core Components

  • Frontend: Pure HTML/JavaScript voice client with audio streaming
  • API Layer: FastAPI routes and WebSocket handlers for communication
  • Core Logic: Vector store + LLM judge + data manager for continuous improvement
  • Business Logic: Modular conversation processor with domain-specific routing
  • tau2-bench Integration: Simulation framework for dual-control agent evaluation with markdown-driven policies and task creation
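
To make the vector-store idea concrete, the sketch below retrieves the most similar stored cases by cosine similarity and uses their agreement as a confidence score. SIVA's actual retrieval and confidence logic is not described in this README, so the function names, the majority-vote rule, and the toy vectors are all assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def nearest_cases(query, cases, k=3):
    """Return the k stored (vector, route) cases most similar to the query."""
    return sorted(cases, key=lambda case: cosine(query, case[0]), reverse=True)[:k]


def predict_with_confidence(query, cases, k=3):
    """Majority route among the k nearest cases and the share of them that agree."""
    neighbors = nearest_cases(query, cases, k)
    if not neighbors:
        return None, 0.0
    route, votes = Counter(label for _, label in neighbors).most_common(1)[0]
    return route, votes / len(neighbors)


# Toy 2-D vectors stand in for real embeddings.
cases = [
    ([1.0, 0.0], "Emergency"),
    ([0.9, 0.1], "Emergency"),
    ([0.0, 1.0], "Routine"),
    ([0.1, 0.9], "Routine"),
]
route, confidence = predict_with_confidence([0.8, 0.2], cases, k=3)
```

The same neighbors returned by `nearest_cases` could double as few-shot examples in the agent prompt, and a low agreement score is a natural signal to ask for expert review.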

AI Models & Usage

🎤 Speech Processing

  • STT: OpenAI Whisper v1 (whisper-1) - Speech to text conversion
  • TTS: Cartesia Sonic-2 (sonic-2) - Natural voice synthesis

🧠 Language Models

  • Main Agent: GPT-3.5 Turbo 1106 (gpt-3.5-turbo-1106) - Conversation processing with function calling
  • LLM Judge: GPT-3.5 Turbo (gpt-3.5-turbo) - Feedback analysis and knowledge extraction
  • Embeddings: OpenAI text-embedding-3-small (text-embedding-3-small) - 1536-dimensional vectors for similarity search

📊 Dashboard Monitoring

The real-time dashboard tracks total conversations, vector store size, system accuracy, route distribution, learning progress, and recent activity. It is available at http://localhost:8000/dashboard and opens automatically when the app is started with run_voice_app.py.

📁 Project Structure

siva/
├── pyproject.toml             # Package configuration and dependencies
├── main.py                    # Legacy FastAPI server entry point
├── main_tau2.py               # New tau2-bench based backend server
├── run_voice_app.py           # Application launcher
├── serve_client.py            # Voice client server
├── config/                    # Configuration management
│   └── settings.py            # Pydantic settings with env validation
├── frontend/                  # Web interfaces
│   ├── voice_client.html      # Legacy voice interface
│   ├── dashboard.html         # Legacy performance monitoring
│   └── nextjs/                # Modern Next.js dashboard
│       ├── app/               # Next.js app router
│       ├── package.json       # Node.js dependencies
│       └── README.md          # Dashboard documentation
├── src/siva/                  # Main application code
│   ├── agent/                 # Agent implementations
│   ├── api_service/           # API services and endpoints
│   ├── data_model/            # Data models and schemas
│   ├── domains/               # Domain-specific implementations
│   ├── environment/           # Environment and simulation logic
│   ├── evaluator/             # Evaluation and metrics
│   ├── orchestrator/          # Orchestration and workflow
│   └── utils/                 # Utility functions
├── tests/                     # Test suite
├── data/simulations/          # tau2-bench simulation data
├── assets/                    # Media files
│   ├── siva_demo_10x.gif      # Demo recording
│   └── flowchart_self_learning_agent.jpeg # Architecture overview
└── siva_data/                 # Learning database + knowledge pearls

SIVA transforms voice interactions into continuously improving AI systems, capturing domain expertise and building collective intelligence across any field.
