Integrate emr-demo.mdx into the main EMR playbook and userguide. #297

@kovtcharov

Description


---
title: "Building Production AI Agents with GAIA SDK"
description: "Medical Intake Agent Demo - From Concept to Production in Minutes"
duration: "15 minutes"
audience: "Technical"
---

Building Production AI Agents with GAIA SDK

Medical Intake Agent: From Concept to Production in Minutes


The Problem: Manual Data Entry at Scale

The Manual Process Pain Points

| Task | Time | Friction |
|---|---|---|
| Locate each field on paper | 2-3 min | Scanning, cross-referencing |
| Type patient demographics | 3-4 min | Tab between 20+ fields |
| Transcribe medical history | 2-3 min | Handwriting interpretation |
| Enter insurance details | 2-3 min | Policy numbers, group IDs |
| **Total per form** | **8-12 min** | High cognitive load |
**Key Insight:** Sarah can't eliminate either side of this workflow: she needs the paper forms for compliance and the EMR for billing. The gap between them is the transcription, and that gap is what an agent can automate.

The Solution: Build It in 50 Lines with GAIA SDK

The GAIA SDK reduces a production-grade AI agent to roughly 50 lines of application code.

```python
from gaia.agents.base import Agent
from gaia.agents.base.tools import tool
from gaia.database import DatabaseMixin
from gaia.utils import FileWatcherMixin

class MedicalIntakeAgent(Agent, DatabaseMixin, FileWatcherMixin):
    """Agent that processes intake forms automatically."""

    def __init__(self, watch_dir="./intake_forms", db_path="./patients.db"):
        super().__init__()
        self.init_database(db_path, schema=PATIENT_SCHEMA)
        self.start_watching(watch_dir, patterns=["*.png", "*.jpg", "*.pdf"])

    def _on_file_created(self, path):
        """Triggered automatically when new file appears."""
        data = self._extract_with_vlm(path)  # VLM extraction
        self._store_patient(data)            # Save to database

    @tool
    def search_patients(self, query: str) -> dict:
        """Search patients by name, DOB, allergies, etc."""
        return self.query("SELECT * FROM patients WHERE ...")
```

That's it. GAIA handles file watching, VLM integration, database ops, and tool calling.


What GAIA SDK Provides

Pre-Built Components

```python
# Agent Base Classes
from gaia.agents.base import Agent          # Core agent class
from gaia.agents.base import MCPAgent       # MCP protocol support
from gaia.agents.base import ApiAgent       # OpenAI-compatible API

# Mixins (Add via inheritance)
from gaia.database import DatabaseMixin     # SQLite integration
from gaia.utils import FileWatcherMixin     # File monitoring
from gaia.rag import RAGToolsMixin          # Vector search
from gaia.shell import CLIToolsMixin        # Shell execution

# LLM/VLM Clients
from gaia.llm import LLMClient              # Text generation
from gaia.llm.vlm_client import VLMClient   # Vision understanding
from gaia.audio import WhisperASR           # Speech-to-text
from gaia.audio import KokoroTTS            # Text-to-speech

# Decorators & Utils
from gaia.agents.base.tools import tool       # Auto tool registration
from gaia.utils import compute_file_hash      # File utilities
from gaia.utils import extract_json_from_text # JSON parsing
```

What This Means

| You Write | GAIA Provides |
|---|---|
| Business logic | Infrastructure |
| Domain-specific tools | Generic tool framework |
| Data schema | Database management |
| VLM prompts | VLM client & retry logic |
| Agent behavior | Agent lifecycle & orchestration |

Architecture: Agent + Mixins + Tools

Full System Architecture

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart TD
    subgraph Lemonade["Lemonade Server (AMD Ryzen AI NPU/GPU)"]
        L1(["Qwen3-VL-4B<br/>Vision Model"])
        L2(["Qwen3-Coder-30B<br/>Language Model"])
    end

    Lemonade <-.->|"Local REST API"| Pipeline

    subgraph Pipeline["GAIA Medical Intake Agent"]
        FW[/"FileWatcherMixin<br/>Monitors ./intake_forms"/] -->|"New file"| P1
        P1["1. File Read<br/>(retry logic)"] --> P2["2. Duplicate Check<br/>(SHA-256 hash)"]
        P2 --> P3["3. Image Optimize<br/>(resize, compress)"]
        P3 --> P4["4. VLM Extraction<br/>(call Lemonade)"]
        P4 --> P5["5. JSON Parse<br/>(validate fields)"]
        P5 --> P6["6. Patient Match<br/>(SQL queries)"]
        P6 --> P7[("7. Database Save<br/>(SQLite)")]
    end

    P7 --> Dashboard[/"Dashboard UI<br/>(Real-time SSE)"/]

    style FW fill:#F4484D,stroke:#ED1C24,stroke-width:3px,color:#fff
    style P1 fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style P2 fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style P3 fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style P4 fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style P5 fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style P6 fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style P7 fill:#6c757d,stroke:#495057,stroke-width:2px,color:#fff
    style Dashboard fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style L1 fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style L2 fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style Lemonade fill:none,stroke:#666,stroke-width:2px,stroke-dasharray: 5 5
    style Pipeline fill:none,stroke:#ED1C24,stroke-width:3px,stroke-dasharray: 5 5

    linkStyle 0,1,2,3,4,5,6,7,8,9 stroke:#ED1C24,stroke-width:2px

Three main components:

  1. Lemonade Server - Local AI inference (VLM + LLM)
  2. GAIA Agent - Application logic with FileWatcher, image processing, and database ops
  3. Dashboard - Real-time UI with SSE updates

GAIA SDK Value Proposition

| Without GAIA | With GAIA SDK | Benefit |
|---|---|---|
| 200+ lines VLM boilerplate | `self._vlm.process(image)` | 10x faster development |
| Manual file polling | `FileWatcherMixin` | Built-in retry & debounce |
| Raw SQL management | `DatabaseMixin` | Connection pooling & safety |
| Custom JSON schemas | `@tool` decorator | Auto-generated from types |
| Exception handling | Built-in retry logic | Production-ready errors |

Time to Production: Hours, not weeks.

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart LR
    Form[/"📄 Intake Form"/] --> FW["FileWatcherMixin<br/>Auto-detects file"]
    FW -->|"Triggers"| Agent["MedicalIntakeAgent"]
    Agent --> Tools

    subgraph Tools["Agent Tools"]
        T1["@tool extract_form()"]
        T2["@tool save_patient()"]
        T3["@tool search_patients()"]
    end

    T1 <-->|"VLM API"| VLM["Lemonade<br/>Qwen3-VL-4B"]
    T2 --> DB["DatabaseMixin<br/>SQLite"]
    T3 --> DB

    DB --> UI[/"Dashboard"/]

    style FW fill:#F4484D,stroke:#ED1C24,stroke-width:3px,color:#fff
    style Agent fill:#ED1C24,stroke:#C8171E,stroke-width:3px,color:#fff
    style T1 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T2 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T3 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style VLM fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style DB fill:#6c757d,stroke:#495057,stroke-width:2px,color:#fff
    style Tools fill:none,stroke:#28a745,stroke-width:2px,stroke-dasharray: 5 5

    linkStyle 0,1,2,3,4,5,6,7 stroke:#ED1C24,stroke-width:2px

Key: FileWatcher triggers Agent → Agent calls Tools → Tools use Lemonade VLM & Database


The Agent Pattern: Composition Over Configuration

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart TD
    subgraph AgentClass["MedicalIntakeAgent Class"]
        Base["Agent (Base Class)"]
        DB["DatabaseMixin"]
        FW["FileWatcherMixin"]

        Base --> Methods
        DB --> Methods
        FW --> Methods

        Methods["Core Methods:<br/>_on_file_created()<br/>_extract_with_vlm()<br/>_find_existing_patient()"]
    end

    Methods --> Tools["Registered Tools"]

    subgraph Tools
        T1["@tool search_patients()"]
        T2["@tool get_patient()"]
        T3["@tool get_intake_stats()"]
    end

    FW -->|"File detected"| Trigger["Auto-triggers<br/>_on_file_created()"]
    Trigger --> VLM["Call Lemonade<br/>VLM API"]
    VLM --> Parse["Parse JSON<br/>& Validate"]
    Parse --> DB2["DatabaseMixin<br/>save to SQLite"]

    Tools --> LLM["Natural Language<br/>Queries via LLM"]

    style Base fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style DB fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style FW fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style Methods fill:#6c757d,stroke:#495057,stroke-width:2px,color:#fff
    style T1 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T2 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T3 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style AgentClass fill:none,stroke:#ED1C24,stroke-width:3px,stroke-dasharray: 5 5

    linkStyle 0,1,2,3,4,5,6,7,8,9 stroke:#ED1C24,stroke-width:2px

Key Architecture Decisions:

  • Multiple Inheritance: Agent + DatabaseMixin + FileWatcherMixin
  • Tools Auto-Register: @tool decorator automatically exposes to LLM
  • Event-Driven: FileWatcher triggers _on_file_created() automatically

Key Pattern 1: FileWatcherMixin

Problem: Monitor directories for new files with retry logic and debouncing.

Without GAIA (50+ lines):

```python
import time
from watchdog.observers import Observer
# Handle race conditions, file locking, duplicate events...
# Manual retry logic, debounce timers, pattern matching...
```

With GAIA (3 lines):

```python
class MyAgent(Agent, FileWatcherMixin):
    def __init__(self):
        super().__init__()
        self.start_watching("./intake_forms", patterns=["*.png", "*.pdf"])

    def _on_file_created(self, path):
        self.process_form(path)  # File is ready, no race conditions
```

Built-in features:

  • Retry with exponential backoff (Windows file locking)
  • Duplicate event debouncing
  • Glob pattern matching
  • Thread-safe operations
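
For intuition, the duplicate-event debouncing the mixin handles can be sketched in a few lines of plain Python. This is an illustrative stand-in, not GAIA's actual implementation; the class name and 2-second window are assumptions:

```python
import time

class Debouncer:
    """Suppress duplicate file events fired within a short window
    (illustrative sketch of what FileWatcherMixin handles for you)."""

    def __init__(self, window_seconds=2.0):
        self.window = window_seconds
        self._last_seen = {}  # path -> timestamp of last accepted event

    def should_process(self, path, now=None):
        now = time.monotonic() if now is None else now
        last = self._last_seen.get(path)
        if last is not None and (now - last) < self.window:
            return False  # duplicate event inside the window: skip it
        self._last_seen[path] = now
        return True
```

File watchers commonly fire several create/modify events for one physical write, which is why some form of this logic is needed before any processing starts.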

Key Pattern 2: DatabaseMixin

Problem: Persistent storage with connection pooling and SQL safety.

Without GAIA (100+ lines):

```python
import sqlite3
# Manual connection management, thread safety, migrations...
```

With GAIA (5 lines):

```python
class MyAgent(Agent, DatabaseMixin):
    def __init__(self):
        super().__init__()
        self.init_database("data.db", schema=PATIENT_SCHEMA)

    def find_patient(self, name):
        return self.query(
            "SELECT * FROM patients WHERE last_name = :name",
            {"name": name}  # Parameterized - SQL injection safe
        )
```

Built-in features:

  • Connection pooling
  • Parameterized queries (prevents SQL injection)
  • Schema initialization
  • Transaction management

Key Pattern 3: @tool Decorator

Problem: Expose Python functions to LLM for tool calling.

Without GAIA (50+ lines per tool):

```python
# Manually define JSON schema
# Register with LLM client
# Parse tool calls
# Validate arguments...
```

With GAIA (decorator magic):

```python
@tool
def search_patients(self, query: str, limit: int = 10) -> dict:
    """Search for patients by name, DOB, or other fields."""
    results = self.query(
        "SELECT * FROM patients WHERE last_name LIKE :q LIMIT :limit",
        {"q": f"%{query}%", "limit": limit}
    )
    return {"count": len(results), "patients": results}
```

The decorator automatically:

  • Generates JSON schema from type hints
  • Registers with agent's tool registry
  • Validates arguments
  • Handles errors gracefully

Example natural language query:

```text
User:  "Show me patients with penicillin allergies"
LLM:   → Calls search_patients(query="penicillin")
Agent: → Returns 3 matching patients
LLM:   → "I found 3 patients with penicillin allergies..."
```
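
The schema-generation half of this pattern can be approximated from the standard library alone. Everything below (the helper name, the type mapping) is an illustrative assumption, not GAIA's actual decorator code:

```python
import inspect
import typing

# Minimal mapping from Python annotations to JSON-schema type names.
_PY_TO_JSON = {str: "string", int: "integer", float: "number",
               bool: "boolean", dict: "object", list: "array"}

def schema_from_signature(fn):
    """Build a JSON-schema-like tool description from type hints,
    roughly what a @tool-style decorator must do under the hood."""
    sig = inspect.signature(fn)
    hints = typing.get_type_hints(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        if name == "self":
            continue
        properties[name] = {"type": _PY_TO_JSON.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default -> the LLM must supply it
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties,
                       "required": required},
    }

def search_patients(self, query: str, limit: int = 10) -> dict:
    """Search for patients by name, DOB, or other fields."""
    ...
```

Applied to `search_patients`, this yields a schema with `query` as a required string and `limit` as an optional integer, which is exactly the shape tool-calling LLM APIs expect.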

The 7-Step VLM Pipeline

The Code Behind Each Step

```python
def _on_file_created(self, path: Path):
    """FileWatcherMixin auto-calls this when file appears."""

    # Step 1: Read file (with retry for file locking)
    file_content = self._read_file_with_retry(path)

    # Step 2: Duplicate check via hash
    file_hash = compute_file_hash(file_content)
    if self._is_duplicate(file_hash):
        return  # Skip processing

    # Step 3: Image optimization
    optimized_image = self._optimize_image(file_content)

    # Step 4-5: VLM extraction (lazy loads model on first use)
    extracted_data = self._extract_with_vlm(optimized_image)

    # Step 6: Parse & validate JSON
    patient_data = self._parse_and_validate(extracted_data)

    # Step 7: Save via DatabaseMixin
    patient_id = self._store_patient(patient_data)
    self._create_alerts(patient_id, patient_data)
```
| Step | Time | GAIA SDK Component |
|---|---|---|
| 1-2 | ~2s | `compute_file_hash()` util |
| 3 | ~1s | PIL + image utils |
| 4 | 0s | Lazy `VLMClient` init |
| 5 | 10-15s | Lemonade VLM inference |
| 6-7 | ~1s | `DatabaseMixin.query()` |
| **Total** | **~14-18s** | vs. 8-12 min manual |
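
The duplicate check in steps 1-2 hinges on nothing more exotic than a content hash. A stand-in for the `compute_file_hash()` utility (the real helper's signature may differ) looks like:

```python
import hashlib

def compute_file_hash(content: bytes) -> str:
    """SHA-256 hex digest of file content, used for duplicate
    detection (sketch of the gaia.utils helper)."""
    return hashlib.sha256(content).hexdigest()

# Step 2 then reduces to a single indexed lookup:
# if compute_file_hash(file_content) already in the database, skip.
```

Because the digest depends only on content, re-scanning or renaming the same form produces the same hash and is skipped.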

Image Optimization: Why It Matters

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart TD
    A[/"Raw Image (4000x3000)"/] --> B(["EXIF Auto-Rotate"])
    B --> C(["Resize to 1024px Max"])
    C --> D(["Pad to Square"])
    D --> E[/"Optimized JPEG (85%)"/]

    style A fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style B fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style C fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style D fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style E fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff

    linkStyle 0,1,2,3 stroke:#ED1C24,stroke-width:2px
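
The geometry of the resize and pad steps is plain arithmetic. The helper below sketches only that arithmetic (the agent itself does the pixel work with PIL; the function name is illustrative):

```python
def resize_and_pad_dimensions(width, height, target=1024):
    """Compute the resize + square-pad geometry from the pipeline
    above: scale the long edge to `target`, then center-pad the
    short edge out to a `target` x `target` square."""
    scale = target / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (target - new_w) // 2  # left/right padding
    pad_y = (target - new_h) // 2  # top/bottom padding
    return (new_w, new_h), (pad_x, pad_y)
```

A 4000x3000 scan becomes 1024x768 with 128px of padding top and bottom, so the VLM always sees the same square input regardless of the scanner used.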

Token Budget

| Component | Tokens | Notes |
|---|---|---|
| Image (1024x1024) | ~5,000 | 14x14 pixel patches |
| Extraction prompt | ~400 | 50+ field definitions |
| JSON output | ~800 | Filled form data |
| **Total** | **~6,200** | Fits in 8K context |
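
The ~5,000-token image figure follows from the patch grid. A back-of-envelope sketch, ignoring any patch merging the model may apply internally:

```python
def image_token_estimate(side=1024, patch=14):
    """Rough token count for a square image: one token per
    14x14 pixel patch in a (side // patch)^2 grid."""
    per_side = side // patch  # 1024 // 14 = 73 patches per side
    return per_side * per_side
```

That gives 73 x 73 = 5,329 patches, which the table rounds to ~5,000.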

VLM Integration: GAIA's VLMClient

The GAIA SDK provides a unified VLM client abstraction:

```python
from gaia.llm.vlm_client import VLMClient

class MedicalIntakeAgent(Agent):
    def __init__(self, vlm_model="Qwen3-VL-4B-Instruct-GGUF"):
        super().__init__()
        self._vlm_model = vlm_model
        self._vlm = None  # Lazy load on first use

    def _extract_with_vlm(self, image_path: Path) -> dict:
        # Lazy initialize VLM (expensive operation)
        if self._vlm is None:
            self._vlm = VLMClient(model=self._vlm_model)

        # Optimize image (handled by GAIA utils)
        optimized = self._optimize_image(image_path)

        # Call VLM with structured prompt
        response = self._vlm.process(
            image=optimized,
            prompt=EXTRACTION_PROMPT,
            max_tokens=2048
        )

        # Parse JSON from VLM response
        return extract_json_from_text(response)
```

What GAIA handles for you:

  • Model loading and caching
  • Connection to Lemonade Server
  • Error recovery and retry logic
  • Image token counting
  • JSON extraction from free-form text
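
The last bullet, pulling JSON out of free-form model output, can be approximated with brace matching. This is a sketch of what a helper like `extract_json_from_text` must do, not GAIA's actual implementation (which is likely more robust, e.g. to braces inside strings):

```python
import json

def extract_json_from_text(text: str):
    """Return the first balanced {...} object found in free-form
    text, parsed as JSON, or None if no object is found."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # matched the opening brace
                return json.loads(text[start:i + 1])
    return None  # unbalanced braces: no complete object
```

This matters because VLMs occasionally wrap the JSON in prose ("Here is the extracted data: ...") even when told to return only the object.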
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart LR
    A[/"Intake Form"/] --> B["VLMClient.process()"]
    B <-->|"REST API"| C["Lemonade Server<br/>Qwen3-VL-4B"]
    C --> D["Structured JSON"]

    style A fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style B fill:#ED1C24,stroke:#C8171E,stroke-width:3px,color:#fff
    style C fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style D fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff

    linkStyle 0,1,2 stroke:#ED1C24,stroke-width:2px

The Extraction Prompt (Excerpt)

```text
You are a medical data extraction system. Extract ALL patient
information from this intake form image.

Return a JSON object with ALL fields you can extract:

REQUIRED:
- "first_name": patient's first name
- "last_name": patient's last name

PATIENT INFO:
- "date_of_birth": YYYY-MM-DD format
- "gender": Male/Female/Other
- "ssn": Social Security Number (XXX-XX-XXXX)
- "phone", "mobile_phone", "email"
- "address", "city", "state", "zip_code"

EMERGENCY CONTACT:
- "emergency_contact_name"
- "emergency_contact_relationship"
- "emergency_contact_phone"

PRIMARY INSURANCE:
- "insurance_provider": insurance company name
- "insurance_id": policy number
- "insurance_group_number"

MEDICAL HISTORY:
- "reason_for_visit": chief complaint
- "allergies": known allergies
- "medications": current medications

IMPORTANT:
- Return ONLY the JSON object, no other text
- Use null for fields that exist but are blank
- Dates must be in YYYY-MM-DD format
```

Example VLM Output

```json
{
  "first_name": "Alice",
  "last_name": "Williams",
  "date_of_birth": "1980-04-04",
  "gender": "Female",
  "phone": "(411) 413-1234",
  "mobile_phone": "(411) 555-7890",
  "email": "alice.williams@hotmail.com",
  "address": "123 Oak Street",
  "city": "Springfield",
  "state": "IL",
  "zip_code": "62701",
  "emergency_contact_name": "Robert Williams",
  "emergency_contact_relationship": "Spouse",
  "emergency_contact_phone": "(411) 413-5678",
  "insurance_provider": "Medicaid Demo",
  "insurance_id": "MCD-2024-87654",
  "insurance_group_number": "GRP-IL-001",
  "reason_for_visit": "Lower back pain for 3 weeks",
  "allergies": "Penicillin, Sulfa drugs",
  "medications": "Lisinopril 10mg daily, Metformin 500mg",
  "marital_status": "Married",
  "employment_status": "Employed",
  "employer": "Springfield Elementary School",
  "signature_date": "2024-01-15"
}
```

34 fields extracted in 14.2 seconds from a single scanned image.

What Makes VLM Different from OCR?

| Traditional OCR | Vision Language Model |
|---|---|
| Extracts text blobs | Extracts structured data |
| Needs template zones | Handles any form layout |
| Fails on handwriting | Interprets context |
| No semantic understanding | Knows "DOB" = date of birth |

Database Schema: Flexible by Design

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
erDiagram
    PATIENTS ||--o{ ALERTS : has
    PATIENTS ||--o{ INTAKE_SESSIONS : has

    PATIENTS {
        int id PK
        text first_name
        text last_name
        text date_of_birth
        text phone
        text email
        text insurance_provider
        text allergies
        text medications
        text additional_fields
        blob file_content
        text file_hash UK
    }

    ALERTS {
        int id PK
        int patient_id FK
        text alert_type
        text message
        int acknowledged
    }

    INTAKE_SESSIONS {
        int id PK
        int patient_id FK
        int is_new_patient
        real processing_time_seconds
        text changes_detected
    }

Returning Patient Detection: Simple SQL

No embeddings needed - straightforward SQL matching works reliably.

```python
from typing import Optional

def _find_existing_patient(self, data: dict) -> Optional[dict]:
    """Check if patient already exists via DatabaseMixin."""

    # Strategy 1: Match on name + DOB (most reliable)
    if data.get("date_of_birth"):
        results = self.query(
            """SELECT * FROM patients
               WHERE first_name = :fn AND last_name = :ln
               AND date_of_birth = :dob
               ORDER BY created_at DESC LIMIT 1""",
            {"fn": data["first_name"],
             "ln": data["last_name"],
             "dob": data["date_of_birth"]}
        )
        if results:
            return results[0]

    # Strategy 2: Fallback to name-only
    results = self.query(
        "SELECT * FROM patients WHERE first_name = :fn AND last_name = :ln",
        {"fn": data["first_name"], "ln": data["last_name"]}
    )
    return results[0] if results else None
```

Detects changes automatically:

  • Insurance updates
  • New medications
  • Address changes
  • Phone number updates
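
Change detection of this kind reduces to a field-by-field diff. The helper name and the tracked-field list below are illustrative assumptions, not GAIA API:

```python
# Fields worth flagging when a returning patient's form differs
# from the stored record (illustrative subset).
TRACKED_FIELDS = ["insurance_provider", "medications", "address", "phone"]

def detect_changes(existing: dict, incoming: dict) -> dict:
    """Compare a returning patient's new intake data against the
    stored record; return {field: {"old": ..., "new": ...}}."""
    changes = {}
    for field in TRACKED_FIELDS:
        old, new = existing.get(field), incoming.get(field)
        if new is not None and new != old:
            changes[field] = {"old": old, "new": new}
    return changes
```

The resulting dict maps directly onto the `changes_detected` column of `intake_sessions`.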

Tool Calling: Exposing Agent Functions to LLM

The @tool Decorator Pattern

```python
from typing import Optional

from gaia.agents.base.tools import tool

class MedicalIntakeAgent(Agent, DatabaseMixin):

    @tool
    def search_patients(self, query: str, limit: int = 10) -> dict:
        """
        Search for patients by name, DOB, or other fields.

        Args:
            query: Search query (name, DOB, allergies, etc.)
            limit: Maximum results to return

        Returns:
            Dict with matching patients
        """
        results = self.query(
            """SELECT id, first_name, last_name, date_of_birth,
                      allergies, medications
               FROM patients
               WHERE first_name LIKE :q OR last_name LIKE :q
                  OR allergies LIKE :q
               LIMIT :limit""",
            {"q": f"%{query}%", "limit": limit}
        )
        return {"count": len(results), "patients": results}

    @tool
    def get_patient(self, patient_id: int) -> Optional[dict]:
        """Get full patient details by ID."""
        results = self.query(
            "SELECT * FROM patients WHERE id = :id",
            {"id": patient_id}
        )
        return results[0] if results else None

    @tool
    def get_intake_stats(self) -> dict:
        """Get processing statistics and time savings."""
        return {
            "total_patients": self._stats["total_patients"],
            "time_saved_hours": self._stats["total_time_saved"] / 3600
        }
```

How It Works

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
sequenceDiagram
    participant User
    participant Agent as MedicalIntakeAgent
    participant LLM as Lemonade (Qwen3-Coder)
    participant Tool as @tool search_patients()
    participant DB as DatabaseMixin

    User->>Agent: "Which patients have allergies?"
    Agent->>LLM: Send query + tool schemas
    LLM->>Agent: tool_call: search_patients(query="allergy")
    Agent->>Tool: Execute function
    Tool->>DB: Parameterized SQL query
    DB-->>Tool: [Alice, Bob, Charlie]
    Tool-->>Agent: Formatted results
    Agent->>LLM: Tool results
    LLM-->>User: "Found 3 patients with allergies..."

Safety: LLM never writes raw SQL. It calls predefined tools with validated parameters.


Demo: Live Walkthrough

Step 1: Launch the Dashboard

```bash
# Initialize models (first time only)
gaia-emr init

# Start the dashboard
gaia-emr dashboard
```

Step 2: Drop a Form

  1. Open the watch folder (./intake_forms/)
  2. Drag a scanned intake form (PNG, JPG, or PDF)
  3. Watch the live feed update in real-time

Step 3: Explore the Data

  • Click on the patient in the live feed
  • View all extracted fields
  • Check alerts for allergies or missing fields

Step 4: Query with Natural Language

```text
> Which patients were processed today?
> Show me patients with penicillin allergies
> Summarize the intake forms from this morning
```

Video Demo

Watch the EMR Agent in action - from file drop to extracted patient data in under 20 seconds:

<video
  controls
  className="w-full aspect-video"
  src="https://assets.amd-gaia.ai/videos/gaia-emr-agent-demo.webm"
/>
Full end-to-end demonstration showing real-time processing, dashboard updates, and patient record extraction.


The Complete Agent: Putting It Together

```python
from gaia.agents.base import Agent
from gaia.agents.base.tools import tool
from gaia.database import DatabaseMixin
from gaia.utils import FileWatcherMixin
from gaia.llm.vlm_client import VLMClient

class MedicalIntakeAgent(Agent, DatabaseMixin, FileWatcherMixin):
    """Production-grade medical intake automation in ~500 lines."""

    def __init__(self, watch_dir, db_path, vlm_model):
        super().__init__()

        # Initialize database with schema
        self.init_database(db_path, schema=PATIENT_SCHEMA)

        # Start file watching
        self.start_watching(watch_dir, patterns=["*.png", "*.jpg", "*.pdf"])

        # Lazy load VLM
        self._vlm = None
        self._vlm_model = vlm_model

    def _on_file_created(self, path):
        """Auto-triggered by FileWatcherMixin."""
        patient_data = self._process_intake_form(path)
        self._store_patient(patient_data)

    @tool
    def search_patients(self, query: str) -> dict:
        """Exposed to LLM for natural language queries."""
        return self.query("SELECT * FROM patients WHERE ...")

# That's the entire structure. GAIA provides the rest.
```

Lines of code comparison:

| Component | DIY Implementation | With GAIA SDK |
|---|---|---|
| File watching & retry | ~150 lines | 1 line (inherit mixin) |
| Database connection | ~100 lines | 1 line (inherit mixin) |
| VLM integration | ~200 lines | ~20 lines (use VLMClient) |
| Tool registration | ~50 lines per tool | 1 decorator per tool |
| **Total boilerplate** | **~500+ lines** | **~50 lines** |

Time Savings: Development + Operational ROI

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
pie showData
    title "Time per Form"
    "Manual Entry" : 600
    "AI Agent" : 15

Operational ROI

| Metric | Value |
|---|---|
| Forms processed per day | 15 |
| Manual time per form | 10 min |
| Agent time per form | 15 sec |
| Daily savings | ~2.5 hours |
| Monthly savings | ~50 hours |
| Annual cost savings | ~$52,000 |
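
The savings rows follow from simple arithmetic; a quick check (the 21-workday month is an assumption):

```python
forms_per_day = 15
manual_seconds_per_form = 10 * 60
agent_seconds_per_form = 15

# Hours of staff time recovered per day and per month.
daily_savings_hours = forms_per_day * (
    manual_seconds_per_form - agent_seconds_per_form) / 3600
monthly_savings_hours = daily_savings_hours * 21  # ~21 workdays (assumption)
```

That yields about 2.44 hours/day and roughly 51 hours/month, matching the ~2.5 and ~50 in the table.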

Development ROI

| Metric | DIY | With GAIA SDK |
|---|---|---|
| Development time | 2-3 weeks | 1-2 days |
| Lines of code | ~2,000+ | ~500 |
| Production readiness | Months of testing | Built-in patterns |
| Maintenance burden | High (custom framework) | Low (SDK updates) |

The GAIA + Lemonade Stack

Why Local AI Matters

| Factor | Cloud API | GAIA + Lemonade (Local) |
|---|---|---|
| Privacy | Data leaves device | Data stays on device |
| Latency | 500ms+ network RTT | Sub-100ms inference |
| Cost per form | $0.05-0.15 | $0.00 |
| Offline capability | Requires internet | Works offline |
| HIPAA compliance | Complex BAA required | Simplified (data never leaves) |
| Scalability | Pay per request | Fixed hardware cost |

The AMD Advantage

Lemonade Server = AMD's local inference engine optimized for Ryzen AI

  • NPU Acceleration: Dedicated neural processor for efficient VLM/LLM inference
  • Unified Memory: Fast model loading, no CPU↔GPU memory transfers
  • Power Efficiency: Runs on laptop power (15-45W), not datacenter GPUs (300W+)
  • Multi-Model: Run VLM + LLM + TTS/ASR simultaneously
  • OpenAI-Compatible: Drop-in replacement for cloud APIs

GAIA SDK = Framework for building agents on top of Lemonade

  • Pre-built mixins (Database, FileWatcher, RAG, etc.)
  • Tool decorator for LLM function calling
  • Agent orchestration patterns
  • Production-ready error handling

Beyond EMR: The Same Patterns Apply

GAIA SDK Component Library

```python
# Mix and match components for your use case

from gaia.agents.base import Agent
from gaia.agents.base.tools import tool

# Mixins (add capabilities via inheritance)
from gaia.database import DatabaseMixin
from gaia.utils import FileWatcherMixin
from gaia.rag import RAGToolsMixin
from gaia.shell import CLIToolsMixin

# Clients (call AI models)
from gaia.llm import LLMClient
from gaia.llm.vlm_client import VLMClient
from gaia.audio import WhisperASR, KokoroTTS
```

Example: Other Agent Patterns

| Use Case | Mixins | Tools | Models |
|---|---|---|---|
| Document Q&A | Agent + RAGToolsMixin | `@tool search_docs()` | LLM + Embeddings |
| Code Generation | Agent + CLIToolsMixin | `@tool run_tests()` | LLM |
| Voice Assistant | Agent + DatabaseMixin | `@tool set_reminder()` | LLM + ASR + TTS |
| Jira Automation | Agent + WebToolsMixin | `@tool create_issue()` | LLM |
| 3D Scene Gen | Agent + FileWatcherMixin | `@tool render_scene()` | LLM + Blender API |

Same SDK, endless applications.


Why Choose GAIA SDK?

The Value Proposition

  1. Rapid Development

    • Production-grade agents in hours, not weeks
    • Pre-built mixins eliminate boilerplate
    • Focus on business logic, not infrastructure
  2. Best Practices Built-In

    • Error handling with retry logic
    • Thread-safe operations
    • SQL injection prevention
    • Token budget management
  3. Local-First Architecture

    • Complete data privacy
    • Sub-100ms latency
    • Zero per-request costs
    • Works offline
  4. AMD Hardware Optimized

    • NPU/GPU acceleration via Lemonade
    • Efficient memory usage
    • Multi-model orchestration
  5. Extensible & Composable

    • Mix and match mixins
    • Build custom tools
    • Integrate existing APIs

Key Takeaways

What We Demonstrated

  1. Agent Pattern

    • Inherit from Agent base class
    • Add mixins for capabilities (Database, FileWatcher, RAG, etc.)
    • Focus on business logic, not infrastructure
  2. FileWatcherMixin

    • Automatic file monitoring with retry logic
    • Built-in debouncing and file locking handling
    • Event-driven architecture (no polling)
  3. DatabaseMixin

    • Connection pooling and thread safety
    • Parameterized queries (SQL injection safe)
    • Schema initialization and migrations
  4. @tool Decorator

    • Expose Python functions to LLM
    • Auto-generate JSON schemas from type hints
    • Enable natural language interfaces
  5. VLMClient

    • Unified API for vision models
    • Automatic token management
    • Connection to local Lemonade Server

The GAIA Value Proposition

Build production-grade AI agents in hours, not weeks:

  • ✅ 10x less boilerplate code
  • ✅ Best practices built-in
  • ✅ Local-first architecture (privacy, latency, cost)
  • ✅ AMD hardware optimized
  • ✅ Extensible and composable

Getting Started with GAIA SDK

Installation

```bash
# Install GAIA SDK
pip install amd-gaia

# Install Lemonade Server
pip install lemonade-server

# Download models for this demo
gaia-emr init
```

Build Your First Agent (5 Steps)

```python
# 1. Import GAIA components
from gaia.agents.base import Agent
from gaia.agents.base.tools import tool
from gaia.database import DatabaseMixin

# 2. Define your agent class
class MyAgent(Agent, DatabaseMixin):
    def __init__(self):
        super().__init__()
        self.init_database("my_data.db", schema=MY_SCHEMA)

    # 3. Add tools with the decorator
    @tool
    def my_tool(self, param: str) -> dict:
        """Your custom business logic."""
        return self.query("SELECT ...")

# 4. Instantiate and run
agent = MyAgent()

# 5. Query with natural language
response = agent.process_query("Find all records from today")
```

Q&A

SDK Questions

Q: Can I use other models besides Qwen?
A: Yes - GAIA works with any GGUF model via Lemonade, or cloud APIs (OpenAI, Anthropic, etc.)

Q: Does it work on non-AMD hardware?
A: Yes - CPU-only mode works everywhere. AMD NPU/GPU provides 5-10x speedup.

Q: What's the learning curve?
A: If you know Python and basic SQL, you can build agents in an afternoon.

Q: Can I deploy this in production?
A: Yes - includes Docker support, API server, monitoring, and audit trails.

EMR-Specific Questions

Q: What form layouts does it support?
A: Any medical intake form. VLMs understand context, not templates.

Q: How do I integrate with Epic/Cerner?
A: Use the REST API or sync the SQLite database via your EMR's integration API.


Resources

Get Started with GAIA SDK

EMR Agent Specifics

Installation

```bash
pip install amd-gaia lemonade-server
```

Community


License: MIT · Copyright (C) 2024-2025 Advanced Micro Devices, Inc.
