Integrate emr-demo.mdx into the main EMR playbook and userguide. #297

@kovtcharov

Description


---
title: "Building Production AI Agents with GAIA SDK"
description: "Medical Intake Agent Demo - From Concept to Production in Minutes"
duration: "15 minutes"
audience: "Technical"
---

Building Production AI Agents with GAIA SDK

Medical Intake Agent: From Concept to Production in Minutes


The Problem: Manual Data Entry at Scale

The Manual Process Pain Points

| Task | Time | Friction |
|---|---|---|
| Locate each field on paper | 2-3 min | Scanning, cross-referencing |
| Type patient demographics | 3-4 min | Tab between 20+ fields |
| Transcribe medical history | 2-3 min | Handwriting interpretation |
| Enter insurance details | 2-3 min | Policy numbers, group IDs |
| **Total per form** | **8-12 min** | High cognitive load |
**Key Insight:** Sarah can't eliminate either side of this workflow: she needs the paper forms for compliance and the EMR for billing. The gap between them is the transcription, and that gap is what an agent can automate.

The Solution: Build It in 50 Lines with GAIA SDK

The GAIA SDK reduces a production-grade AI agent to roughly 50 lines of application code.

```python
from gaia.agents.base import Agent
from gaia.agents.base.tools import tool
from gaia.database import DatabaseMixin
from gaia.utils import FileWatcherMixin

class MedicalIntakeAgent(Agent, DatabaseMixin, FileWatcherMixin):
    """Agent that processes intake forms automatically."""

    def __init__(self, watch_dir="./intake_forms", db_path="./patients.db"):
        super().__init__()
        self.init_database(db_path, schema=PATIENT_SCHEMA)
        self.start_watching(watch_dir, patterns=["*.png", "*.jpg", "*.pdf"])

    def _on_file_created(self, path):
        """Triggered automatically when new file appears."""
        data = self._extract_with_vlm(path)  # VLM extraction
        self._store_patient(data)            # Save to database

    @tool
    def search_patients(self, query: str) -> dict:
        """Search patients by name, DOB, allergies, etc."""
        return self.query("SELECT * FROM patients WHERE ...")
```

That's it. GAIA handles file watching, VLM integration, database ops, and tool calling.


What GAIA SDK Provides

Pre-Built Components

```python
# Agent Base Classes
from gaia.agents.base import Agent          # Core agent class
from gaia.agents.base import MCPAgent       # MCP protocol support
from gaia.agents.base import ApiAgent       # OpenAI-compatible API

# Mixins (Add via inheritance)
from gaia.database import DatabaseMixin     # SQLite integration
from gaia.utils import FileWatcherMixin     # File monitoring
from gaia.rag import RAGToolsMixin          # Vector search
from gaia.shell import CLIToolsMixin        # Shell execution

# LLM/VLM Clients
from gaia.llm import LLMClient              # Text generation
from gaia.llm.vlm_client import VLMClient   # Vision understanding
from gaia.audio import WhisperASR           # Speech-to-text
from gaia.audio import KokoroTTS            # Text-to-speech

# Decorators & Utils
from gaia.agents.base.tools import tool       # Auto tool registration
from gaia.utils import compute_file_hash      # File utilities
from gaia.utils import extract_json_from_text # JSON parsing
```

What This Means

| You Write | GAIA Provides |
|---|---|
| Business logic | Infrastructure |
| Domain-specific tools | Generic tool framework |
| Data schema | Database management |
| VLM prompts | VLM client & retry logic |
| Agent behavior | Agent lifecycle & orchestration |

Architecture: Agent + Mixins + Tools

Full System Architecture

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart TD
    subgraph Lemonade["Lemonade Server (AMD Ryzen AI NPU/GPU)"]
        L1(["Qwen3-VL-4B<br/>Vision Model"])
        L2(["Qwen3-Coder-30B<br/>Language Model"])
    end

    Lemonade <-.->|"Local REST API"| Pipeline

    subgraph Pipeline["GAIA Medical Intake Agent"]
        FW[/"FileWatcherMixin<br/>Monitors ./intake_forms"/] -->|"New file"| P1
        P1["1. File Read<br/>(retry logic)"] --> P2["2. Duplicate Check<br/>(SHA-256 hash)"]
        P2 --> P3["3. Image Optimize<br/>(resize, compress)"]
        P3 --> P4["4. VLM Extraction<br/>(call Lemonade)"]
        P4 --> P5["5. JSON Parse<br/>(validate fields)"]
        P5 --> P6["6. Patient Match<br/>(SQL queries)"]
        P6 --> P7[("7. Database Save<br/>(SQLite)")]
    end

    P7 --> Dashboard[/"Dashboard UI<br/>(Real-time SSE)"/]

    style FW fill:#F4484D,stroke:#ED1C24,stroke-width:3px,color:#fff
    style P1 fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style P2 fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style P3 fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style P4 fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style P5 fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style P6 fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style P7 fill:#6c757d,stroke:#495057,stroke-width:2px,color:#fff
    style Dashboard fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style L1 fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style L2 fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style Lemonade fill:none,stroke:#666,stroke-width:2px,stroke-dasharray: 5 5
    style Pipeline fill:none,stroke:#ED1C24,stroke-width:3px,stroke-dasharray: 5 5

    linkStyle 0,1,2,3,4,5,6,7,8,9 stroke:#ED1C24,stroke-width:2px

Three main components:

  1. Lemonade Server - Local AI inference (VLM + LLM)
  2. GAIA Agent - Application logic with FileWatcher, image processing, and database ops
  3. Dashboard - Real-time UI with SSE updates

GAIA SDK Value Proposition

| Without GAIA | With GAIA SDK | Benefit |
|---|---|---|
| 200+ lines VLM boilerplate | `self._vlm.process(image)` | 10x faster development |
| Manual file polling | `FileWatcherMixin` | Built-in retry & debounce |
| Raw SQL management | `DatabaseMixin` | Connection pooling & safety |
| Custom JSON schemas | `@tool` decorator | Auto-generated from types |
| Exception handling | Built-in retry logic | Production-ready errors |

Time to Production: Hours, not weeks.

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart LR
    Form[/"📄 Intake Form"/] --> FW["FileWatcherMixin<br/>Auto-detects file"]
    FW -->|"Triggers"| Agent["MedicalIntakeAgent"]
    Agent --> Tools

    subgraph Tools["Agent Tools"]
        T1["@tool extract_form()"]
        T2["@tool save_patient()"]
        T3["@tool search_patients()"]
    end

    T1 <-->|"VLM API"| VLM["Lemonade<br/>Qwen3-VL-4B"]
    T2 --> DB["DatabaseMixin<br/>SQLite"]
    T3 --> DB

    DB --> UI[/"Dashboard"/]

    style FW fill:#F4484D,stroke:#ED1C24,stroke-width:3px,color:#fff
    style Agent fill:#ED1C24,stroke:#C8171E,stroke-width:3px,color:#fff
    style T1 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T2 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T3 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style VLM fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style DB fill:#6c757d,stroke:#495057,stroke-width:2px,color:#fff
    style Tools fill:none,stroke:#28a745,stroke-width:2px,stroke-dasharray: 5 5

    linkStyle 0,1,2,3,4,5,6,7 stroke:#ED1C24,stroke-width:2px

Key: FileWatcher triggers Agent → Agent calls Tools → Tools use Lemonade VLM & Database


The Agent Pattern: Composition Over Configuration

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart TD
    subgraph AgentClass["MedicalIntakeAgent Class"]
        Base["Agent (Base Class)"]
        DB["DatabaseMixin"]
        FW["FileWatcherMixin"]

        Base --> Methods
        DB --> Methods
        FW --> Methods

        Methods["Core Methods:<br/>_on_file_created()<br/>_extract_with_vlm()<br/>_find_existing_patient()"]
    end

    Methods --> Tools["Registered Tools"]

    subgraph Tools
        T1["@tool search_patients()"]
        T2["@tool get_patient()"]
        T3["@tool get_intake_stats()"]
    end

    FW -->|"File detected"| Trigger["Auto-triggers<br/>_on_file_created()"]
    Trigger --> VLM["Call Lemonade<br/>VLM API"]
    VLM --> Parse["Parse JSON<br/>& Validate"]
    Parse --> DB2["DatabaseMixin<br/>save to SQLite"]

    Tools --> LLM["Natural Language<br/>Queries via LLM"]

    style Base fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style DB fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style FW fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style Methods fill:#6c757d,stroke:#495057,stroke-width:2px,color:#fff
    style T1 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T2 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style T3 fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff
    style AgentClass fill:none,stroke:#ED1C24,stroke-width:3px,stroke-dasharray: 5 5

    linkStyle 0,1,2,3,4,5,6,7,8,9 stroke:#ED1C24,stroke-width:2px

Key Architecture Decisions:

  • Multiple Inheritance: Agent + DatabaseMixin + FileWatcherMixin
  • Tools Auto-Register: @tool decorator automatically exposes to LLM
  • Event-Driven: FileWatcher triggers _on_file_created() automatically

Key Pattern 1: FileWatcherMixin

Problem: Monitor directories for new files with retry logic and debouncing.

Without GAIA (50+ lines):

```python
import time
from watchdog.observers import Observer
# Handle race conditions, file locking, duplicate events...
# Manual retry logic, debounce timers, pattern matching...
```

With GAIA (3 lines):

```python
class MyAgent(Agent, FileWatcherMixin):
    def __init__(self):
        super().__init__()
        self.start_watching("./intake_forms", patterns=["*.png", "*.pdf"])

    def _on_file_created(self, path):
        self.process_form(path)  # File is ready, no race conditions
```

Built-in features:

  • Retry with exponential backoff (Windows file locking)
  • Duplicate event debouncing
  • Glob pattern matching
  • Thread-safe operations
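
For intuition, the duplicate-event debouncing the mixin handles can be sketched in a few lines of plain Python. This is an illustrative stand-in, not GAIA's actual implementation; the class name and 2-second window are assumptions:

```python
import time

class Debouncer:
    """Suppress duplicate file events fired within a short window
    (illustrative sketch of what FileWatcherMixin handles for you)."""

    def __init__(self, window_seconds=2.0):
        self.window = window_seconds
        self._last_seen = {}  # path -> timestamp of last accepted event

    def should_process(self, path, now=None):
        now = time.monotonic() if now is None else now
        last = self._last_seen.get(path)
        if last is not None and (now - last) < self.window:
            return False  # duplicate event inside the window: skip it
        self._last_seen[path] = now
        return True
```

File watchers commonly fire several create/modify events for one physical write, which is why some form of this logic is needed before any processing starts.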

Key Pattern 2: DatabaseMixin

Problem: Persistent storage with connection pooling and SQL safety.

Without GAIA (100+ lines):

```python
import sqlite3
# Manual connection management, thread safety, migrations...
```

With GAIA (5 lines):

```python
class MyAgent(Agent, DatabaseMixin):
    def __init__(self):
        super().__init__()
        self.init_database("data.db", schema=PATIENT_SCHEMA)

    def find_patient(self, name):
        return self.query(
            "SELECT * FROM patients WHERE last_name = :name",
            {"name": name}  # Parameterized - SQL injection safe
        )
```

Built-in features:

  • Connection pooling
  • Parameterized queries (prevents SQL injection)
  • Schema initialization
  • Transaction management

Key Pattern 3: @tool Decorator

Problem: Expose Python functions to LLM for tool calling.

Without GAIA (50+ lines per tool):

```python
# Manually define JSON schema
# Register with LLM client
# Parse tool calls
# Validate arguments...
```

With GAIA (decorator magic):

```python
@tool
def search_patients(self, query: str, limit: int = 10) -> dict:
    """Search for patients by name, DOB, or other fields."""
    results = self.query(
        "SELECT * FROM patients WHERE last_name LIKE :q LIMIT :limit",
        {"q": f"%{query}%", "limit": limit}
    )
    return {"count": len(results), "patients": results}
```

The decorator automatically:

  • Generates JSON schema from type hints
  • Registers with agent's tool registry
  • Validates arguments
  • Handles errors gracefully

Example natural language query:

```text
User:  "Show me patients with penicillin allergies"
LLM:   → Calls search_patients(query="penicillin")
Agent: → Returns 3 matching patients
LLM:   → "I found 3 patients with penicillin allergies..."
```
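
The schema-generation half of this pattern can be approximated from the standard library alone. Everything below (the helper name, the type mapping) is an illustrative assumption, not GAIA's actual decorator code:

```python
import inspect
import typing

# Minimal mapping from Python annotations to JSON-schema type names.
_PY_TO_JSON = {str: "string", int: "integer", float: "number",
               bool: "boolean", dict: "object", list: "array"}

def schema_from_signature(fn):
    """Build a JSON-schema-like tool description from type hints,
    roughly what a @tool-style decorator must do under the hood."""
    sig = inspect.signature(fn)
    hints = typing.get_type_hints(fn)
    properties, required = {}, []
    for name, param in sig.parameters.items():
        if name == "self":
            continue
        properties[name] = {"type": _PY_TO_JSON.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default -> the LLM must supply it
    return {
        "name": fn.__name__,
        "description": (fn.__doc__ or "").strip(),
        "parameters": {"type": "object", "properties": properties,
                       "required": required},
    }

def search_patients(self, query: str, limit: int = 10) -> dict:
    """Search for patients by name, DOB, or other fields."""
    ...
```

Applied to `search_patients`, this yields a schema with `query` as a required string and `limit` as an optional integer, which is exactly the shape tool-calling LLM APIs expect.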

The 7-Step VLM Pipeline

The Code Behind Each Step

```python
def _on_file_created(self, path: Path):
    """FileWatcherMixin auto-calls this when file appears."""

    # Step 1: Read file (with retry for file locking)
    file_content = self._read_file_with_retry(path)

    # Step 2: Duplicate check via hash
    file_hash = compute_file_hash(file_content)
    if self._is_duplicate(file_hash):
        return  # Skip processing

    # Step 3: Image optimization
    optimized_image = self._optimize_image(file_content)

    # Step 4-5: VLM extraction (lazy loads model on first use)
    extracted_data = self._extract_with_vlm(optimized_image)

    # Step 6: Parse & validate JSON
    patient_data = self._parse_and_validate(extracted_data)

    # Step 7: Save via DatabaseMixin
    patient_id = self._store_patient(patient_data)
    self._create_alerts(patient_id, patient_data)
```
| Step | Time | GAIA SDK Component |
|---|---|---|
| 1-2 | ~2s | `compute_file_hash()` util |
| 3 | ~1s | PIL + image utils |
| 4 | 0s | Lazy `VLMClient` init |
| 5 | 10-15s | Lemonade VLM inference |
| 6-7 | ~1s | `DatabaseMixin.query()` |
| **Total** | **~14-18s** | vs. 8-12 min manual |
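
The duplicate check in steps 1-2 hinges on nothing more exotic than a content hash. A stand-in for the `compute_file_hash()` utility (the real helper's signature may differ) looks like:

```python
import hashlib

def compute_file_hash(content: bytes) -> str:
    """SHA-256 hex digest of file content, used for duplicate
    detection (sketch of the gaia.utils helper)."""
    return hashlib.sha256(content).hexdigest()

# Step 2 then reduces to a single indexed lookup:
# if compute_file_hash(file_content) already in the database, skip.
```

Because the digest depends only on content, re-scanning or renaming the same form produces the same hash and is skipped.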

Image Optimization: Why It Matters

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart TD
    A[/"Raw Image (4000x3000)"/] --> B(["EXIF Auto-Rotate"])
    B --> C(["Resize to 1024px Max"])
    C --> D(["Pad to Square"])
    D --> E[/"Optimized JPEG (85%)"/]

    style A fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style B fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style C fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style D fill:#F4484D,stroke:#ED1C24,stroke-width:2px,color:#fff
    style E fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff

    linkStyle 0,1,2,3 stroke:#ED1C24,stroke-width:2px
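
The geometry of the resize and pad steps is plain arithmetic. The helper below sketches only that arithmetic (the agent itself does the pixel work with PIL; the function name is illustrative):

```python
def resize_and_pad_dimensions(width, height, target=1024):
    """Compute the resize + square-pad geometry from the pipeline
    above: scale the long edge to `target`, then center-pad the
    short edge out to a `target` x `target` square."""
    scale = target / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    pad_x = (target - new_w) // 2  # left/right padding
    pad_y = (target - new_h) // 2  # top/bottom padding
    return (new_w, new_h), (pad_x, pad_y)
```

A 4000x3000 scan becomes 1024x768 with 128px of padding top and bottom, so the VLM always sees the same square input regardless of the scanner used.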

Token Budget

| Component | Tokens | Notes |
|---|---|---|
| Image (1024x1024) | ~5,000 | 14x14 pixel patches |
| Extraction prompt | ~400 | 50+ field definitions |
| JSON output | ~800 | Filled form data |
| **Total** | **~6,200** | Fits in 8K context |
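
The ~5,000-token image figure follows from the patch grid. A back-of-envelope sketch, ignoring any patch merging the model may apply internally:

```python
def image_token_estimate(side=1024, patch=14):
    """Rough token count for a square image: one token per
    14x14 pixel patch in a (side // patch)^2 grid."""
    per_side = side // patch  # 1024 // 14 = 73 patches per side
    return per_side * per_side
```

That gives 73 x 73 = 5,329 patches, which the table rounds to ~5,000.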

VLM Integration: GAIA's VLMClient

The GAIA SDK provides a unified VLM client abstraction:

```python
from gaia.llm.vlm_client import VLMClient

class MedicalIntakeAgent(Agent):
    def __init__(self, vlm_model="Qwen3-VL-4B-Instruct-GGUF"):
        super().__init__()
        self._vlm_model = vlm_model
        self._vlm = None  # Lazy load on first use

    def _extract_with_vlm(self, image_path: Path) -> dict:
        # Lazy initialize VLM (expensive operation)
        if self._vlm is None:
            self._vlm = VLMClient(model=self._vlm_model)

        # Optimize image (handled by GAIA utils)
        optimized = self._optimize_image(image_path)

        # Call VLM with structured prompt
        response = self._vlm.process(
            image=optimized,
            prompt=EXTRACTION_PROMPT,
            max_tokens=2048
        )

        # Parse JSON from VLM response
        return extract_json_from_text(response)
```

What GAIA handles for you:

  • Model loading and caching
  • Connection to Lemonade Server
  • Error recovery and retry logic
  • Image token counting
  • JSON extraction from free-form text
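
The last bullet, pulling JSON out of free-form model output, can be approximated with brace matching. This is a sketch of what a helper like `extract_json_from_text` must do, not GAIA's actual implementation (which is likely more robust, e.g. to braces inside strings):

```python
import json

def extract_json_from_text(text: str):
    """Return the first balanced {...} object found in free-form
    text, parsed as JSON, or None if no object is found."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:  # matched the opening brace
                return json.loads(text[start:i + 1])
    return None  # unbalanced braces: no complete object
```

This matters because VLMs occasionally wrap the JSON in prose ("Here is the extracted data: ...") even when told to return only the object.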
%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
flowchart LR
    A[/"Intake Form"/] --> B["VLMClient.process()"]
    B <-->|"REST API"| C["Lemonade Server<br/>Qwen3-VL-4B"]
    C --> D["Structured JSON"]

    style A fill:#f8f9fa,stroke:#dee2e6,stroke-width:2px,color:#495057
    style B fill:#ED1C24,stroke:#C8171E,stroke-width:3px,color:#fff
    style C fill:#ED1C24,stroke:#C8171E,stroke-width:2px,color:#fff
    style D fill:#28a745,stroke:#1e7e34,stroke-width:2px,color:#fff

    linkStyle 0,1,2 stroke:#ED1C24,stroke-width:2px

The Extraction Prompt (Excerpt)

```text
You are a medical data extraction system. Extract ALL patient
information from this intake form image.

Return a JSON object with ALL fields you can extract:

REQUIRED:
- "first_name": patient's first name
- "last_name": patient's last name

PATIENT INFO:
- "date_of_birth": YYYY-MM-DD format
- "gender": Male/Female/Other
- "ssn": Social Security Number (XXX-XX-XXXX)
- "phone", "mobile_phone", "email"
- "address", "city", "state", "zip_code"

EMERGENCY CONTACT:
- "emergency_contact_name"
- "emergency_contact_relationship"
- "emergency_contact_phone"

PRIMARY INSURANCE:
- "insurance_provider": insurance company name
- "insurance_id": policy number
- "insurance_group_number"

MEDICAL HISTORY:
- "reason_for_visit": chief complaint
- "allergies": known allergies
- "medications": current medications

IMPORTANT:
- Return ONLY the JSON object, no other text
- Use null for fields that exist but are blank
- Dates must be in YYYY-MM-DD format
```

Example VLM Output

```json
{
  "first_name": "Alice",
  "last_name": "Williams",
  "date_of_birth": "1980-04-04",
  "gender": "Female",
  "phone": "(411) 413-1234",
  "mobile_phone": "(411) 555-7890",
  "email": "alice.williams@hotmail.com",
  "address": "123 Oak Street",
  "city": "Springfield",
  "state": "IL",
  "zip_code": "62701",
  "emergency_contact_name": "Robert Williams",
  "emergency_contact_relationship": "Spouse",
  "emergency_contact_phone": "(411) 413-5678",
  "insurance_provider": "Medicaid Demo",
  "insurance_id": "MCD-2024-87654",
  "insurance_group_number": "GRP-IL-001",
  "reason_for_visit": "Lower back pain for 3 weeks",
  "allergies": "Penicillin, Sulfa drugs",
  "medications": "Lisinopril 10mg daily, Metformin 500mg",
  "marital_status": "Married",
  "employment_status": "Employed",
  "employer": "Springfield Elementary School",
  "signature_date": "2024-01-15"
}
```

34 fields extracted in 14.2 seconds from a single scanned image.

What Makes VLM Different from OCR?

| Traditional OCR | Vision Language Model |
|---|---|
| Extracts text blobs | Extracts structured data |
| Needs template zones | Handles any form layout |
| Fails on handwriting | Interprets context |
| No semantic understanding | Knows "DOB" = date of birth |

Database Schema: Flexible by Design

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
erDiagram
    PATIENTS ||--o{ ALERTS : has
    PATIENTS ||--o{ INTAKE_SESSIONS : has

    PATIENTS {
        int id PK
        text first_name
        text last_name
        text date_of_birth
        text phone
        text email
        text insurance_provider
        text allergies
        text medications
        text additional_fields
        blob file_content
        text file_hash UK
    }

    ALERTS {
        int id PK
        int patient_id FK
        text alert_type
        text message
        int acknowledged
    }

    INTAKE_SESSIONS {
        int id PK
        int patient_id FK
        int is_new_patient
        real processing_time_seconds
        text changes_detected
    }

Returning Patient Detection: Simple SQL

No embeddings needed - straightforward SQL matching works reliably.

```python
from typing import Optional

def _find_existing_patient(self, data: dict) -> Optional[dict]:
    """Check if patient already exists via DatabaseMixin."""

    # Strategy 1: Match on name + DOB (most reliable)
    if data.get("date_of_birth"):
        results = self.query(
            """SELECT * FROM patients
               WHERE first_name = :fn AND last_name = :ln
               AND date_of_birth = :dob
               ORDER BY created_at DESC LIMIT 1""",
            {"fn": data["first_name"],
             "ln": data["last_name"],
             "dob": data["date_of_birth"]}
        )
        if results:
            return results[0]

    # Strategy 2: Fallback to name-only
    results = self.query(
        "SELECT * FROM patients WHERE first_name = :fn AND last_name = :ln",
        {"fn": data["first_name"], "ln": data["last_name"]}
    )
    return results[0] if results else None
```

Detects changes automatically:

  • Insurance updates
  • New medications
  • Address changes
  • Phone number updates
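
Change detection of this kind reduces to a field-by-field diff. The helper name and the tracked-field list below are illustrative assumptions, not GAIA API:

```python
# Fields worth flagging when a returning patient's form differs
# from the stored record (illustrative subset).
TRACKED_FIELDS = ["insurance_provider", "medications", "address", "phone"]

def detect_changes(existing: dict, incoming: dict) -> dict:
    """Compare a returning patient's new intake data against the
    stored record; return {field: {"old": ..., "new": ...}}."""
    changes = {}
    for field in TRACKED_FIELDS:
        old, new = existing.get(field), incoming.get(field)
        if new is not None and new != old:
            changes[field] = {"old": old, "new": new}
    return changes
```

The resulting dict maps directly onto the `changes_detected` column of `intake_sessions`.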

Tool Calling: Exposing Agent Functions to LLM

The @tool Decorator Pattern

```python
from typing import Optional

from gaia.agents.base.tools import tool

class MedicalIntakeAgent(Agent, DatabaseMixin):

    @tool
    def search_patients(self, query: str, limit: int = 10) -> dict:
        """
        Search for patients by name, DOB, or other fields.

        Args:
            query: Search query (name, DOB, allergies, etc.)
            limit: Maximum results to return

        Returns:
            Dict with matching patients
        """
        results = self.query(
            """SELECT id, first_name, last_name, date_of_birth,
                      allergies, medications
               FROM patients
               WHERE first_name LIKE :q OR last_name LIKE :q
                  OR allergies LIKE :q
               LIMIT :limit""",
            {"q": f"%{query}%", "limit": limit}
        )
        return {"count": len(results), "patients": results}

    @tool
    def get_patient(self, patient_id: int) -> Optional[dict]:
        """Get full patient details by ID."""
        results = self.query(
            "SELECT * FROM patients WHERE id = :id",
            {"id": patient_id}
        )
        return results[0] if results else None

    @tool
    def get_intake_stats(self) -> dict:
        """Get processing statistics and time savings."""
        return {
            "total_patients": self._stats["total_patients"],
            "time_saved_hours": self._stats["total_time_saved"] / 3600
        }
```

How It Works

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
sequenceDiagram
    participant User
    participant Agent as MedicalIntakeAgent
    participant LLM as Lemonade (Qwen3-Coder)
    participant Tool as @tool search_patients()
    participant DB as DatabaseMixin

    User->>Agent: "Which patients have allergies?"
    Agent->>LLM: Send query + tool schemas
    LLM->>Agent: tool_call: search_patients(query="allergy")
    Agent->>Tool: Execute function
    Tool->>DB: Parameterized SQL query
    DB-->>Tool: [Alice, Bob, Charlie]
    Tool-->>Agent: Formatted results
    Agent->>LLM: Tool results
    LLM-->>User: "Found 3 patients with allergies..."

Safety: LLM never writes raw SQL. It calls predefined tools with validated parameters.


Demo: Live Walkthrough

Step 1: Launch the Dashboard

```bash
# Initialize models (first time only)
gaia-emr init

# Start the dashboard
gaia-emr dashboard
```

Step 2: Drop a Form

  1. Open the watch folder (./intake_forms/)
  2. Drag a scanned intake form (PNG, JPG, or PDF)
  3. Watch the live feed update in real-time

Step 3: Explore the Data

  • Click on the patient in the live feed
  • View all extracted fields
  • Check alerts for allergies or missing fields

Step 4: Query with Natural Language

```text
> Which patients were processed today?
> Show me patients with penicillin allergies
> Summarize the intake forms from this morning
```

Video Demo

Watch the EMR Agent in action - from file drop to extracted patient data in under 20 seconds:

<video
  controls
  className="w-full aspect-video"
  src="https://assets.amd-gaia.ai/videos/gaia-emr-agent-demo.webm"
/>
Full end-to-end demonstration showing real-time processing, dashboard updates, and patient record extraction.


The Complete Agent: Putting It Together

```python
from gaia.agents.base import Agent
from gaia.agents.base.tools import tool
from gaia.database import DatabaseMixin
from gaia.utils import FileWatcherMixin
from gaia.llm.vlm_client import VLMClient

class MedicalIntakeAgent(Agent, DatabaseMixin, FileWatcherMixin):
    """Production-grade medical intake automation in ~500 lines."""

    def __init__(self, watch_dir, db_path, vlm_model):
        super().__init__()

        # Initialize database with schema
        self.init_database(db_path, schema=PATIENT_SCHEMA)

        # Start file watching
        self.start_watching(watch_dir, patterns=["*.png", "*.jpg", "*.pdf"])

        # Lazy load VLM
        self._vlm = None
        self._vlm_model = vlm_model

    def _on_file_created(self, path):
        """Auto-triggered by FileWatcherMixin."""
        patient_data = self._process_intake_form(path)
        self._store_patient(patient_data)

    @tool
    def search_patients(self, query: str) -> dict:
        """Exposed to LLM for natural language queries."""
        return self.query("SELECT * FROM patients WHERE ...")

# That's the entire structure. GAIA provides the rest.
```

Lines of code comparison:

| Component | DIY Implementation | With GAIA SDK |
|---|---|---|
| File watching & retry | ~150 lines | 1 line (inherit mixin) |
| Database connection | ~100 lines | 1 line (inherit mixin) |
| VLM integration | ~200 lines | ~20 lines (use VLMClient) |
| Tool registration | ~50 lines per tool | 1 decorator per tool |
| **Total boilerplate** | **~500+ lines** | **~50 lines** |

Time Savings: Development + Operational ROI

%%{init: {'theme':'base', 'themeVariables': { 'primaryColor':'#ED1C24', 'primaryTextColor':'#fff', 'primaryBorderColor':'#C8171E', 'lineColor':'#F4484D', 'fontFamily': 'system-ui, -apple-system, sans-serif'}}}%%
pie showData
    title "Time per Form"
    "Manual Entry" : 600
    "AI Agent" : 15

Operational ROI

| Metric | Value |
|---|---|
| Forms processed per day | 15 |
| Manual time per form | 10 min |
| Agent time per form | 15 sec |
| Daily savings | ~2.5 hours |
| Monthly savings | ~50 hours |
| Annual cost savings | ~$52,000 |
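
The savings rows follow from simple arithmetic; a quick check (the 21-workday month is an assumption):

```python
forms_per_day = 15
manual_seconds_per_form = 10 * 60
agent_seconds_per_form = 15

# Hours of staff time recovered per day and per month.
daily_savings_hours = forms_per_day * (
    manual_seconds_per_form - agent_seconds_per_form) / 3600
monthly_savings_hours = daily_savings_hours * 21  # ~21 workdays (assumption)
```

That yields about 2.44 hours/day and roughly 51 hours/month, matching the ~2.5 and ~50 in the table.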

Development ROI

| Metric | DIY | With GAIA SDK |
|---|---|---|
| Development time | 2-3 weeks | 1-2 days |
| Lines of code | ~2,000+ | ~500 |
| Production readiness | Months of testing | Built-in patterns |
| Maintenance burden | High (custom framework) | Low (SDK updates) |

The GAIA + Lemonade Stack

Why Local AI Matters

| Factor | Cloud API | GAIA + Lemonade (Local) |
|---|---|---|
| Privacy | Data leaves device | Data stays on device |
| Latency | 500ms+ network RTT | Sub-100ms inference |
| Cost per form | $0.05-0.15 | $0.00 |
| Offline capability | Requires internet | Works offline |
| HIPAA compliance | Complex BAA required | Simplified (data never leaves) |
| Scalability | Pay per request | Fixed hardware cost |

The AMD Advantage

Lemonade Server = AMD's local inference engine optimized for Ryzen AI

  • NPU Acceleration: Dedicated neural processor for efficient VLM/LLM inference
  • Unified Memory: Fast model loading, no CPU↔GPU memory transfers
  • Power Efficiency: Runs on laptop power (15-45W), not datacenter GPUs (300W+)
  • Multi-Model: Run VLM + LLM + TTS/ASR simultaneously
  • OpenAI-Compatible: Drop-in replacement for cloud APIs

GAIA SDK = Framework for building agents on top of Lemonade

  • Pre-built mixins (Database, FileWatcher, RAG, etc.)
  • Tool decorator for LLM function calling
  • Agent orchestration patterns
  • Production-ready error handling

Beyond EMR: The Same Patterns Apply

GAIA SDK Component Library

```python
# Mix and match components for your use case

from gaia.agents.base import Agent
from gaia.agents.base.tools import tool

# Mixins (add capabilities via inheritance)
from gaia.database import DatabaseMixin
from gaia.utils import FileWatcherMixin
from gaia.rag import RAGToolsMixin
from gaia.shell import CLIToolsMixin

# Clients (call AI models)
from gaia.llm import LLMClient
from gaia.llm.vlm_client import VLMClient
from gaia.audio import WhisperASR, KokoroTTS
```

Example: Other Agent Patterns

| Use Case | Mixins | Tools | Models |
|---|---|---|---|
| Document Q&A | Agent + RAGToolsMixin | `@tool search_docs()` | LLM + Embeddings |
| Code Generation | Agent + CLIToolsMixin | `@tool run_tests()` | LLM |
| Voice Assistant | Agent + DatabaseMixin | `@tool set_reminder()` | LLM + ASR + TTS |
| Jira Automation | Agent + WebToolsMixin | `@tool create_issue()` | LLM |
| 3D Scene Gen | Agent + FileWatcherMixin | `@tool render_scene()` | LLM + Blender API |

Same SDK, endless applications.


Why Choose GAIA SDK?

The Value Proposition

  1. Rapid Development

    • Production-grade agents in hours, not weeks
    • Pre-built mixins eliminate boilerplate
    • Focus on business logic, not infrastructure
  2. Best Practices Built-In

    • Error handling with retry logic
    • Thread-safe operations
    • SQL injection prevention
    • Token budget management
  3. Local-First Architecture

    • Complete data privacy
    • Sub-100ms latency
    • Zero per-request costs
    • Works offline
  4. AMD Hardware Optimized

    • NPU/GPU acceleration via Lemonade
    • Efficient memory usage
    • Multi-model orchestration
  5. Extensible & Composable

    • Mix and match mixins
    • Build custom tools
    • Integrate existing APIs

Key Takeaways

What We Demonstrated

  1. Agent Pattern

    • Inherit from Agent base class
    • Add mixins for capabilities (Database, FileWatcher, RAG, etc.)
    • Focus on business logic, not infrastructure
  2. FileWatcherMixin

    • Automatic file monitoring with retry logic
    • Built-in debouncing and file locking handling
    • Event-driven architecture (no polling)
  3. DatabaseMixin

    • Connection pooling and thread safety
    • Parameterized queries (SQL injection safe)
    • Schema initialization and migrations
  4. @tool Decorator

    • Expose Python functions to LLM
    • Auto-generate JSON schemas from type hints
    • Enable natural language interfaces
  5. VLMClient

    • Unified API for vision models
    • Automatic token management
    • Connection to local Lemonade Server

The GAIA Value Proposition

Build production-grade AI agents in hours, not weeks:

  • ✅ 10x less boilerplate code
  • ✅ Best practices built-in
  • ✅ Local-first architecture (privacy, latency, cost)
  • ✅ AMD hardware optimized
  • ✅ Extensible and composable

Getting Started with GAIA SDK

Installation

```bash
# Install GAIA SDK
pip install amd-gaia

# Install Lemonade Server
pip install lemonade-server

# Download models for this demo
gaia-emr init
```

Build Your First Agent (5 Steps)

```python
# 1. Import GAIA components
from gaia.agents.base import Agent
from gaia.agents.base.tools import tool
from gaia.database import DatabaseMixin

# 2. Define your agent class
class MyAgent(Agent, DatabaseMixin):
    def __init__(self):
        super().__init__()
        self.init_database("my_data.db", schema=MY_SCHEMA)

    # 3. Add tools with the decorator
    @tool
    def my_tool(self, param: str) -> dict:
        """Your custom business logic."""
        return self.query("SELECT ...")

# 4. Instantiate and run
agent = MyAgent()

# 5. Query with natural language
response = agent.process_query("Find all records from today")
```

Q&A

SDK Questions

Q: Can I use other models besides Qwen?
A: Yes - GAIA works with any GGUF model via Lemonade, or cloud APIs (OpenAI, Anthropic, etc.)

Q: Does it work on non-AMD hardware?
A: Yes - CPU-only mode works everywhere. AMD NPU/GPU provides 5-10x speedup.

Q: What's the learning curve?
A: If you know Python and basic SQL, you can build agents in an afternoon.

Q: Can I deploy this in production?
A: Yes - includes Docker support, API server, monitoring, and audit trails.

EMR-Specific Questions

Q: What form layouts does it support?
A: Any medical intake form. VLMs understand context, not templates.

Q: How do I integrate with Epic/Cerner?
A: Use the REST API or sync the SQLite database via your EMR's integration API.


Resources

Get Started with GAIA SDK

EMR Agent Specifics

Installation

```bash
pip install amd-gaia lemonade-server
```

Community


License: MIT · Copyright (C) 2024-2025 Advanced Micro Devices, Inc.
