AI Agents Architecture

This document describes the AI agents implemented in the Receipt Analyzer application.

Current Agents

1. Receipt Analyzer Agent

File: agent/receipt_analyzer_agent.py
Type: LangGraph workflow with Azure OpenAI integration
Purpose: Analyzes receipt images and extracts structured expense data

Workflow Structure

graph TD
    A[analyze_image_node] --> B[categorize_expense_node]
    B --> C[END]

Node Details

analyze_image_node

Input: Base64 encoded receipt image
Processing: Calls Azure OpenAI vision model (o4-mini)
Output: Structured JSON with receipt details
Tools Used: analyze_receipt_image

categorize_expense_node

Input: Extracted receipt information
Processing: AI-powered expense categorization
Output: Business categories, deductibility, tags
Tools Used: categorize_expense
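
The two nodes can be pictured as plain functions that each read the shared state and return a partial update. The sketch below is a minimal stand-in, not the code in agent/receipt_analyzer_agent.py: the tool bodies are stubs with hard-coded return values, the way categorize_expense_node merges its result into extracted_info is an assumption based on the output schema, and the run_workflow helper is hypothetical, standing in for the graph runtime.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

def analyze_receipt_image(image_data: str) -> Dict[str, Any]:
    """Stub for the vision tool; the real tool calls Azure OpenAI."""
    return {"merchant_name": "Store Name", "total_amount": 45.67}

def categorize_expense(receipt: Dict[str, Any]) -> Dict[str, Any]:
    """Stub for the categorization tool; the real tool is AI-powered."""
    return {"primary_category": "Groceries", "business_deductible": False}

def analyze_image_node(state: State) -> State:
    # Reads the base64 image and stores the raw analysis output.
    return {"analysis_result": analyze_receipt_image(state["image_data"])}

def categorize_expense_node(state: State) -> State:
    # Assumed behavior: nest the categorization inside the final structured data.
    receipt = dict(state["analysis_result"])
    receipt["categorization"] = categorize_expense(receipt)
    return {"extracted_info": receipt}

def run_workflow(state: State, nodes: List[Callable[[State], State]]) -> State:
    # Hypothetical runner: apply each node in order, merging its update.
    for node in nodes:
        state = {**state, **node(state)}
    return state

final_state = run_workflow(
    {"image_data": "aGVsbG8=", "messages": [], "analysis_result": {}, "extracted_info": {}},
    [analyze_image_node, categorize_expense_node],
)
```

Each node returns only the keys it changes, so the order of the list passed to run_workflow mirrors the edge from analyze_image_node to categorize_expense_node.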

State Management

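from typing import Annotated, Any, Dict, List, TypedDict

from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

class ReceiptAnalysisState(TypedDict):
    messages: Annotated[List[BaseMessage], add_messages]  # Chat history, appended via add_messages
    image_data: str                    # Base64 receipt image
    analysis_result: Dict[str, Any]    # Raw analysis output
    extracted_info: Dict[str, Any]     # Final structured data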

Tools

analyze_receipt_image

Sends the base64 image to the Azure OpenAI vision API
Extracts: merchant, date, items, amounts, payment method
Returns structured JSON, including a confidence level

categorize_expense

Analyzes merchant and items for expense classification
Determines business deductibility
Assigns budget categories and tags

Output Schema

{
  "merchant_name": "Store Name",
  "date": "2024-01-15",
  "time": "14:30",
  "total_amount": 45.67,
  "currency": "USD",
  "tax_amount": 3.50,
  "items": [
    {
      "name": "Item Name",
      "quantity": 2,
      "price": 12.99,
      "total": 25.98
    }
  ],
  "payment_method": "Card",
  "receipt_number": "12345",
  "category": "grocery",
  "categorization": {
    "primary_category": "Groceries",
    "sub_category": "Food & Beverages",
    "business_deductible": false,
    "tax_implications": "Personal expense",
    "budget_category": "Food",
    "tags": ["grocery", "food", "personal"]
  },
  "analysis_timestamp": "2024-01-15T14:30:00",
  "confidence": "high"
}
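
Because the schema repeats information (each item carries quantity, price, and total), a consumer can run cheap consistency checks before trusting a result. The following is a minimal sketch of such a check, not part of the agent itself; the function name check_receipt and the rounding tolerance are assumptions made for illustration.

```python
from datetime import date
from typing import Any, Dict, List

def check_receipt(receipt: Dict[str, Any], tolerance: float = 0.01) -> List[str]:
    """Return a list of human-readable problems; empty means none were found."""
    problems: List[str] = []
    # The schema shows dates in ISO form (YYYY-MM-DD).
    try:
        date.fromisoformat(receipt["date"])
    except (KeyError, TypeError, ValueError):
        problems.append("date is missing or not in YYYY-MM-DD form")
    # Each line total should equal quantity * price, within a rounding tolerance.
    for item in receipt.get("items", []):
        expected = item["quantity"] * item["price"]
        if abs(expected - item["total"]) > tolerance:
            problems.append(f"line total mismatch for {item['name']}")
    return problems

sample = {
    "date": "2024-01-15",
    "total_amount": 45.67,
    "items": [{"name": "Item Name", "quantity": 2, "price": 12.99, "total": 25.98}],
}
problems = check_receipt(sample)
```

A non-empty result could be surfaced to the user or used to lower the reported confidence; which of those is appropriate depends on the application.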

Agent Integration

FastAPI Integration

The agent is integrated into FastAPI via:

CopilotKit Integration (/copilotkit endpoint)
- Full conversational interface
- Supports chat-based interactions
- Integrated with frontend chat widget

Direct API Endpoint (/analyze-receipt endpoint)
- Direct receipt analysis without chat context
- Used by main receipt analyzer component
- Faster for simple analysis tasks

LangSmith Observability

All agent executions are traced in LangSmith:

Workflow-level tracing: Complete agent execution flow
Tool-level tracing: Individual tool calls and results
Image attachments: Visual receipt images in traces
Performance metrics: Token usage, latency, costs

Usage Patterns

Direct Analysis

result = receipt_analysis_graph.invoke({
    "image_data": base64_image,
    "messages": [],
    "analysis_result": {},
    "extracted_info": {}
})
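
The invoke call above expects the image as a base64 string and every state key to be present. A small helper can build that dictionary from raw bytes. This is a sketch only: build_initial_state is a hypothetical name, and it assumes image_data holds plain base64 text with no data-URL prefix.

```python
import base64
from typing import Any, Dict

def build_initial_state(image_bytes: bytes) -> Dict[str, Any]:
    """Encode raw image bytes and fill every state key with an empty default."""
    return {
        # Assumption: plain base64 text, no "data:image/...;base64," prefix.
        "image_data": base64.b64encode(image_bytes).decode("ascii"),
        "messages": [],
        "analysis_result": {},
        "extracted_info": {},
    }

state = build_initial_state(b"\x89PNG fake image bytes")
```

The returned dictionary can then be passed directly, as in receipt_analysis_graph.invoke(build_initial_state(image_bytes)).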

Chat Integration

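from copilotkit import LangGraphAgent

# Wrap the compiled graph so CopilotKit can serve it as a chat agent
receipt_analyzer_agent = LangGraphAgent(
    name="receipt_analyzer",
    description="Analyzes receipt images...",
    graph=receipt_analysis_graph,
)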

Planned Agents

2. Browser Automation Agent (Future)

Purpose: Automate expense report submission to accounting systems

Capabilities:

Navigate web interfaces
Fill forms with extracted data
Handle multi-step workflows
Screenshot verification

Integration:

Would receive output from Receipt Analyzer Agent
Could be triggered automatically or on-demand
Would provide audit trail of submissions

3. Document Classification Agent (Future)

Purpose: Classify and route different document types

Capabilities:

Distinguish receipts from invoices, statements, etc.
Route to appropriate specialized agents
Handle batch document processing

4. Compliance Verification Agent (Future)

Purpose: Verify expense compliance with company policies

Capabilities:

Check spending limits
Validate expense categories
Flag policy violations
Generate compliance reports
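
Since this agent is only planned, nothing below reflects an existing implementation. As a rough illustration of what checking spending limits and validating categories could look like against the Receipt Analyzer output, here is a sketch in which the policy table, its limits, and the function name find_violations are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical policy: per-category spending limits in the receipt's currency.
POLICY_LIMITS: Dict[str, float] = {"Groceries": 100.00, "Travel": 500.00}

def find_violations(receipt: Dict[str, Any], limits: Dict[str, float]) -> List[str]:
    """Flag an unknown category or a total above the category's limit."""
    category = receipt.get("categorization", {}).get("primary_category")
    if category not in limits:
        return [f"category {category!r} is not covered by policy"]
    if receipt["total_amount"] > limits[category]:
        return [
            f"total {receipt['total_amount']:.2f} exceeds "
            f"{category} limit {limits[category]:.2f}"
        ]
    return []

violations = find_violations(
    {"total_amount": 45.67, "categorization": {"primary_category": "Groceries"}},
    POLICY_LIMITS,
)
```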

Agent Development Guidelines

Creating New Agents

Define State Schema: Create TypedDict for agent state
Design Workflow: Plan node structure and data flow
Implement Tools: Create individual @tool functions
Build Graph: Assemble nodes and edges
Add Tracing: Integrate LangSmith observability
Test Integration: Verify with FastAPI endpoints

Best Practices

Atomic Tools: Keep tools focused on single responsibilities
Error Handling: Graceful fallbacks for tool failures
State Management: Clear state transitions between nodes
Observability: Comprehensive tracing and logging
Testing: Unit tests for tools, integration tests for workflows
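
One way to get graceful fallbacks is to wrap each tool so that a failure produces a well-formed placeholder result instead of aborting the workflow. The sketch below shows the idea under stated assumptions: with_fallback, flaky_categorize, and the "Uncategorized" placeholder are hypothetical names chosen for illustration, not part of the existing agent.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("receipt_agent")

ToolFn = Callable[..., Dict[str, Any]]

def with_fallback(tool: ToolFn, fallback: Dict[str, Any]) -> ToolFn:
    """Wrap a tool so an exception yields a copy of `fallback` plus the error text."""
    def wrapper(*args: Any, **kwargs: Any) -> Dict[str, Any]:
        try:
            return tool(*args, **kwargs)
        except Exception as exc:  # broad on purpose: any tool failure degrades gracefully
            logger.warning("tool %s failed: %s", getattr(tool, "__name__", "tool"), exc)
            return {**fallback, "error": str(exc)}
    return wrapper

def flaky_categorize(receipt: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a tool whose model call fails.
    raise RuntimeError("model unavailable")

safe_categorize = with_fallback(
    flaky_categorize, {"primary_category": "Uncategorized", "tags": []}
)
result = safe_categorize({"merchant_name": "Store Name"})
```

Keeping the error text in the result lets downstream nodes and traces see that the value is a placeholder rather than a real categorization.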

Azure OpenAI Integration

Model Compatibility: Ensure o4-mini parameter compatibility
Rate Limiting: Implement appropriate retry logic
Cost Optimization: Monitor token usage and costs
Error Recovery: Handle API failures gracefully
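
Retry logic for rate-limited or transient API failures is commonly written as exponential backoff with a little jitter. The following is a generic sketch using only the standard library; call_with_retry and its parameters are hypothetical, and the exception types to retry on must be chosen by the caller to match whatever the API client actually raises.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")

def call_with_retry(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller handle the failure
            # Delay doubles each attempt, plus up to 10% random jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter is injectable so that tests can record the delays instead of actually waiting.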

LangSmith Integration

Trace Everything: Workflow, tools, and external API calls
Attach Media: Images, documents, and rich content
Tag Runs: Consistent tagging for filtering and analysis
Evaluate Quality: Regular evaluation runs for quality assessment
