This document describes the AI agents implemented in the Receipt Analyzer application.
File: agent/receipt_analyzer_agent.py
Type: LangGraph workflow with Azure OpenAI integration
Purpose: Analyzes receipt images and extracts structured expense data
graph TD
A[analyze_image_node] --> B[categorize_expense_node]
B --> C[END]
analyze_image_node
- Input: Base64 encoded receipt image
- Processing: Calls Azure OpenAI vision model (o4-mini)
- Output: Structured JSON with receipt details
- Tools Used:
analyze_receipt_image
categorize_expense_node
- Input: Extracted receipt information
- Processing: AI-powered expense categorization
- Output: Business categories, deductibility, tags
- Tools Used:
categorize_expense
class ReceiptAnalysisState(TypedDict):
messages: Annotated[List[BaseMessage], add_messages]
image_data: str # Base64 receipt image
analysis_result: Dict[str, Any] # Raw analysis output
extracted_info: Dict[str, Any] # Final structured dataanalyze_receipt_image
- Converts base64 image to Azure OpenAI vision API call
- Extracts: merchant, date, items, amounts, payment method
- Returns structured JSON with confidence scores
categorize_expense
- Analyzes merchant and items for expense classification
- Determines business deductibility
- Assigns budget categories and tags
{
"merchant_name": "Store Name",
"date": "2024-01-15",
"time": "14:30",
"total_amount": 45.67,
"currency": "USD",
"tax_amount": 3.50,
"items": [
{
"name": "Item Name",
"quantity": 2,
"price": 12.99,
"total": 25.98
}
],
"payment_method": "Card",
"receipt_number": "12345",
"category": "grocery",
"categorization": {
"primary_category": "Groceries",
"sub_category": "Food & Beverages",
"business_deductible": false,
"tax_implications": "Personal expense",
"budget_category": "Food",
"tags": ["grocery", "food", "personal"]
},
"analysis_timestamp": "2024-01-15T14:30:00",
"confidence": "high"
}The agent is integrated into FastAPI via:
-
CopilotKit Integration (
/copilotkitendpoint)- Full conversational interface
- Supports chat-based interactions
- Integrated with frontend chat widget
-
Direct API Endpoint (
/analyze-receiptendpoint)- Direct receipt analysis without chat context
- Used by main receipt analyzer component
- Faster for simple analysis tasks
All agent executions are traced in LangSmith:
- Workflow-level tracing: Complete agent execution flow
- Tool-level tracing: Individual tool calls and results
- Image attachments: Visual receipt images in traces
- Performance metrics: Token usage, latency, costs
Direct Analysis
result = receipt_analysis_graph.invoke({
"image_data": base64_image,
"messages": [],
"analysis_result": {},
"extracted_info": {}
})Chat Integration
receipt_analyzer_agent = LangGraphAgent(
name="receipt_analyzer",
description="Analyzes receipt images...",
graph=receipt_analysis_graph,
)Purpose: Automate expense report submission to accounting systems Capabilities:
- Navigate web interfaces
- Fill forms with extracted data
- Handle multi-step workflows
- Screenshot verification
Integration:
- Would receive output from Receipt Analyzer Agent
- Could be triggered automatically or on-demand
- Would provide audit trail of submissions
Purpose: Classify and route different document types Capabilities:
- Distinguish receipts from invoices, statements, etc.
- Route to appropriate specialized agents
- Handle batch document processing
Purpose: Verify expense compliance with company policies Capabilities:
- Check spending limits
- Validate expense categories
- Flag policy violations
- Generate compliance reports
- Define State Schema: Create TypedDict for agent state
- Design Workflow: Plan node structure and data flow
- Implement Tools: Create individual @tool functions
- Build Graph: Assemble nodes and edges
- Add Tracing: Integrate LangSmith observability
- Test Integration: Verify with FastAPI endpoints
- Atomic Tools: Keep tools focused on single responsibilities
- Error Handling: Graceful fallbacks for tool failures
- State Management: Clear state transitions between nodes
- Observability: Comprehensive tracing and logging
- Testing: Unit tests for tools, integration tests for workflows
- Model Compatibility: Ensure o4-mini parameter compatibility
- Rate Limiting: Implement appropriate retry logic
- Cost Optimization: Monitor token usage and costs
- Error Recovery: Handle API failures gracefully
- Trace Everything: Workflow, tools, and external API calls
- Attach Media: Images, documents, and rich content
- Tag Runs: Consistent tagging for filtering and analysis
- Evaluate Quality: Regular evaluation runs for quality assessment