OEWS Data Agent - Implementation Complete

Multi-agent system for querying OEWS employment data using natural language, built with LangGraph, LangChain, and FastAPI.

Architecture

The system uses a Planner-Executor pattern with specialized sub-agents:

Planner - Creates execution plan from user query (DeepSeek-R1 reasoning model)
Executor - Routes to appropriate agents based on plan
Cortex Researcher (Text2SQL) - Queries OEWS database with secure parameterized queries
Chart Generator - Creates chart specifications for visualizations
Synthesizer - Creates text summaries of findings
Response Formatter - Formats final JSON response for API

Features Implemented

✅ Milestone 1: Foundation

Task 1.2: Schema metadata with LLM-optimized descriptions
Task 1.3: YAML-based LLM configuration (DeepSeek, GPT-4o, Ollama support)
Task 1.4: Multi-provider LLM factory (Azure AI, OpenAI, Anthropic, Ollama)

✅ Milestone 2: Secure Database Tools

Task 2.1: Parameterized query tools with SQL injection protection
- get_schema_info, validate_sql, execute_sql_query
- search_areas, search_occupations

✅ Milestone 5-6: State Management & Workflow

Task 5.1: LangGraph state management with MessagesState
Task 6.3: Executor node with routing logic and replan handling
Task 6.4: Complete workflow assembly with all agents

✅ Milestone 7: FastAPI Application

Task 7.1: Pydantic models and REST endpoints
- /api/v1/query - Process natural language queries
- /api/v1/models - List available LLM models
- /health - Health check endpoint
Task 7.2: Server entry point and startup scripts

✅ Enhancement: Production Readiness

Task 8.1: Large result set handling (>1000 rows auto-summarized)

Quick Start

1. Install Dependencies

pip install -e .

2. Set Environment Variables

# Required for Azure AI models (DeepSeek)
export AZURE_AI_API_KEY="your-key-here"
export AZURE_AI_ENDPOINT="https://your-endpoint.azure.com"

# Or use OpenAI
export OPENAI_API_KEY="your-key-here"

# Database configuration
export DATABASE_ENV="dev"  # or "prod" for Azure SQL
export SQLITE_DB_PATH="data/oews.db"

3. Start the API Server

# Using the startup script
./scripts/start_server.sh

# Or directly with Python
python -m src.main

The API will be available at http://localhost:8000

4. Test the API

# Health check
curl http://localhost:8000/health

# Query example
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What are the median salaries for software developers in Seattle?",
    "enable_charts": false
  }'

# List available models
curl http://localhost:8000/api/v1/models

API Documentation

Interactive API docs available at:

Swagger UI: http://localhost:8000/docs
ReDoc: http://localhost:8000/redoc

Project Structure

src/
├── agents/           # LangGraph workflow nodes
│   ├── planner.py           # Plan creation
│   ├── executor.py          # Agent routing
│   ├── text2sql_agent.py    # Database queries
│   ├── chart_generator.py   # Visualization specs
│   ├── response_formatter.py # API output
│   └── state.py             # Workflow state
├── api/              # FastAPI application
│   ├── endpoints.py         # REST endpoints
│   └── models.py            # Pydantic models
├── config/           # Configuration
│   ├── llm_config.py        # Model registry
│   └── llm_factory.py       # LLM instantiation
├── database/         # Data layer
│   ├── connection.py        # DB abstraction
│   └── schema.py            # Schema metadata
├── prompts/          # Prompt templates
│   ├── planner_prompts.py
│   └── executor_prompts.py
├── tools/            # LangChain tools
│   └── database_tools.py    # Secure SQL tools
├── workflow/         # LangGraph assembly
│   └── graph.py             # Workflow graph
└── main.py           # Server entry point

config/
└── llm_models.yaml   # Model configuration

tests/                # Test suite
├── test_database.py
├── test_schema.py
├── test_llm_config.py
├── test_llm_factory.py
└── test_database_tools.py

Security Features

Parameterized Queries: All SQL queries use ? placeholders to prevent SQL injection
SQL Validation: Dangerous operations (DROP, DELETE, etc.) are rejected
Connection Pooling: Production-ready database connection management
API Key Management: Secure environment variable configuration

Configuration

Edit config/llm_models.yaml to:

Add/remove LLM models
Change default reasoning/implementation models
Configure cost tracking and model parameters

Testing

# Run all tests
pytest

# Run specific test file
pytest tests/test_database_tools.py -v

# Run with coverage
pytest --cov=src --cov-report=html

Git History

Recent commits implementing the plan:

aed862f - FastAPI endpoints, server entry point, large result handling
abe032d - Executor node, workflow assembly, agents
138568e - Secure database tools with parameterized queries
44b934f - LLM factory for multi-provider support
c4e790f - YAML-based LLM configuration
ab16ba2 - Schema metadata with security notes
e817980 - Database connection with parameterized queries

Next Steps

Populate Database: Import OEWS data using existing CLI tools
Add Tests: Write integration tests for workflow execution
Web Research: Implement Tavily integration for external data
Frontend: Build Next.js UI consuming the API
Deployment: Deploy to Azure with production SQL database

Architecture Decisions

LangGraph for workflow orchestration (vs. custom state machine)
Planner-Executor pattern for flexible multi-agent routing
Parameterized queries for security (vs. string formatting)
Large result summarization for performance (>1000 rows)
YAML configuration for model flexibility (vs. hardcoded)
ReAct agents for tool-using agents (Text2SQL, Charts)

Model Usage Tracking

The system tracks which models are used for each agent:

Planner: reasoning model (default: DeepSeek-R1)
Agents: implementation model (default: DeepSeek-V3)
Can override per-request via API

Metadata includes:

Models used per agent
Execution time
Plan structure
Number of replans

Performance Optimizations

Workflow Graph Initialization: Created once at startup (lifespan context)
Connection Pooling: Reuses database connections in production
Large Result Handling: Auto-summarizes queries returning >1000 rows
Agent Caching: Agent instances reused across requests

Support

For issues or questions:

Check API documentation at /docs
Review test files for usage examples
See plan analysis in docs/plans/

Name		Name	Last commit message	Last commit date
Latest commit History 119 Commits
.github/workflows		.github/workflows
config		config
data		data
docs		docs
scripts		scripts
src		src
tests		tests
.dockerignore		.dockerignore
.env.template		.env.template
.gitignore		.gitignore
Caddyfile		Caddyfile
Dockerfile		Dockerfile
LICENSE		LICENSE
QUICK_START.md		QUICK_START.md
README.md		README.md
database_schema.sql		database_schema.sql
docker-compose.local.yml		docker-compose.local.yml
docker-compose.yml		docker-compose.yml
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
uv.lock		uv.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

OEWS Data Agent - Implementation Complete

Architecture

Features Implemented

✅ Milestone 1: Foundation

✅ Milestone 2: Secure Database Tools

✅ Milestone 5-6: State Management & Workflow

✅ Milestone 7: FastAPI Application

✅ Enhancement: Production Readiness

Quick Start

1. Install Dependencies

2. Set Environment Variables

3. Start the API Server

4. Test the API

API Documentation

Project Structure

Security Features

Configuration

Testing

Git History

Next Steps

Architecture Decisions

Model Usage Tracking

Performance Optimizations

Support

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 2

Uh oh!

Languages

License

varunr89/oews

Folders and files

Latest commit

History

Repository files navigation

OEWS Data Agent - Implementation Complete

Architecture

Features Implemented

✅ Milestone 1: Foundation

✅ Milestone 2: Secure Database Tools

✅ Milestone 5-6: State Management & Workflow

✅ Milestone 7: FastAPI Application

✅ Enhancement: Production Readiness

Quick Start

1. Install Dependencies

2. Set Environment Variables

3. Start the API Server

4. Test the API

API Documentation

Project Structure

Security Features

Configuration

Testing

Git History

Next Steps

Architecture Decisions

Model Usage Tracking

Performance Optimizations

Support

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 2

Uh oh!

Languages

Packages