AI Cost Tracking System

What This Demonstrates

This example shows how to build an AI API usage and cost tracking system with AWS DynamoDB, Express.js, and TypeScript. The system provides:

  • Real-time cost tracking for AI API calls
  • Budget monitoring with configurable warning and blocking thresholds
  • RESTful API for usage queries
  • Web dashboard for visual usage monitoring
  • Proxy middleware that logs token usage and costs
  • Support for multiple AI providers (OpenAI, Anthropic, etc.)

Prerequisites

  • Node.js 18+ and npm
  • AWS CLI configured with appropriate credentials
  • AWS CDK CLI installed (npm install -g aws-cdk)
  • An AWS account with DynamoDB permissions

Quick Start

  1. From a clone of the repository, change into the example directory and install dependencies:

    cd 06-cost-tracking
    npm install
  2. Configure environment variables:

    cp .env.example .env
    # Edit .env with your AWS region and settings
  3. Deploy the infrastructure (optional; without it, the server uses the in-memory storage fallback):

    cd cdk
    cdk bootstrap  # First time only
    cdk deploy
  4. Start the server:

    npm run dev
  5. Test the system:

    # Make a tracked AI request
    curl -X POST http://localhost:3000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -H "X-API-Key: test-key" \
      -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Hello!"}]}'
    
    # Check usage
    curl "http://localhost:3000/usage?apiKey=test-key"
    
    # View dashboard (`open` is macOS-specific; on Linux, use xdg-open)
    open http://localhost:3000/usage/dashboard

How It Works

Cost Tracking Middleware

The cost-tracker.ts middleware intercepts API requests and:

  1. Extracts Usage Data: Captures model, input/output tokens from requests and responses
  2. Calculates Costs: Uses the pricing table to estimate costs per request
  3. Stores Records: Saves usage data to DynamoDB (or in-memory for local dev)
  4. Enforces Budgets: Blocks requests when monthly limits are exceeded
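The four steps above can be sketched without Express as a single function over an in-memory store. This is a minimal illustration, not the contents of cost-tracker.ts: the names UsageRecord, BudgetConfig, InMemoryUsageStore, budgetStatus, and trackRequest are hypothetical, as is the assumption that timestamps are ISO 8601 strings and that the warning threshold is a fraction of the monthly limit.

```typescript
// Hypothetical record shape; the real schema may differ.
interface UsageRecord {
  apiKey: string;
  timestamp: string; // assumed ISO 8601, e.g. "2024-05-01T12:00:00.000Z"
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUsd: number;
}

// Assumed budget settings: a hard monthly limit plus a warning ratio (0..1).
interface BudgetConfig {
  monthlyLimitUsd: number;
  warningRatio: number;
}

type BudgetStatus = "ok" | "warning" | "blocked";

// Stand-in for the in-memory fallback; nothing here is persisted.
class InMemoryUsageStore {
  private records: UsageRecord[] = [];

  add(record: UsageRecord): void {
    this.records.push(record);
  }

  // Sum the cost of all records for one key in one "YYYY-MM" month.
  monthlySpend(apiKey: string, month: string): number {
    return this.records
      .filter((r) => r.apiKey === apiKey && r.timestamp.startsWith(month))
      .reduce((sum, r) => sum + r.costUsd, 0);
  }
}

function budgetStatus(spentUsd: number, cfg: BudgetConfig): BudgetStatus {
  if (spentUsd >= cfg.monthlyLimitUsd) return "blocked";
  if (spentUsd >= cfg.monthlyLimitUsd * cfg.warningRatio) return "warning";
  return "ok";
}

// Reject the request if the key is already over budget; otherwise store the
// record and report the budget status after this request is counted.
function trackRequest(
  store: InMemoryUsageStore,
  cfg: BudgetConfig,
  record: UsageRecord
): { allowed: boolean; status: BudgetStatus } {
  const month = record.timestamp.slice(0, 7); // "YYYY-MM"
  const before = budgetStatus(store.monthlySpend(record.apiKey, month), cfg);
  if (before === "blocked") {
    return { allowed: false, status: "blocked" };
  }
  store.add(record);
  const after = budgetStatus(store.monthlySpend(record.apiKey, month), cfg);
  return { allowed: true, status: after };
}
```

In this sketch the request that crosses the limit is still served and recorded; only subsequent requests are rejected. Checking the projected spend before serving would be a stricter alternative.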

Pricing Engine

The pricing.ts module contains cost-per-token data for popular AI models:

  • OpenAI GPT-4, GPT-4 Turbo, GPT-3.5 Turbo
  • Anthropic Claude 3 (Opus, Sonnet, Haiku)
  • Automatic fallback pricing for unknown models
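A pricing lookup with a fallback might look like the sketch below. It is an illustration under assumptions, not the actual pricing.ts: the names ModelPricing, PRICING, FALLBACK_PRICING, and estimateCost are hypothetical, prices are expressed per 1,000 tokens for readability, and the numbers are placeholders rather than current provider prices.

```typescript
// Hypothetical price entry: USD per 1,000 input and output tokens.
interface ModelPricing {
  inputPer1K: number;
  outputPer1K: number;
}

// Placeholder values for illustration only; check each provider's price list.
const PRICING = new Map<string, ModelPricing>([
  ["gpt-4", { inputPer1K: 0.03, outputPer1K: 0.06 }],
  ["gpt-3.5-turbo", { inputPer1K: 0.0005, outputPer1K: 0.0015 }],
  ["claude-3-haiku", { inputPer1K: 0.00025, outputPer1K: 0.00125 }],
]);

// Used when the model name is not in the table (assumed fallback values).
const FALLBACK_PRICING: ModelPricing = { inputPer1K: 0.01, outputPer1K: 0.03 };

// Estimate the cost of one request in USD from its token counts.
function estimateCost(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const price = PRICING.get(model) ?? FALLBACK_PRICING;
  return (
    (inputTokens / 1000) * price.inputPer1K +
    (outputTokens / 1000) * price.outputPer1K
  );
}
```

A Map is used instead of a plain object so that model names such as "constructor" cannot accidentally resolve to inherited object properties.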

API Endpoints

  • POST /v1/chat/completions - Proxy endpoint that tracks usage
  • GET /usage?apiKey=xxx - Returns JSON usage data for an API key
  • GET /usage/dashboard - Interactive HTML dashboard

Infrastructure

The CDK stack creates:

  • DynamoDB table with partition key (apiKey) and sort key (timestamp)
  • Global Secondary Index for efficient monthly queries
  • IAM role with appropriate permissions
  • CloudFormation outputs for easy reference
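To show how the key schema supports per-month lookups, the sketch below builds the parameters of a DynamoDB Query against the base table. Several things here are assumptions: the function and type names are hypothetical, timestamps are assumed to be ISO 8601 strings in UTC (so a "YYYY-MM" prefix selects a month), and the table name is passed in by the caller. The key schema of the Global Secondary Index is not described in this README, so the sketch does not use it.

```typescript
// Plain-object shape matching the fields of a DynamoDB Query request
// when values are passed as native JavaScript types (document client style).
interface MonthlyUsageQuery {
  TableName: string;
  KeyConditionExpression: string;
  ExpressionAttributeNames: Record<string, string>;
  ExpressionAttributeValues: Record<string, string>;
}

// Build a query for all usage records of one API key in one month.
// `month` must be "YYYY-MM"; timestamps are assumed to be ISO 8601 strings.
function buildMonthlyUsageQuery(
  tableName: string,
  apiKey: string,
  month: string
): MonthlyUsageQuery {
  if (!/^\d{4}-(0[1-9]|1[0-2])$/.test(month)) {
    throw new Error(`month must be formatted as YYYY-MM, got "${month}"`);
  }
  return {
    TableName: tableName,
    // The sort key is referenced through the placeholder #ts because
    // TIMESTAMP is a DynamoDB reserved word.
    KeyConditionExpression: "apiKey = :apiKey AND begins_with(#ts, :month)",
    ExpressionAttributeNames: { "#ts": "timestamp" },
    ExpressionAttributeValues: { ":apiKey": apiKey, ":month": month },
  };
}
```

If timestamps are stored as numbers (for example epoch milliseconds), begins_with does not apply and a BETWEEN condition on the sort key would be needed instead.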

Limitations

  • Mock AI responses (production would proxy to real providers)
  • Basic in-memory storage fallback (no persistence without DynamoDB)
  • Simple budget enforcement (could add more sophisticated rate limiting)
  • Single-region deployment
  • No user authentication or API key validation

Next Steps

  • Add Real AI Integration: Replace mock responses with actual OpenAI/Anthropic API calls
  • Enhanced Authentication: Implement proper API key management and validation
  • Advanced Analytics: Add cost trends, usage patterns, and detailed reporting
  • Multi-Region Support: Replicate data across regions for global deployments
  • Alerting: Send notifications when budgets approach limits
  • Rate Limiting: Add request rate limits in addition to cost limits
  • Data Export: Add CSV/JSON export functionality for usage data