
Commit 4734d99

feat: integrate ai-cost-optimizer with unified vendor configuration
- Analyzed ai-cost-optimizer repository from ScientiaCapital
- Created comprehensive integration plan (INTEGRATION_PLAN_AI_COST_OPTIMIZER.md)
- Updated .env.example with unified 3-tier cost optimization strategy
- Aligned vendors: Gemini Flash (free), Claude Haiku (mid), RunPod (premium)
- Projected 60-65% cost savings ($30/month reduction)
- Added Supabase migration strategy for cost tracking
- Documented complete implementation roadmap (6 phases, 14-21 hours)
- Includes architecture diagrams, API design, and success criteria

Integration Strategy:
- Tier 1 (70% traffic): Google Gemini Flash - FREE
- Tier 2 (25% traffic): Claude Haiku / OpenRouter - $0.13/day
- Tier 3 (5% traffic): RunPod Chinese LLMs - $0.05/day
- Expected monthly cost: $15-20 vs current $45-50

Next steps: Install dependencies and begin Phase 1 implementation
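The three-tier split in the commit message could be routed roughly as follows. This is a minimal sketch under stated assumptions, not code from ai-cost-optimizer: `pickTier`, `estimateTokens`, the whitespace-based token estimate, and the `needsSpecializedModel` flag are hypothetical. Only the tier names (`auto | free | mid | premium`) and the default threshold of 100 tokens come from the `.env.example` in this commit.

```typescript
// Hypothetical sketch of tier selection; names are illustrative, not the project's API.
type Tier = "free" | "mid" | "premium";

interface RoutingConfig {
  defaultTier: Tier | "auto";  // mirrors COST_OPTIMIZER_DEFAULT_TIER
  complexityThreshold: number; // mirrors COST_OPTIMIZER_COMPLEXITY_THRESHOLD
}

// Crude token estimate: count whitespace-separated words.
// A real tokenizer would give different numbers; this is an assumption.
function estimateTokens(prompt: string): number {
  const trimmed = prompt.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}

function pickTier(
  prompt: string,
  config: RoutingConfig,
  needsSpecializedModel = false, // hypothetical flag for Tier 3 requests
): Tier {
  // An explicit tier in the config overrides automatic routing.
  if (config.defaultTier !== "auto") return config.defaultTier;
  if (needsSpecializedModel) return "premium";
  // Short prompts go to the free tier, everything else to the mid tier.
  return estimateTokens(prompt) < config.complexityThreshold ? "free" : "mid";
}

const config: RoutingConfig = { defaultTier: "auto", complexityThreshold: 100 };
console.log(pickTier("What is a closure?", config)); // prints "free"
```

A length threshold is only a proxy for complexity, so the 70/25/5 traffic split stated in the commit would depend on the actual prompt mix rather than follow from this rule.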
1 parent aa1c271 commit 4734d99

2 files changed

Lines changed: 632 additions & 16 deletions

.env.example

Lines changed: 146 additions & 16 deletions
@@ -1,16 +1,146 @@
-# API Keys (Required to enable respective provider)
-ANTHROPIC_API_KEY="your_anthropic_api_key_here" # Required: Format: sk-ant-api03-...
-PERPLEXITY_API_KEY="your_perplexity_api_key_here" # Optional: Format: pplx-...
-OPENAI_API_KEY="your_openai_api_key_here" # Optional, for OpenAI models. Format: sk-proj-...
-GOOGLE_API_KEY="your_google_api_key_here" # Optional, for Google Gemini models.
-MISTRAL_API_KEY="your_mistral_key_here" # Optional, for Mistral AI models.
-XAI_API_KEY="YOUR_XAI_KEY_HERE" # Optional, for xAI AI models.
-GROQ_API_KEY="YOUR_GROQ_KEY_HERE" # Optional, for Groq models.
-OPENROUTER_API_KEY="YOUR_OPENROUTER_KEY_HERE" # Optional, for OpenRouter models.
-AZURE_OPENAI_API_KEY="your_azure_key_here" # Optional, for Azure OpenAI models (requires endpoint in .taskmaster/config.json).
-OLLAMA_API_KEY="your_ollama_api_key_here" # Optional: For remote Ollama servers that require authentication.
-GITHUB_API_KEY="your_github_api_key_here" # Optional: For GitHub import/export features. Format: ghp_... or github_pat_...
-
-# Supabase Configuration (Required for authentication)
-NEXT_PUBLIC_SUPABASE_URL="your_supabase_project_url_here" # Required: Your Supabase project URL
-NEXT_PUBLIC_SUPABASE_ANON_KEY="your_supabase_anon_key_here" # Required: Your Supabase anonymous key
+# ============================================================================
+# AI DEVELOPMENT COCKPIT - UNIFIED CONFIGURATION
+# Integrates: AI Cost Optimizer + LLM Platform + Dual-Domain System
+# ============================================================================
+
+# ----------------------------------------------------------------------------
+# CORE LLM PROVIDERS (Cost Optimization Tiers)
+# ----------------------------------------------------------------------------
+
+# Tier 1: Free/Ultra-Low-Cost (Simple Queries - 70% of traffic)
+GOOGLE_API_KEY="your_google_api_key_here" # Required: Gemini Flash (free tier)
+# Get key: https://makersuite.google.com/app/apikey
+
+# Tier 2: Mid-Cost (Complex Queries - 25% of traffic)
+ANTHROPIC_API_KEY="your_anthropic_api_key_here" # Required: Claude Haiku/Sonnet
+# Format: sk-ant-api03-...
+OPENROUTER_API_KEY="YOUR_OPENROUTER_KEY_HERE" # Required: Multi-model fallback (40+ models)
+# Get key: https://openrouter.ai/keys
+
+# Tier 3: Premium (Specialized Chinese LLMs - 5% of traffic)
+RUNPOD_API_KEY="your_runpod_api_key_here" # Required: Qwen, DeepSeek, ChatGLM
+# Get key: https://runpod.io/console/user/settings
+HUGGINGFACE_API_KEY="your_huggingface_api_key_here" # Required: Model discovery
+# Get key: https://huggingface.co/settings/tokens
+
+# Experimental/Optional
+CEREBRAS_API_KEY="your_cerebras_api_key_here" # Optional: Ultra-fast inference (experimental)
+# Get key: https://cloud.cerebras.ai/
+
+# ----------------------------------------------------------------------------
+# ADDITIONAL LLM PROVIDERS (Optional - for research/fallback)
+# ----------------------------------------------------------------------------
+PERPLEXITY_API_KEY="your_perplexity_api_key_here" # Optional: Research mode with web search
+# Format: pplx-...
+OPENAI_API_KEY="your_openai_api_key_here" # Optional: GPT-4o, GPT-4o-mini
+# Format: sk-proj-...
+MISTRAL_API_KEY="your_mistral_key_here" # Optional: Mistral models
+GROQ_API_KEY="YOUR_GROQ_KEY_HERE" # Optional: Groq ultra-fast inference
+XAI_API_KEY="YOUR_XAI_KEY_HERE" # Optional: xAI Grok models
+OLLAMA_API_KEY="your_ollama_api_key_here" # Optional: Local Ollama authentication
+AZURE_OPENAI_API_KEY="your_azure_key_here" # Optional: Azure OpenAI (requires endpoint)
+
+# ----------------------------------------------------------------------------
+# AUTHENTICATION & DATABASE
+# ----------------------------------------------------------------------------
+NEXT_PUBLIC_SUPABASE_URL="your_supabase_project_url_here" # Required: Supabase project URL
+# Format: https://xxxxx.supabase.co
+NEXT_PUBLIC_SUPABASE_ANON_KEY="your_supabase_anon_key_here" # Required: Supabase anonymous key
+SUPABASE_SERVICE_ROLE_KEY="your_service_role_key_here" # Required: Admin operations & migrations
+
+# ----------------------------------------------------------------------------
+# COST OPTIMIZER CONFIGURATION (ai-cost-optimizer integration)
+# ----------------------------------------------------------------------------
+COST_OPTIMIZER_ENABLED="true" # Enable intelligent routing (true/false)
+COST_OPTIMIZER_DEFAULT_TIER="auto" # Routing strategy: auto | free | mid | premium
+COST_OPTIMIZER_COMPLEXITY_THRESHOLD="100" # Token count for simple vs complex (default: 100)
+COST_OPTIMIZER_DAILY_BUDGET="5.00" # Daily budget in USD (alerts when exceeded)
+COST_OPTIMIZER_MONTHLY_BUDGET="50.00" # Monthly budget in USD (hard limit)
+COST_OPTIMIZER_API_URL="http://localhost:3001" # Internal API endpoint
+COST_OPTIMIZER_SAVINGS_TARGET="0.60" # Target 60% cost reduction
+
+# ----------------------------------------------------------------------------
+# RUNPOD CONFIGURATION (Serverless Chinese LLMs)
+# ----------------------------------------------------------------------------
+RUNPOD_API_ENDPOINT="https://api.runpod.io/v2" # RunPod API base URL
+RUNPOD_WORKSPACE_ID="your_workspace_id_here" # RunPod workspace identifier
+NEXT_PUBLIC_RUNPOD_API_KEY="your_runpod_api_key_here" # Client-side RunPod key (public)
+
+# RunPod vLLM Configuration
+RUNPOD_VLLM_ENDPOINT="your_vllm_endpoint_here" # vLLM serverless endpoint ID
+RUNPOD_DEFAULT_GPU="NVIDIA_A100" # GPU type for deployments
+RUNPOD_MAX_WORKERS="3" # Max concurrent workers
+
+# ----------------------------------------------------------------------------
+# MONITORING & ANALYTICS
+# ----------------------------------------------------------------------------
+PROMETHEUS_PORT="9090" # Prometheus metrics port
+GRAFANA_PORT="3000" # Grafana dashboard port
+LOG_LEVEL="INFO" # Logging level: DEBUG | INFO | WARN | ERROR
+ENABLE_COST_ALERTS="true" # Enable cost threshold alerts (email/webhook)
+COST_ALERT_WEBHOOK="your_webhook_url_here" # Optional: Webhook for cost alerts
+
+# OpenTelemetry Configuration
+OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318" # OpenTelemetry collector endpoint
+OTEL_SERVICE_NAME="ai-development-cockpit" # Service name for traces
+
+# ----------------------------------------------------------------------------
+# ORGANIZATION CONFIGURATION (Dual-Domain Support)
+# ----------------------------------------------------------------------------
+SWAGGYSTACKS_ORG_ID="swaggystacks" # SwaggyStacks organization (developer theme)
+SCIENTIA_ORG_ID="scientia-capital" # Scientia Capital organization (enterprise theme)
+
+# Organization-specific cost budgets
+SWAGGYSTACKS_DAILY_BUDGET="2.00" # SwaggyStacks daily budget
+SCIENTIA_DAILY_BUDGET="10.00" # Scientia Capital daily budget (higher tier)
+
+# ----------------------------------------------------------------------------
+# REDIS CACHE CONFIGURATION (Optional but recommended)
+# ----------------------------------------------------------------------------
+REDIS_URL="redis://localhost:6379" # Redis connection URL
+REDIS_PASSWORD="your_redis_password_here" # Redis password (if required)
+REDIS_CACHE_TTL="3600" # Cache TTL in seconds (1 hour)
+
+# ----------------------------------------------------------------------------
+# OPTIONAL: DEVELOPMENT & INTEGRATION TOOLS
+# ----------------------------------------------------------------------------
+GITHUB_API_KEY="your_github_api_key_here" # Optional: GitHub integration & deployments
+# Format: ghp_... or github_pat_...
+
+# Next.js Configuration
+NEXT_PUBLIC_APP_URL="http://localhost:3001" # Public app URL
+NODE_ENV="development" # Environment: development | production | test
+
+# Feature Flags
+ENABLE_PWA="true" # Enable Progressive Web App features
+ENABLE_WEBSOCKETS="true" # Enable real-time updates via WebSockets
+ENABLE_CHINESE_LLMS="true" # Enable Chinese LLM support (Qwen, DeepSeek, etc.)
+ENABLE_MARKETPLACE="true" # Enable model marketplace
+
+# ============================================================================
+# NOTES:
+# ============================================================================
+# 1. Cost Optimization Strategy:
+# - 70% of queries route to Gemini Flash (FREE)
+# - 25% route to Claude Haiku (~$0.13/day)
+# - 5% route to RunPod Chinese LLMs (~$0.05/day)
+# - Expected monthly cost: $15-20 (vs $45-50 without optimization)
+#
+# 2. Required API Keys (Minimum to start):
+# - GOOGLE_API_KEY (free tier available)
+# - ANTHROPIC_API_KEY (required for complex queries)
+# - OPENROUTER_API_KEY (fallback routing)
+# - NEXT_PUBLIC_SUPABASE_URL + ANON_KEY (authentication)
+#
+# 3. Optional but Recommended:
+# - RUNPOD_API_KEY (for Chinese LLM support)
+# - REDIS_URL (for caching and performance)
+# - COST_ALERT_WEBHOOK (for budget monitoring)
+#
+# 4. Security Best Practices:
+# - Never commit this file with real keys
+# - Use .env.local for actual credentials (gitignored)
+# - Rotate keys regularly
+# - Use service role keys only server-side
+#
+# ============================================================================
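Note 2 in the NOTES block lists the minimum keys needed to start. A startup check along those lines might look like the sketch below. It is an illustration only: `REQUIRED_KEYS`, `isPlaceholder`, and `missingRequiredKeys` are hypothetical names, and treating `your_..._here` template values as missing is an assumption rather than behavior this commit defines.

```typescript
// Hypothetical startup check; the key list follows note 2 of the NOTES block.
const REQUIRED_KEYS = [
  "GOOGLE_API_KEY",
  "ANTHROPIC_API_KEY",
  "OPENROUTER_API_KEY",
  "NEXT_PUBLIC_SUPABASE_URL",
  "NEXT_PUBLIC_SUPABASE_ANON_KEY",
] as const;

type Env = Record<string, string | undefined>;

// Values such as "your_google_api_key_here" are template placeholders, not real keys.
function isPlaceholder(value: string): boolean {
  return /^your_.*_here$/i.test(value.trim());
}

// Returns the required keys that are unset, empty, or still a placeholder.
function missingRequiredKeys(env: Env): string[] {
  return REQUIRED_KEYS.filter((key) => {
    const value = env[key];
    return value === undefined || value.trim() === "" || isPlaceholder(value);
  });
}

const missing = missingRequiredKeys({
  GOOGLE_API_KEY: "your_google_api_key_here", // placeholder, counts as missing
  ANTHROPIC_API_KEY: "sk-ant-api03-example",  // made-up value for illustration
});
console.log(missing);
```

In a Node.js process the same function could be called with `process.env`, failing fast when the returned list is non-empty.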
