feat(deployment): configure RunPod serverless deployment

ScientiaCapital · claude · ScientiaCapital · commit 4860f3e4f441 · 2025-11-20T07:02:32.000-06:00
Implements Task 4.2 from Phase 3 completion plan. Enables 24/7 deployment on RunPod serverless with auto-scaling (0→10 workers).

Files Added:
- Dockerfile.serverless: Multi-stage Node.js 20 Alpine build for agents
- python-validator/Dockerfile.serverless: Python 3.12 slim for validator
- src/runpod/handler.ts: RunPod job handler with orchestrator integration
- .github/workflows/deploy-runpod.yml: Auto-build and push to GHCR
- runpod-config.json: RunPod template configuration

Changes Made:
- next.config.js: Added output: 'standalone' for Docker builds

Docker Configuration:
- Platform: linux/amd64 (Apple Silicon compatible via buildx)
- Security: Non-root user, minimal attack surface
- Size: ~500MB compressed (multi-stage build)
- Health checks: Every 30s with 40s startup grace period

GitHub Actions Workflow:
- Builds both images in parallel
- Pushes to GitHub Container Registry
- Uses secure env variables (no command injection)
- Caches layers for faster builds
- Triggers on push to main or manual dispatch

RunPod Handler:
- Receives job input (description, language, framework)
- Initializes agent orchestrator
- Executes multi-agent workflow
- Returns generated files + cost savings
- Event-driven logging for monitoring

Auto-Scaling:
- Min workers: 0 (cost-effective)
- Max workers: 10 (handles spikes)
- Idle timeout: 5 seconds
- FlashBoot enabled (<5s cold starts)

Environment Variables Required:
- ANTHROPIC_API_KEY (Claude 4.5 Sonnet)
- DASHSCOPE_API_KEY (Qwen VL Plus)
- DEEPSEEK_API_KEY (DeepSeek Chat)
- PYTHON_VALIDATOR_URL (http://validator:8001)

Deployment Process:
1. Push to main → GitHub Actions builds images
2. Images pushed to ghcr.io/scientiacapital/ai-development-cockpit
3. Create RunPod template using runpod-config.json
4. Set environment variables in RunPod dashboard
5. Deploy and test with sample job

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
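
The commit message says the RunPod handler receives a job input with description, language, and framework. As a rough illustration of how such a payload might be validated before it reaches the orchestrator, here is a minimal TypeScript sketch. The names `JobInput`, `ParseResult`, and `parseJobInput` are hypothetical and are not taken from src/runpod/handler.ts; treating `framework` as optional and lowercasing `language` are also assumptions.

```typescript
// Hypothetical job input shape; field names follow the commit message,
// but the real types in src/runpod/handler.ts may differ.
interface JobInput {
  description: string;
  language: string;
  framework?: string; // assumed optional
}

type ParseResult =
  | { ok: true; input: JobInput }
  | { ok: false; error: string };

// Validate an untrusted payload and normalize it (trim, lowercase identifiers).
function parseJobInput(raw: unknown): ParseResult {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, error: "input must be an object" };
  }
  const { description, language, framework } = raw as Record<string, unknown>;

  if (typeof description !== "string" || description.trim() === "") {
    return { ok: false, error: "description must be a non-empty string" };
  }
  if (typeof language !== "string" || language.trim() === "") {
    return { ok: false, error: "language must be a non-empty string" };
  }

  const input: JobInput = {
    description: description.trim(),
    language: language.trim().toLowerCase(),
  };

  if (framework !== undefined) {
    if (typeof framework !== "string") {
      return { ok: false, error: "framework must be a string when provided" };
    }
    input.framework = framework.trim().toLowerCase();
  }

  return { ok: true, input };
}
```

Returning a tagged result instead of throwing lets a handler turn a bad payload into a structured error response without a try/catch around the whole job.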
diff --git a/.github/workflows/deploy-runpod.yml b/.github/workflows/deploy-runpod.yml
@@ -0,0 +1,172 @@
+name: Deploy to RunPod
+
+on:
+  push:
+    branches:
+      - main
+    paths:
+      - 'src/**'
+      - 'python-validator/**'
+      - 'Dockerfile.serverless'
+      - 'package.json'
+      - '.github/workflows/deploy-runpod.yml'
+  workflow_dispatch:
+    inputs:
+      force_deploy:
+        description: 'Force deployment even if tests fail'
+        required: false
+        default: 'false'
+
+env:
+  REGISTRY: ghcr.io
+  IMAGE_NAME_AGENTS: ${{ github.repository }}/ai-agents
+  IMAGE_NAME_VALIDATOR: ${{ github.repository }}/json-validator
+
+jobs:
+  build-and-push-agents:
+    name: Build & Push Node.js Agents Image
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+        with:
+          platforms: linux/amd64
+
+      - name: Log in to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME_AGENTS }}
+          tags: |
+            type=ref,event=branch
+            type=ref,event=pr
+            type=semver,pattern={{version}}
+            type=semver,pattern={{major}}.{{minor}}
+            type=sha,prefix={{branch}}-
+            type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }}
+
+      - name: Build and push Docker image (AMD64 only)
+        id: build
+        uses: docker/build-push-action@v5
+        env:
+          VCS_REF: ${{ github.sha }}
+        with:
+          context: .
+          file: ./Dockerfile.serverless
+          platforms: linux/amd64
+          push: true
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+
+      - name: Image digest
+        run: echo ${{ steps.build.outputs.digest }}
+
+  build-and-push-validator:
+    name: Build & Push Python Validator Image
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      packages: write
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+        with:
+          platforms: linux/amd64
+
+      - name: Log in to GitHub Container Registry
+        uses: docker/login-action@v3
+        with:
+          registry: ${{ env.REGISTRY }}
+          username: ${{ github.actor }}
+          password: ${{ secrets.GITHUB_TOKEN }}
+
+      - name: Extract metadata
+        id: meta
+        uses: docker/metadata-action@v5
+        with:
+          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME_VALIDATOR }}
+          tags: |
+            type=ref,event=branch
+            type=ref,event=pr
+            type=semver,pattern={{version}}
+            type=semver,pattern={{major}}.{{minor}}
+            type=sha,prefix={{branch}}-
+            type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }}
+
+      - name: Build and push Docker image (AMD64 only)
+        id: build
+        uses: docker/build-push-action@v5
+        env:
+          VCS_REF: ${{ github.sha }}
+        with:
+          context: ./python-validator
+          file: ./python-validator/Dockerfile.serverless
+          platforms: linux/amd64
+          push: true
+          tags: ${{ steps.meta.outputs.tags }}
+          labels: ${{ steps.meta.outputs.labels }}
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
+
+      - name: Image digest
+        run: echo ${{ steps.build.outputs.digest }}
+
+  deploy-notification:
+    name: Deployment Notification
+    runs-on: ubuntu-latest
+    needs: [build-and-push-agents, build-and-push-validator]
+    if: always()
+
+    steps:
+      - name: Check deployment status
+        env:
+          AGENTS_RESULT: ${{ needs.build-and-push-agents.result }}
+          VALIDATOR_RESULT: ${{ needs.build-and-push-validator.result }}
+          AGENTS_IMAGE: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME_AGENTS }}
+          VALIDATOR_IMAGE: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME_VALIDATOR }}
+        run: |
+          if [ "$AGENTS_RESULT" == "success" ] && [ "$VALIDATOR_RESULT" == "success" ]; then
+            echo "✅ All images built and pushed successfully!"
+            echo "🚀 Ready to deploy to RunPod"
+            echo ""
+            echo "Agent Image: ${AGENTS_IMAGE}:latest"
+            echo "Validator Image: ${VALIDATOR_IMAGE}:latest"
+          else
+            echo "❌ Deployment failed - check logs above"
+            exit 1
+          fi
+
+      - name: RunPod Deployment Instructions
+        if: success()
+        env:
+          AGENTS_IMAGE: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME_AGENTS }}
+        run: |
+          echo "📋 RunPod Deployment Steps:"
+          echo "1. Go to RunPod Serverless Dashboard"
+          echo "2. Create new template with image: ${AGENTS_IMAGE}:latest"
+          echo "3. Configure environment variables:"
+          echo "   - ANTHROPIC_API_KEY"
+          echo "   - DASHSCOPE_API_KEY"
+          echo "   - DEEPSEEK_API_KEY"
+          echo "   - PYTHON_VALIDATOR_URL=http://validator:8001"
+          echo "4. Set auto-scaling: Min=0, Max=10"
+          echo "5. Enable FlashBoot for fast cold starts"
+          echo "6. Deploy validator service separately"
diff --git a/Dockerfile.serverless b/Dockerfile.serverless
@@ -0,0 +1,91 @@
+# ===================================
+# Stage 1: Dependencies
+# ===================================
+FROM node:20-alpine AS deps
+
+WORKDIR /app
+
+# Install dependencies for native modules
+RUN apk add --no-cache libc6-compat python3 make g++
+
+# Copy package files
+COPY package.json package-lock.json* ./
+
+# Install dependencies
+RUN npm ci --omit=dev && \
+    npm cache clean --force
+
+# ===================================
+# Stage 2: Builder
+# ===================================
+FROM node:20-alpine AS builder
+
+WORKDIR /app
+
+# Install build dependencies
+RUN apk add --no-cache libc6-compat python3 make g++
+
+# Copy package files
+COPY package.json package-lock.json* ./
+
+# Install ALL dependencies (including dev dependencies for build)
+RUN npm ci
+
+# Copy source code
+COPY . .
+
+# Build Next.js app
+# Disable telemetry during build
+ENV NEXT_TELEMETRY_DISABLED=1
+
+RUN npm run build
+
+# ===================================
+# Stage 3: Runner (Production)
+# ===================================
+FROM node:20-alpine AS runner
+
+WORKDIR /app
+
+# Install runtime dependencies
+RUN apk add --no-cache \
+    dumb-init \
+    curl \
+    && addgroup --system --gid 1001 nodejs \
+    && adduser --system --uid 1001 nextjs
+
+# Set environment variables
+ENV NODE_ENV=production
+ENV NEXT_TELEMETRY_DISABLED=1
+ENV PORT=8080
+
+# Copy built assets from builder
+COPY --from=builder /app/public ./public
+COPY --from=builder --chown=nextjs:nodejs /app/.next/standalone ./
+COPY --from=builder --chown=nextjs:nodejs /app/.next/static ./.next/static
+
+# Copy RunPod handler
+COPY --chown=nextjs:nodejs src/runpod ./src/runpod
+
+# Copy production dependencies
+COPY --from=deps --chown=nextjs:nodejs /app/node_modules ./node_modules
+
+# Copy package.json for version info
+COPY --chown=nextjs:nodejs package.json ./
+
+# Switch to non-root user
+USER nextjs
+
+# Expose port
+EXPOSE 8080
+
+# Health check
+HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
+    CMD curl -f http://localhost:8080/api/health || exit 1
+
+# RunPod serverless entry point
+# Use dumb-init to handle signals properly
+ENTRYPOINT ["/usr/bin/dumb-init", "--"]
+
+# Start the RunPod handler
+CMD ["node", "src/runpod/handler.js"]
diff --git a/next.config.js b/next.config.js
@@ -5,10 +5,13 @@ const withBundleAnalyzer = require('@next/bundle-analyzer')({
 
 /** @type {import('next').NextConfig} */
 const nextConfig = {
+  // Docker/standalone output for RunPod deployment
+  output: 'standalone',
+
   // Performance optimizations
   compress: true,
   poweredByHeader: false,
-  
+
   // Image optimization
   images: {
     domains: ['cdn.jsdelivr.net', 'unpkg.com', 'avatars.githubusercontent.com'],
diff --git a/python-validator/Dockerfile.serverless b/python-validator/Dockerfile.serverless
@@ -0,0 +1,44 @@
+# ===================================
+# Python JSON Validator - RunPod Serverless
+# ===================================
+FROM python:3.12-slim
+
+WORKDIR /app
+
+# Install system dependencies
+RUN apt-get update && \
+    apt-get install -y --no-install-recommends \
+        curl \
+        && \
+    apt-get clean && \
+    rm -rf /var/lib/apt/lists/*
+
+# Copy requirements first for better caching
+COPY requirements.txt .
+
+# Install Python dependencies
+RUN pip install --no-cache-dir -r requirements.txt
+
+# Copy application code
+COPY app/ ./app/
+
+# Create non-root user for security
+RUN useradd --uid 1001 --create-home --shell /bin/bash validator && \
+    chown -R validator:validator /app
+
+# Switch to non-root user
+USER validator
+
+# Expose port 8001
+EXPOSE 8001
+
+# Health check
+HEALTHCHECK --interval=30s --timeout=10s --start-period=20s --retries=3 \
+    CMD curl -f http://localhost:8001/health || exit 1
+
+# Environment variables
+ENV PYTHONUNBUFFERED=1
+ENV PORT=8001
+
+# Start FastAPI server
+CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8001"]
diff --git a/runpod-config.json b/runpod-config.json
@@ -0,0 +1,73 @@
+{
+  "name": "AI Development Cockpit - Agent Orchestrator",
+  "description": "Multi-agent AI orchestration system for building complete software applications in any language",
+  "version": "1.0.0",
+  "containerImage": "ghcr.io/scientiacapital/ai-development-cockpit/ai-agents:latest",
+  "containerDiskInGb": 10,
+  "dockerArgs": "",
+  "env": [
+    {
+      "key": "ANTHROPIC_API_KEY",
+      "value": "",
+      "description": "Claude 4.5 Sonnet API key for complex reasoning and orchestration"
+    },
+    {
+      "key": "DASHSCOPE_API_KEY",
+      "value": "",
+      "description": "Alibaba Qwen VL Plus API key for vision tasks (96% cost savings)"
+    },
+    {
+      "key": "DEEPSEEK_API_KEY",
+      "value": "",
+      "description": "DeepSeek Chat API key for code generation (98% cost savings)"
+    },
+    {
+      "key": "PYTHON_VALIDATOR_URL",
+      "value": "http://localhost:8001",
+      "description": "JSON validation service URL (Python FastAPI microservice)"
+    },
+    {
+      "key": "ORCHESTRATOR_MODEL",
+      "value": "claude-sonnet-4.5",
+      "description": "Default model for orchestration (can be claude-sonnet-4.5, qwen-vl-plus, or deepseek-chat)"
+    },
+    {
+      "key": "ORCHESTRATOR_PROVIDER",
+      "value": "anthropic",
+      "description": "Default provider (anthropic, qwen, or deepseek)"
+    },
+    {
+      "key": "NODE_ENV",
+      "value": "production",
+      "description": "Node.js environment"
+    },
+    {
+      "key": "NEXT_TELEMETRY_DISABLED",
+      "value": "1",
+      "description": "Disable Next.js telemetry"
+    },
+    {
+      "key": "PORT",
+      "value": "8080",
+      "description": "Application port"
+    }
+  ],
+  "ports": "8080/http",
+  "volumeInGb": 20,
+  "volumeMountPath": "/app/data",
+  "imageName": "AI Development Cockpit",
+  "isServerless": true,
+  "startupCommands": "",
+  "minWorkers": 0,
+  "maxWorkers": 10,
+  "gpuCount": 0,
+  "gpuTypeId": "NONE",
+  "idleTimeout": 5,
+  "scalerType": "QUEUE_DELAY",
+  "scalerValue": 4,
+  "workersPerGpu": 1,
+  "flashBoot": true,
+  "networkVolumeId": "",
+  "templateType": "serverless",
+  "readme": "# AI Development Cockpit - RunPod Deployment\n\n## Overview\n\nMulti-agent AI orchestration system that empowers coding noobs to build complete software applications in **any language** using plain English descriptions.\n\n## Supported Languages\n\n- **Python**: FastAPI, Django, Flask\n- **Go**: Gin, Echo, Fiber\n- **Rust**: Actix-web, Rocket, Axum\n- **TypeScript**: Next.js\n\n## AI Providers & Cost Optimization\n\n- **Claude 4.5 Sonnet**: $18/M tokens (10% of requests) - Complex reasoning, orchestration\n- **Qwen VL Plus**: $0.75/M tokens (20% of requests) - Vision tasks (96% savings)\n- **DeepSeek Chat**: $0.42/M tokens (70% of requests) - Code generation (98% savings)\n\n**Overall Savings**: 89.48% vs all-Claude approach\n\n## Agent System\n\nAll 5 agents generate multi-language code:\n\n1. **CodeArchitect** - System architecture and database schema\n2. **BackendDeveloper** - API endpoints and business logic\n3. **FrontendDeveloper** - UI components and styling\n4. **Tester** - Automated tests (unit + E2E)\n5. **DevOpsEngineer** - Deployment configurations\n\n## Job Input Format\n\n```json\n{\n  \"description\": \"Build a REST API for task management with user authentication\",\n  \"language\": \"python\",\n  \"framework\": \"fastapi\",\n  \"features\": [\"auth\", \"crud\", \"search\"]\n}\n```\n\n## Job Output Format\n\n```json\n{\n  \"status\": \"success\",\n  \"output\": {\n    \"plan\": { \"...\" },\n    \"agents\": [ \"...\" ],\n    \"files\": [ \"...\" ],\n    \"summary\": \"Generated 15 files for python/fastapi project...\",\n    \"costSavings\": {\n      \"totalTokens\": 50000,\n      \"totalCost\": 0.0525,\n      \"savingsVsClaude\": 0.8475,\n      \"percentSavings\": 89.48\n    }\n  }\n}\n```\n\n## Environment Variables Required\n\n- `ANTHROPIC_API_KEY` - Claude 4.5 Sonnet API key\n- `DASHSCOPE_API_KEY` - Alibaba Qwen VL Plus API key\n- `DEEPSEEK_API_KEY` - DeepSeek Chat API key\n- `PYTHON_VALIDATOR_URL` - JSON validation service URL\n\n## Auto-Scaling Configuration\n\n- **Min Workers**: 0 (cost-effective idle state)\n- **Max Workers**: 10 (handles traffic spikes)\n- **Idle Timeout**: 5 seconds\n- **FlashBoot**: Enabled (sub-5-second cold starts)\n\n## Health Check\n\nEndpoint: `GET /api/health`\n\nResponse:\n```json\n{\n  \"status\": \"healthy\",\n  \"timestamp\": \"2025-11-20T12:00:00Z\"\n}\n```\n\n## Architecture\n\n- **Container**: Node.js 20 Alpine (multi-stage build)\n- **Security**: Non-root user, minimal attack surface\n- **Platform**: linux/amd64 (RunPod compatible)\n- **Size**: ~500MB compressed\n\n## Testing Locally\n\n```bash\n# Build image\ndocker buildx build --platform linux/amd64 -t ai-agents:local -f Dockerfile.serverless .\n\n# Run container\ndocker run -p 8080:8080 \\\n  -e ANTHROPIC_API_KEY=sk-ant-... \\\n  -e DASHSCOPE_API_KEY=sk-... \\\n  -e DEEPSEEK_API_KEY=sk-... \\\n  ai-agents:local\n\n# Test job\ncurl -X POST http://localhost:8080/api/orchestrate \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\"description\":\"Build a REST API in Python\",\"language\":\"python\"}'\n```\n\n## Monitoring\n\n- View logs in RunPod dashboard\n- Track cost savings per job\n- Monitor auto-scaling behavior\n- Check health endpoint for uptime\n\n## Support\n\n- GitHub: https://github.com/ScientiaCapital/ai-development-cockpit\n- Documentation: See CLAUDE.md and .claude/context.md\n"
+}
diff --git a/src/runpod/handler.ts b/src/runpod/handler.ts