
🚀 Deployment Guide - Market Firehose System

This guide covers deploying the Market Firehose System in various environments.


Table of Contents

  1. Prerequisites
  2. Local Development
  3. Docker Compose (Recommended)
  4. Production Deployment
  5. Environment Configuration
  6. Monitoring & Maintenance
  7. Troubleshooting
  8. Security Checklist

Prerequisites

Required

  • Python 3.11+
  • PostgreSQL 16+
  • Redis 7+
  • OpenAI API key

Optional

  • Docker & Docker Compose (for containerized deployment)
  • Kubernetes (for cloud deployment)

Local Development

Step 1: Clone and Setup

# Clone repository
git clone https://github.com/your-org/Market-Firehose-System.git
cd Market-Firehose-System

# Create virtual environment
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Step 2: Configure Environment

# Copy environment template
cp env.example .env

# Edit .env with your settings
nano .env  # or your preferred editor

Required settings:

DATABASE_URL=postgresql+asyncpg://postgres:postgres@localhost:5432/market_firehose
REDIS_URL=redis://localhost:6379/0
OPENAI_API_KEY=sk-your-api-key-here
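It can help to fail fast at startup if any required variable is unset. A minimal sketch (this helper is hypothetical, not part of the repo's scripts):

```python
import os

# The three variables the guide lists as required.
REQUIRED_VARS = ["DATABASE_URL", "REDIS_URL", "OPENAI_API_KEY"]

def missing_required(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling this before app startup and aborting when the returned list is non-empty turns a confusing runtime failure into a clear configuration error.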

Step 3: Start Dependencies

Option A: Docker (easiest)

docker run -d --name firehose-postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=market_firehose \
  -p 5432:5432 \
  postgres:16-alpine

docker run -d --name firehose-redis \
  -p 6379:6379 \
  redis:7-alpine

Option B: Local installation

# macOS
brew install postgresql@16 redis
brew services start postgresql@16
brew services start redis

# Ubuntu/Debian
sudo apt install postgresql-16 redis-server  # postgresql-16 may require the PGDG apt repository
sudo systemctl start postgresql redis-server

Step 4: Initialize Database

# Run database setup script
python scripts/init_db.py

Step 5: Start the Application

# Terminal 1: Start API server
uvicorn src.main:app --reload --host 0.0.0.0 --port 8000

# Terminal 2: Start worker (optional, for background processing)
python -m src.queue.worker

Step 6: Verify Installation

# Check health
curl http://localhost:8000/health

# Run demo
python scripts/demo.py

# View API docs
open http://localhost:8000/docs

Docker Compose

The easiest way to run the complete stack.

Step 1: Configure

# Copy environment file
cp env.example .env

# Set your OpenAI API key
echo "OPENAI_API_KEY=sk-your-key-here" >> .env

Step 2: Build and Start

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Check status
docker-compose ps

Step 3: Initialize Database

# Run migrations (first time only)
docker-compose exec api python scripts/init_db.py

Step 4: Access Services

Service        URL
API            http://localhost:8000
API Docs       http://localhost:8000/docs
Health Check   http://localhost:8000/health
Metrics        http://localhost:8000/api/v1/metrics/overview

Managing Docker Compose

# Stop all services
docker-compose down

# Stop and remove volumes (reset data)
docker-compose down -v

# Rebuild after code changes
docker-compose build --no-cache
docker-compose up -d

# Scale workers
docker-compose up -d --scale worker=5

# View specific logs
docker-compose logs -f api
docker-compose logs -f worker

Production Deployment

Option A: Docker Compose on a VPS

  1. Provision a server (DigitalOcean, AWS EC2, etc.)

    • Minimum: 2 CPU, 4GB RAM
    • Recommended: 4 CPU, 8GB RAM
  2. Install Docker

curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER  # log out and back in for the group change to take effect
  3. Clone and configure
git clone https://github.com/your-org/Market-Firehose-System.git
cd Market-Firehose-System
cp env.example .env
# Edit .env with production settings
  4. Production docker-compose.prod.yml
version: "3.8"
services:
  postgres:
    image: postgres:16-alpine
    restart: always
    volumes:
      - postgres_data:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}

  redis:
    image: redis:7-alpine
    restart: always
    volumes:
      - redis_data:/data

  api:
    build: .
    restart: always
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=postgresql+asyncpg://postgres:${DB_PASSWORD}@postgres:5432/market_firehose
      - REDIS_URL=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - postgres
      - redis

  worker:
    build: .
    restart: always
    command: python -m src.queue.worker
    deploy:
      replicas: 3
    environment:
      - DATABASE_URL=postgresql+asyncpg://postgres:${DB_PASSWORD}@postgres:5432/market_firehose
      - REDIS_URL=redis://redis:6379/0
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    depends_on:
      - postgres
      - redis

volumes:
  postgres_data:
  redis_data:
  5. Start with production config
docker-compose -f docker-compose.prod.yml up -d
  6. Set up reverse proxy (nginx)
server {
    listen 80;
    server_name api.yourdomain.com;

    location / {
        proxy_pass http://localhost:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}

Option B: Kubernetes Deployment

  1. Create Kubernetes manifests
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: firehose-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: firehose-api
  template:
    metadata:
      labels:
        app: firehose-api
    spec:
      containers:
        - name: api
          image: your-registry/market-firehose:latest
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: firehose-secrets
                  key: database-url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: firehose-secrets
                  key: redis-url
            - name: OPENAI_API_KEY
              valueFrom:
                secretKeyRef:
                  name: firehose-secrets
                  key: openai-api-key
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /live
              port: 8000
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 5
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: firehose-api
spec:
  selector:
    app: firehose-api
  ports:
    - port: 80
      targetPort: 8000
  type: LoadBalancer
  2. Deploy
kubectl apply -f k8s/
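The manifests above only define the API; scaling commands like `kubectl scale deployment firehose-worker` (see Scaling below) assume a matching worker Deployment exists. A minimal sketch, assuming the worker reuses the same image and secrets as the API:

```yaml
# k8s/worker-deployment.yaml (sketch; mirrors the API Deployment's wiring)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: firehose-worker
spec:
  replicas: 3
  selector:
    matchLabels:
      app: firehose-worker
  template:
    metadata:
      labels:
        app: firehose-worker
    spec:
      containers:
        - name: worker
          image: your-registry/market-firehose:latest
          command: ["python", "-m", "src.queue.worker"]
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: firehose-secrets
                  key: database-url
            # REDIS_URL and OPENAI_API_KEY follow the same secretKeyRef pattern
```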

Option C: Cloud Managed Services

AWS Architecture:

  • API: ECS Fargate or EKS
  • Database: RDS PostgreSQL
  • Cache: ElastiCache Redis
  • Load Balancer: ALB

GCP Architecture:

  • API: Cloud Run or GKE
  • Database: Cloud SQL PostgreSQL
  • Cache: Memorystore Redis
  • Load Balancer: Cloud Load Balancing

Environment Configuration

Required Variables

Variable         Description             Example
DATABASE_URL     PostgreSQL connection   postgresql+asyncpg://user:pass@host:5432/db
REDIS_URL        Redis connection        redis://localhost:6379/0
OPENAI_API_KEY   OpenAI API key          sk-...

Optional Variables

Variable             Description          Default
OPENAI_MODEL         LLM model            gpt-4o-mini
WORKER_CONCURRENCY   Parallel workers     10
BATCH_SIZE           Articles per batch   20
LOG_LEVEL            Logging level        INFO
API_PORT             API port             8000
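The defaults above can be mirrored in code with plain environment lookups. A sketch (the names match the table; the helper itself is hypothetical):

```python
import os

# Documented defaults for the optional variables.
DEFAULTS = {
    "OPENAI_MODEL": "gpt-4o-mini",
    "WORKER_CONCURRENCY": "10",
    "BATCH_SIZE": "20",
    "LOG_LEVEL": "INFO",
    "API_PORT": "8000",
}

def get_setting(name: str) -> str:
    """Return the environment value if set, otherwise the documented default."""
    return os.environ.get(name, DEFAULTS[name])
```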

Production Settings

# Production .env
DATABASE_URL=postgresql+asyncpg://firehose:strong-password@db.example.com:5432/market_firehose
DATABASE_POOL_SIZE=50
DATABASE_MAX_OVERFLOW=20

REDIS_URL=redis://:password@redis.example.com:6379/0

OPENAI_API_KEY=sk-your-production-key
OPENAI_MODEL=gpt-4o-mini
OPENAI_MAX_TOKENS=2000

WORKER_CONCURRENCY=20
BATCH_SIZE=50
MAX_RETRIES=5

LOG_LEVEL=INFO
LOG_FORMAT=json

API_WORKERS=4

Monitoring & Maintenance

Health Checks

# Basic health
curl http://localhost:8000/health

# Detailed health
curl http://localhost:8000/health/detailed

# Kubernetes probes
curl http://localhost:8000/live   # Liveness
curl http://localhost:8000/ready  # Readiness
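For deploy scripts, the health endpoint can be polled until the service comes up. A sketch (the retry helper is ours, not part of the repo; `fetch` is injectable so it can be exercised without a running server):

```python
import time
import urllib.request

def wait_for_healthy(url="http://localhost:8000/health", attempts=10, delay=2.0, fetch=None):
    """Poll `url` until it returns HTTP 200 or attempts run out."""
    if fetch is None:
        def fetch(u):
            # Default fetcher: plain HTTP GET, returns the status code.
            with urllib.request.urlopen(u, timeout=5) as resp:
                return resp.status
    for _ in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            pass  # server not accepting connections yet
        time.sleep(delay)
    return False
```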

Metrics Dashboard

Access metrics at /api/v1/metrics/overview:

curl http://localhost:8000/api/v1/metrics/overview | jq

Returns:

{
  "articles": {
    "completed": 1500,
    "pending": 42,
    "processing": 10,
    "failed": 3,
    "last_hour": 87,
    "last_24h": 2100
  },
  "queue": {
    "depth": 42,
    "rate_per_minute": 95.5
  },
  "processing": {
    "p50_ms": 2100,
    "p95_ms": 4500,
    "p99_ms": 6200
  }
}
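The payload above is easy to turn into simple alerts. A sketch with illustrative thresholds (the rule set is ours, not part of the system):

```python
def needs_attention(metrics: dict) -> list:
    """Return human-readable warnings derived from the metrics overview payload."""
    warnings = []
    articles = metrics["articles"]
    finished = articles["completed"] + articles["failed"]
    if finished and articles["failed"] / finished > 0.05:
        warnings.append("failure rate above 5%")
    if metrics["queue"]["depth"] > 1000:
        warnings.append("queue backlog exceeds 1000")
    if metrics["processing"]["p95_ms"] > 10_000:
        warnings.append("p95 latency above 10s")
    return warnings
```

Run periodically (cron, or a monitoring sidecar), a non-empty result can page an operator before the backlog becomes visible to users.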

Database Maintenance

# Backup
docker-compose exec postgres pg_dump -U postgres market_firehose > backup.sql

# Restore
cat backup.sql | docker-compose exec -T postgres psql -U postgres market_firehose

# Vacuum (run weekly)
docker-compose exec postgres psql -U postgres -d market_firehose -c "VACUUM ANALYZE;"
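Backups accumulate quickly; a small retention helper can prune old dumps. A sketch (hypothetical, assumes `.sql` files collected in one directory):

```python
import time
from pathlib import Path

def prune_backups(directory, keep_days=14):
    """Delete .sql files older than keep_days; return the names removed."""
    cutoff = time.time() - keep_days * 86400
    removed = []
    for path in sorted(Path(directory).glob("*.sql")):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```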

Log Management

# View API logs
docker-compose logs -f api --tail=100

# View worker logs
docker-compose logs -f worker --tail=100

# Export logs
docker-compose logs api > api.log

Scaling

# Scale workers horizontally
docker-compose up -d --scale worker=5

# Kubernetes scaling
kubectl scale deployment firehose-worker --replicas=10

Troubleshooting

Common Issues

1. Database connection refused

# Check PostgreSQL is running
docker-compose ps postgres
# Check connection
docker-compose exec postgres pg_isready

2. Redis connection error

# Check Redis
docker-compose exec redis redis-cli ping

3. OpenAI API errors

# Verify API key
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"

4. Worker not processing

# Check queue depth
curl http://localhost:8000/api/v1/metrics/overview | jq '.queue'

# Check worker logs
docker-compose logs worker --tail=50

Performance Tuning

  1. Increase worker replicas for higher throughput
  2. Tune DATABASE_POOL_SIZE to match the total connections opened by API and workers
  3. Adjust BATCH_SIZE to balance LLM cost against per-article latency
  4. Enable Redis clustering for high availability
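A back-of-envelope throughput estimate ties these knobs together (all numbers illustrative, not measured):

```python
# Rough steady-state throughput: concurrent LLM calls divided by per-article latency.
workers = 5            # e.g. docker-compose up -d --scale worker=5
concurrency = 10       # WORKER_CONCURRENCY per worker
p50_latency_s = 2.1    # p50_ms = 2100 from the metrics example

articles_per_second = workers * concurrency / p50_latency_s
print(round(articles_per_second, 1))  # → 23.8
```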

Security Checklist

  • Change default database password
  • Use secrets management for API keys
  • Enable HTTPS with SSL certificate
  • Configure firewall rules
  • Set up API key authentication
  • Enable rate limiting
  • Regular security updates

Support