elizaOS/babylon


Babylon - Decentralized Prediction Market MMO

A multiplayer prediction market game with autonomous AI agents and continuous RL training.


🎯 RL Training System

Babylon includes a complete continuous RL training system with two modes:

🖥️ Local Mode (Free)

Train on your own GPU - perfect for development and testing.

cd python

# Setup (no W&B key needed!)
export DATABASE_URL=postgresql://your-db-url
export TRAIN_RL_LOCAL=true

# Train
python -m src.training.babylon_trainer

Features:

  • ✅ Uses your local GPU (free!)
  • ✅ All data in YOUR PostgreSQL
  • ✅ Local inference serving
  • ✅ Perfect for development

Cost: $0

☁️ Cloud Mode (Serverless)

Train on W&B's managed infrastructure - perfect for production.

cd python

# Setup (with W&B key)
export DATABASE_URL=postgresql://your-db-url
export WANDB_API_KEY=your-wandb-key  # Get from wandb.ai/settings
export TRAIN_RL_LOCAL=true

# Train
python -m src.training.babylon_trainer

Features:

  • ✅ W&B manages all GPUs (no setup!)
  • ✅ All data in YOUR PostgreSQL
  • ✅ W&B hosted inference (automatic!)
  • ✅ Perfect for production

Cost: ~$820-1,720/month (vs $7,000+/month self-managed)


🚀 Quick Start (RL Training)

Step 1: Install Dependencies

cd python
pip install openpipe-art==0.5.1 asyncpg python-dotenv

Step 2: Configure Environment

# Copy template
cp env.template .env

# Edit with your values
nano .env

For Local Mode:

DATABASE_URL=postgresql://your-db-url
TRAIN_RL_LOCAL=true
# That's it! Will use local GPU

For Cloud Mode (add this):

WANDB_API_KEY=your-wandb-key
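
Before running the trainer, it can help to confirm that the variables for the chosen mode are actually set. The following is a minimal sketch, not part of the Babylon codebase: `REQUIRED_VARS` and `missing_vars` are hypothetical names, and the per-mode requirements are taken from the "Requires" row of the comparison table in this README.

```python
import os

# Required variables per mode, as listed in this README's comparison table.
# REQUIRED_VARS and missing_vars are illustrative names, not Babylon APIs.
REQUIRED_VARS = {
    "local": ["DATABASE_URL"],
    "cloud": ["DATABASE_URL", "WANDB_API_KEY"],
}


def missing_vars(mode, env=None):
    """Return the required variables that are unset or empty for `mode`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS[mode] if not env.get(name)]


if __name__ == "__main__":
    for mode in ("local", "cloud"):
        missing = missing_vars(mode)
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{mode}: {status}")
```

Passing a plain dict as `env` makes the check easy to exercise without touching the real environment.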

Step 3: Run Migration

source .env
psql "$DATABASE_URL" -f migrations/002_add_self_hosted_tables.sql

Step 4: List Ready Windows

MODE=list python -m src.training.babylon_trainer

Output:

Ready windows (3):
  2025-01-15T10:00: 5 agents
  2025-01-15T11:00: 4 agents
  2025-01-15T12:00: 3 agents
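
To illustrate what a "ready window" listing like the one above represents, here is a rough sketch that buckets trajectory records by hour and counts distinct agents per bucket. It is an assumption about the idea, not the trainer's actual query: `window_id`, `ready_windows`, the `(agent_id, timestamp)` record shape, and the `min_agents=3` threshold are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def window_id(ts):
    """Truncate a timestamp to its hour, e.g. 2025-01-15T10:00."""
    return ts.strftime("%Y-%m-%dT%H:00")


def ready_windows(records, min_agents=3):
    """Map window id -> distinct agent count, keeping windows with enough agents.

    `records` is an iterable of (agent_id, timestamp) pairs (assumed shape).
    """
    agents = defaultdict(set)
    for agent_id, ts in records:
        agents[window_id(ts)].add(agent_id)
    return {
        window: len(ids)
        for window, ids in sorted(agents.items())
        if len(ids) >= min_agents
    }
```

Counting distinct agents rather than raw rows keeps one very active agent from making a window look ready on its own.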

Step 5: Train!

MODE=single python -m src.training.babylon_trainer

Local Mode: Trains on your GPU
Cloud Mode: Trains on W&B serverless


📊 Training Modes Comparison

| Feature | Local Mode | Cloud Mode |
|---|---|---|
| GPU | Your GPU | W&B managed (CoreWeave) |
| Setup | Zero | Zero |
| Cost | Free | ~$820-1,720/month |
| Training Time | Depends on your GPU | ~15 min |
| Inference | Local serving | W&B hosted |
| Deployment | Manual | Automatic |
| Best For | Development | Production |
| Requires | DATABASE_URL | DATABASE_URL + WANDB_API_KEY |

Both modes:

  • ✅ All data in YOUR PostgreSQL
  • ✅ No OpenPipe API dependency
  • ✅ Local heuristic scoring
  • ✅ Automatic inference setup

🎓 Training Architecture

What Gets Trained

Your AI agents learn from their trading performance:

  1. Agents play → Trajectories logged to PostgreSQL
  2. Time windows → Group agents by hour (fair comparison)
  3. Local scoring → Heuristic scoring (P&L + win rate + activity)
  4. ART training → Fine-tune model with RL
  5. Inference ready → Agents use improved model
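
The local scoring step combines P&L, win rate, and activity. The README does not give the formula, so the sketch below is only one plausible shape: the `AgentStats` fields, the normalization within a window, and the 0.6/0.3/0.1 weights are all assumptions chosen for illustration, not the values used in `babylon_trainer.py`.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    """Per-window summary for one agent (hypothetical fields)."""
    agent_id: str
    pnl: float   # profit and loss over the window
    wins: int    # number of winning trades
    trades: int  # total number of trades


def heuristic_score(stats, max_abs_pnl, max_trades, weights=(0.6, 0.3, 0.1)):
    """Weighted sum of normalized P&L, win rate, and activity (assumed weights)."""
    w_pnl, w_win, w_act = weights
    pnl_term = stats.pnl / max_abs_pnl if max_abs_pnl else 0.0    # in [-1, 1]
    win_term = stats.wins / stats.trades if stats.trades else 0.0  # in [0, 1]
    act_term = stats.trades / max_trades if max_trades else 0.0    # in [0, 1]
    return w_pnl * pnl_term + w_win * win_term + w_act * act_term


def score_window(group):
    """Score every agent in one time window relative to its peers."""
    max_abs_pnl = max((abs(s.pnl) for s in group), default=0.0)
    max_trades = max((s.trades for s in group), default=0)
    return {s.agent_id: heuristic_score(s, max_abs_pnl, max_trades) for s in group}
```

Normalizing against the other agents in the same window mirrors the "fair comparison" idea in step 2: agents are only ranked against peers that traded under the same market conditions.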

Data Flow

TypeScript Agents (MMO)
    ↓ (log trajectories with window_id)
PostgreSQL (YOUR database)
    ↓ (query by window)
Python Trainer
  ├─ Collect from database
  ├─ Score locally (no external API)
  └─ Create ART trajectories
    ↓
ART ServerlessBackend
  ├─ Local Mode: Uses your GPU
  └─ Cloud Mode: W&B manages GPUs on CoreWeave
    ↓
Trained Model
  ├─ Local Mode: Checkpoint saved locally
  └─ Cloud Mode: W&B hosted inference endpoint
    ↓
Better Agents! 🎯

🔧 Two Trainers Available

Option 1: Simplified Trainer (Recommended for Getting Started)

File: python/src/training/babylon_trainer.py

Features:

  • Follows ART ServerlessBackend pattern
  • Local heuristic scoring (no external API)
  • Automatic local/cloud mode switching
  • All data in YOUR database

Run:

# Local mode (free)
MODE=list python -m src.training.babylon_trainer
MODE=single python -m src.training.babylon_trainer

# Cloud mode (add WANDB_API_KEY)
export WANDB_API_KEY=your-key
MODE=single python -m src.training.babylon_trainer

Option 2: Original Trainer (Recommended for Production)

File: python/src/training/trainer.py

Features:

  • Uses ART's ruler_score_group() for RULER scoring
  • External LLM judges agents (higher quality)
  • Production-tested
  • Complete data bridge integration

Run:

python -m src.training.trainer \
  --min-agents 3 \
  --lookback-hours 48 \
  --model Qwen/Qwen2.5-0.5B-Instruct

💡 Switching Between Modes

ART's ServerlessBackend picks the mode automatically, depending on whether WANDB_API_KEY is set:

To Use Local Mode

# Don't set WANDB_API_KEY
export DATABASE_URL=postgresql://...
export TRAIN_RL_LOCAL=true

python -m src.training.babylon_trainer
# → Uses local GPU automatically

To Use Cloud Mode

# Add WANDB_API_KEY
export DATABASE_URL=postgresql://...
export WANDB_API_KEY=your-key
export TRAIN_RL_LOCAL=true

python -m src.training.babylon_trainer
# → Uses W&B serverless automatically

Same code, automatic switching! ✨
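
The switching rule described above can be pictured as a single check on the environment. This is a sketch of the documented behavior only; `select_mode` is a hypothetical helper, and the real decision is made inside the trainer and ART, whose internals are not shown in this README.

```python
import os


def select_mode(env=None):
    """Return "cloud" when WANDB_API_KEY is set and non-empty, else "local".

    Illustrative only: mirrors the rule stated in this README, not ART's code.
    """
    env = os.environ if env is None else env
    return "cloud" if env.get("WANDB_API_KEY") else "local"


if __name__ == "__main__":
    print(f"Training mode: {select_mode()}")
```

Treating an empty WANDB_API_KEY as unset is a design choice in this sketch, so that a blank line left in `.env` does not silently select cloud mode.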


📚 Documentation

Quick Start

Complete Guides

Original Design


🐛 Troubleshooting

Local Mode

"CUDA out of memory" → Use cloud mode instead:

export WANDB_API_KEY=your-key

"No GPU available" → Install CUDA drivers or use cloud mode

Cloud Mode

"WANDB_API_KEY not set" → Get from: https://wandb.ai/settings

"W&B quota exceeded" → Check your W&B billing/quota settings

Both Modes

"No windows found" → Check database:

SELECT "scenarioId", COUNT(*) FROM trajectories GROUP BY "scenarioId";

"Database connection failed" → Verify DATABASE_URL:

psql "$DATABASE_URL" -c "SELECT 1"
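
If `psql` is not at hand, a quick offline look at the URL itself can catch common typos before any connection is attempted. The sketch below uses only the standard library; `check_database_url` is a hypothetical helper and its rules are heuristics for the usual `postgresql://user:password@host:port/dbname` form, not a full validation of everything PostgreSQL accepts.

```python
from urllib.parse import urlsplit


def check_database_url(url):
    """Return a list of likely problems with a PostgreSQL connection URL.

    Heuristic only: PostgreSQL also accepts forms this check would flag,
    such as URLs that omit the host and rely on defaults.
    """
    problems = []
    parts = urlsplit(url)
    if parts.scheme not in ("postgresql", "postgres"):
        problems.append(f"unexpected scheme: {parts.scheme!r}")
    if not parts.hostname:
        problems.append("missing host")
    if not parts.path.strip("/"):
        problems.append("missing database name")
    return problems
```

An empty list means the URL has the expected shape; it does not prove the server is reachable, which is what the `SELECT 1` check above is for.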

💰 Cost Breakdown

Local Mode

  • Setup: Your GPU (one-time)
  • Training: Free (uses your GPU)
  • Inference: Free (local serving)
  • Monthly: $0

Best for: Development, testing, learning

Cloud Mode

  • Setup: Zero (W&B manages everything)
  • Training: ~$1-2 per job (~15 min)
    • Daily (24 jobs): ~$24-48
    • Monthly (30 days): ~$720 at ~$1 per job (up to ~$1,440 at ~$2 per job)
  • Inference: ~$0.001 per request
    • 100k requests: ~$100
    • 1M requests: ~$1,000
  • Monthly Total: ~$820-1,720 (assuming ~$720 for training)

Best for: Production, scaling, no GPU management
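
The monthly total follows directly from the per-job and per-request figures above. The sketch below redoes that arithmetic so you can plug in your own volumes; the 30-day month and the default of $1 per job (the low end of the quoted range) are assumptions made to match the ~$720 training figure.

```python
JOBS_PER_DAY = 24         # one training job per hourly window
DAYS_PER_MONTH = 30       # assumed month length
COST_PER_JOB = 1.0        # low end of the ~$1-2 per job range
COST_PER_REQUEST = 0.001  # ~$0.001 per inference request


def monthly_cost(requests_per_month, cost_per_job=COST_PER_JOB):
    """Estimated monthly cloud cost in dollars: training plus inference."""
    training = JOBS_PER_DAY * DAYS_PER_MONTH * cost_per_job
    inference = requests_per_month * COST_PER_REQUEST
    return training + inference


if __name__ == "__main__":
    for requests in (100_000, 1_000_000):
        print(f"{requests:>9,} requests/month: ~${monthly_cost(requests):,.0f}")
```

At $1 per job this gives about $820 for 100k requests and about $1,720 for 1M requests, matching the range quoted above; at $2 per job the training share alone doubles to about $1,440.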

vs Self-Managed Infrastructure

  • GPU Rental: $7,000+/month (24/7 A100s)
  • DevOps: Ongoing work
  • Monitoring: Custom setup
  • Total: $10,000+/month

Savings with Cloud Mode: roughly 75-88% relative to the $7,000/month GPU rental alone!


🎯 Recommended Path

Phase 1: Development (Local)

# Test locally first (free!)
export DATABASE_URL=postgresql://...
export TRAIN_RL_LOCAL=true

MODE=list python -m src.training.babylon_trainer
MODE=single python -m src.training.babylon_trainer

Goal: Verify data collection, test training flow, iterate quickly

Phase 2: Production (Cloud)

# Scale to cloud when ready
export WANDB_API_KEY=your-key

MODE=single python -m src.training.babylon_trainer

Goal: Production deployment, W&B managed, auto-scaling


📖 Related Documentation


🆘 Get Help

RL Training Issues: See python/README.md
Database Issues: Check prisma/schema.prisma
W&B Issues: https://docs.wandb.ai/


Ready to train? See python/QUICK_START.md

🚀 Both local and cloud modes ready!