A quantitative sports betting analytics system that tracks and calculates Closing Line Value (CLV) to identify profitable betting opportunities. Built to demonstrate systematic edge detection and data-driven decision making in sports markets.
Closing Line Value measures whether you're getting better odds than the market's final assessment. If you consistently beat the closing line, you have an edge. It's the single most reliable indicator of long-term profitability in sports betting - think of it as alpha in traditional markets.
Example:
- You bet Lakers at +200 (3.0 decimal) at 10am
- Line closes at +180 (2.8 decimal)
- Your CLV: +2.38% (implied probability 33.33% at entry vs 35.71% at close) - you beat the market
Do this consistently across hundreds of bets, and the edge should show up as long-term profit.
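The arithmetic behind the example can be sketched in a few lines. This is a minimal illustration, assuming CLV is reported as the closing implied probability minus the entry implied probability (in percentage points), which is what reproduces the +2.38% figure above. The function names are hypothetical and are not taken from the project's `CLVCalculator`.

```python
def american_to_decimal(american: int) -> float:
    """Convert American (moneyline) odds to decimal odds."""
    if american == 0:
        raise ValueError("American odds cannot be 0")
    if american > 0:
        return 1 + american / 100
    return 1 + 100 / abs(american)


def implied_probability(decimal_odds: float) -> float:
    """Implied probability of a decimal price (bookmaker margin not removed)."""
    return 1 / decimal_odds


def clv_points(entry_decimal: float, closing_decimal: float) -> float:
    """Closing implied probability minus entry implied probability, in percentage points."""
    return (implied_probability(closing_decimal) - implied_probability(entry_decimal)) * 100


entry = american_to_decimal(200)    # +200 is 3.0 decimal
closing = american_to_decimal(180)  # +180 is 2.8 decimal
print(f"CLV: {clv_points(entry, closing):+.2f}%")  # CLV: +2.38%
```

A positive value means the closing price implied a higher win probability than the price you took, i.e. you got the better number.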
This platform automates the entire workflow:
- Data Collection - Polls The Odds API twice daily (opening + closing lines)
- Storage - PostgreSQL time-series database captures odds movement
- Analysis - Calculates CLV by comparing entry vs closing odds
- Visualization - Real-time dashboard shows performance metrics
The goal: prove you can identify +EV opportunities before the market corrects.
Backend:
- FastAPI (Python) - API endpoints for CLV calculations
- SQLAlchemy 2.0 - ORM with proper indexing for time-series queries
- PostgreSQL - JSONB for flexible odds storage
- Alembic - Database migrations
- XGBoost + scikit-learn - Ensemble ML for line movement prediction
Frontend:
- React + TypeScript - Type-safe component architecture
- Vite - Fast build tooling
- Recharts - Data visualization
- Tailwind CSS - Styling
Infrastructure:
- Docker Compose - Local PostgreSQL instance
- macOS launchd - Scheduled collection (10am/6pm ET)
- Poetry - Python dependency management
# Setup
docker-compose up -d
alembic upgrade head
cp .env.example .env
# Add your ODDS_API_KEY
# Install dependencies
poetry install
# Train ML model (optional, enables predictions)
poetry run python -m scripts.train_movement_model
# Start backend + frontend
./start.sh
Dashboard: http://localhost:5173
API Docs: http://localhost:8000/docs
Scheduled jobs collect NBA odds from 4 major sportsbooks (Pinnacle, FanDuel, DraftKings, theScore Bet) at opening and closing. A dynamic scheduler creates launchd batches 30 minutes before each game to capture closing lines. The collector handles rate limiting and retries, and stores 3 market types (moneyline, spreads, totals).
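As a rough sketch of the "30 minutes before each game" idea, the snippet below turns a list of tip-off times into the distinct collection times that are still in the future. It only illustrates the timing logic. The function name and the sample times are made up, and how `schedule_game_batches.py` actually builds its launchd jobs is not shown here.

```python
from datetime import datetime, timedelta, timezone

LEAD_TIME = timedelta(minutes=30)  # from the text: collect 30 minutes before each game


def closing_batch_times(tipoffs: list[datetime], now: datetime) -> list[datetime]:
    """Return distinct, sorted collection times that have not passed yet."""
    runs = {tip - LEAD_TIME for tip in tipoffs}  # games sharing a slot share one batch
    return sorted(t for t in runs if t > now)


# Illustrative times only (UTC)
now = datetime(2025, 1, 15, 15, 30, tzinfo=timezone.utc)
tipoffs = [
    datetime(2025, 1, 16, 0, 0, tzinfo=timezone.utc),
    datetime(2025, 1, 16, 0, 0, tzinfo=timezone.utc),    # same slot, one batch
    datetime(2025, 1, 16, 2, 30, tzinfo=timezone.utc),
    datetime(2025, 1, 15, 15, 45, tzinfo=timezone.utc),  # batch time already passed
]
for run_at in closing_batch_times(tipoffs, now):
    print(run_at.isoformat())
```

Deduplicating by time keeps the number of API calls down when several games tip off together.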
Converts decimal odds to implied probabilities, compares entry vs closing, outputs percentage edge. Aggregates by bookmaker and market type to identify the sharpest books.
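The aggregation step might look something like the sketch below. It assumes each bet is a plain dict with `bookmaker`, `market`, `entry_odds` and `closing_odds` keys (hypothetical field names) and uses the same implied-probability difference as the CLV example. The sample bets are invented for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_clv_by_group(bets: list[dict]) -> dict[tuple[str, str], float]:
    """Average CLV (percentage points) per (bookmaker, market) pair."""
    groups: dict[tuple[str, str], list[float]] = defaultdict(list)
    for bet in bets:
        clv = (1 / bet["closing_odds"] - 1 / bet["entry_odds"]) * 100
        groups[(bet["bookmaker"], bet["market"])].append(clv)
    return {key: mean(values) for key, values in groups.items()}


# Invented sample data
bets = [
    {"bookmaker": "pinnacle", "market": "moneyline", "entry_odds": 2.0, "closing_odds": 2.0},
    {"bookmaker": "fanduel", "market": "moneyline", "entry_odds": 2.1, "closing_odds": 1.95},
    {"bookmaker": "fanduel", "market": "moneyline", "entry_odds": 3.0, "closing_odds": 2.8},
]
result = mean_clv_by_group(bets)
for (book, market), avg in sorted(result.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{book:10s} {market:10s} {avg:+.2f}%")
```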
Ensemble model (XGBoost + Random Forest) predicts line movement direction and magnitude. Identifies +EV opportunities by forecasting which lines will move favorably before games start.
Model Performance:
- Direction Accuracy: 65.1% (vs 40.7% baseline)
- MAE: 0.0496 (price movement prediction)
- R² Score: 0.58
- Improvement vs Baseline: 22.9%
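For readers unfamiliar with these metrics, here is a small stdlib sketch of how direction accuracy, MAE and R² are commonly computed. It assumes "direction" means the sign of the price movement. The exact definitions and baseline used by the training script are not stated in this README, and the sample numbers are invented.

```python
def direction_accuracy(actual: list[float], predicted: list[float]) -> float:
    """Share of samples where predicted and actual movement have the same sign."""
    def sign(x: float) -> int:
        return (x > 0) - (x < 0)
    hits = sum(sign(a) == sign(p) for a, p in zip(actual, predicted))
    return hits / len(actual)


def mean_absolute_error(actual: list[float], predicted: list[float]) -> float:
    """Average absolute difference between predicted and actual movement."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: list[float], predicted: list[float]) -> float:
    """1 minus residual sum of squares over total sum of squares."""
    avg = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Invented price movements (closing price minus opening price)
actual = [0.10, -0.05, 0.00, 0.20]
predicted = [0.08, 0.02, 0.00, 0.15]
print(direction_accuracy(actual, predicted))
print(mean_absolute_error(actual, predicted))
print(r_squared(actual, predicted))
```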
Top Features:
- Consensus line (13.9%)
- Opening price (12.1%)
- Line spread (8.4%)
- Distance from consensus (6.2%)
- Hours to game (4.7%)
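To make the feature names concrete, the sketch below derives them from one snapshot of quotes. The definitions are assumptions for illustration (consensus as the mean price across books, line spread as max minus min), and `Quote` and `build_features` are hypothetical names. The real logic lives in `features.py` and may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from statistics import mean


@dataclass
class Quote:
    bookmaker: str
    price: float  # decimal odds for one outcome


def build_features(quotes: list[Quote], book: str, opening_price: float,
                   now: datetime, tipoff: datetime) -> dict[str, float]:
    """Build one feature row for `book` from the current quotes across books."""
    prices = [q.price for q in quotes]
    consensus = mean(prices)  # assumption: consensus = mean across books
    current = next(q.price for q in quotes if q.bookmaker == book)
    return {
        "consensus_line": consensus,
        "opening_price": opening_price,
        "line_spread": max(prices) - min(prices),
        "distance_from_consensus": current - consensus,
        "hours_to_game": (tipoff - now).total_seconds() / 3600,
    }


# Invented snapshot
quotes = [Quote("pinnacle", 1.90), Quote("fanduel", 1.95),
          Quote("draftkings", 2.00), Quote("thescore", 1.95)]
features = build_features(
    quotes, book="draftkings", opening_price=1.98,
    now=datetime(2025, 1, 15, 18, 0, tzinfo=timezone.utc),
    tipoff=datetime(2025, 1, 16, 0, 0, tzinfo=timezone.utc),
)
print(features)
```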
API Endpoints:
- GET /api/ml/stats - Model performance metrics
- GET /api/ml/predictions/{game_id} - Predicted vs actual closing lines
- POST /api/ml/retrain - Retrain with latest data
- GET /api/ml/feature-importance - Feature importance rankings
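A quick way to poke at these endpoints from Python, using only the standard library. This is a sketch: the base URL is the local address listed under API Docs, the helper names are hypothetical, and the response shapes are not documented here, so the result is just parsed as JSON.

```python
import json
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # local backend address from this README


def ml_url(path: str) -> str:
    """Build a full URL for one of the /api/ml/ endpoints."""
    return f"{BASE_URL}/api/ml/{path.lstrip('/')}"


def get_json(url: str, timeout: float = 10.0):
    """GET a URL and parse the body as JSON (response shape is not assumed)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


# With the backend running:
# stats = get_json(ml_url("stats"))
# importance = get_json(ml_url("feature-importance"))
```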
Use predictions to identify +EV opportunities where the model predicts favorable line movement.
- Mean CLV - Overall edge across all analyzed bets
- Positive CLV % - Hit rate on beating the closing line
- Trend Analysis - CLV over time to spot improving/declining performance
- Bookmaker Comparison - Which books consistently offer the best prices
- ML Model Metrics - Real-time model performance and feature importance
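The first three metrics reduce to simple arithmetic over a list of per-bet CLV values. The sketch below assumes the values are already in percentage points and in chronological order, and uses a simple rolling mean as the trend. `clv_summary` and the window size are illustrative choices, not the dashboard's actual code.

```python
def clv_summary(clv_values: list[float], window: int = 3) -> dict:
    """Mean CLV, share of bets with positive CLV, and a rolling-mean trend."""
    if not clv_values:
        raise ValueError("need at least one CLV value")
    n = len(clv_values)
    trend = [
        sum(clv_values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, n)
    ]
    return {
        "mean_clv": sum(clv_values) / n,
        "positive_clv_pct": sum(v > 0 for v in clv_values) / n * 100,
        "trend": trend,
    }


# Invented CLV values, oldest first
summary = clv_summary([1.0, -0.5, 2.0, 0.5, -1.0, 3.0])
print(summary)
```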
Designed for analytical queries:
- Composite indexes on (game_id, timestamp) for fast time-series lookups
- JSONB outcomes field for flexible market types
- Proper foreign keys with cascades
- Closing lines tracked separately for clean CLV calculation
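The snippet below illustrates the first three points with a runnable toy schema. It uses the stdlib `sqlite3` module as a stand-in so it runs without a database server. The project itself uses PostgreSQL with SQLAlchemy, where `outcomes` would be a JSONB column rather than JSON text. Table and column names are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
CREATE TABLE games (
    id TEXT PRIMARY KEY,
    commence_time TEXT NOT NULL
);
CREATE TABLE odds_snapshots (
    id INTEGER PRIMARY KEY,
    game_id TEXT NOT NULL REFERENCES games(id) ON DELETE CASCADE,
    bookmaker TEXT NOT NULL,
    market TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    outcomes TEXT NOT NULL  -- JSON text here; JSONB in PostgreSQL
);
-- Composite index for "all snapshots of one game, in time order"
CREATE INDEX ix_snapshots_game_ts ON odds_snapshots (game_id, timestamp);
""")

conn.execute("INSERT INTO games VALUES (?, ?)", ("game-1", "2025-01-16T00:00:00Z"))
for ts, price in [("2025-01-15T15:00:00Z", 2.10), ("2025-01-15T23:30:00Z", 1.95)]:
    conn.execute(
        "INSERT INTO odds_snapshots (game_id, bookmaker, market, timestamp, outcomes) "
        "VALUES (?, ?, ?, ?, ?)",
        ("game-1", "fanduel", "moneyline", ts, json.dumps([{"name": "Lakers", "price": price}])),
    )

# Time-ordered lookup served by the composite index
rows = conn.execute(
    "SELECT timestamp, outcomes FROM odds_snapshots WHERE game_id = ? ORDER BY timestamp",
    ("game-1",),
).fetchall()

# Deleting the game cascades to its snapshots
conn.execute("DELETE FROM games WHERE id = ?", ("game-1",))
remaining = conn.execute("SELECT COUNT(*) FROM odds_snapshots").fetchone()[0]
print(len(rows), remaining)
```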
# Calculate CLV for a bet
from src.analyzers.clv_calculator import CLVCalculator
calc = CLVCalculator()
clv = calc.calculate_clv(entry_odds=2.1, closing_odds=1.95)
# Returns: +3.66%
# Predict line movement direction
from src.analyzers.movement_predictor import LineMovementPredictor
predictor = LineMovementPredictor()
predictor.load_model("models/line_movement_predictor.pkl")
# Get prediction for current odds
direction, magnitude, confidence = predictor.predict(features)
# Returns: ("UP", 0.15, 0.73) - line moving up 15% with 73% confidence
Most retail bettors chase wins. Sharp bettors chase CLV.
This system proves I understand:
- Market efficiency - Closing lines are the best estimate of true odds
- Edge detection - Consistent +CLV = long-term profit
- Machine learning - Ensemble models predicting line movement (65% accuracy, 23% improvement over baseline)
- Data engineering - Proper schema design for analytical workloads
- Systematic thinking - Automation > manual processes
Same principles apply whether you're trading SPY options or betting NBA spreads. The ML model demonstrates quantitative prediction skills applicable to any financial market.
src/
├── api/ # FastAPI endpoints
│ ├── main.py # Main API app
│ └── ml_endpoints.py # ML prediction endpoints
├── analyzers/ # Analytics & ML
│ ├── clv_calculator.py # CLV calculation logic
│ ├── features.py # ML feature engineering
│ ├── movement_predictor.py # Ensemble line movement model
│ └── bet_settlement.py # Bet outcome tracking
├── collectors/ # Data collection
│ ├── odds_api_client.py # The Odds API client
│ ├── odds_proccessor.py # Data processing & storage
│ └── nba_scores_client.py # NBA.com scores fetcher
├── models/ # SQLAlchemy database models
└── utils/ # Utilities (notifications, etc.)
frontend/
└── src/
└── Dashboard.tsx # React analytics dashboard w/ ML metrics
scripts/
├── collect_odds.py # Odds collection (opening + closing)
├── schedule_game_batches.py # Dynamic launchd scheduler
├── analyze_daily_clv.py # Daily CLV report generation
├── fetch_game_scores.py # NBA score fetching
├── train_movement_model.py # ML model training
├── track_opportunity_performance.py # Bet tracking
└── update_report_profit_stats.py # ROI calculations
models/ # Trained ML models (git-ignored)
└── line_movement_predictor.pkl
migrations/ # Alembic database migrations
launchd/ # macOS scheduling configs
start.sh # Launch backend + frontend
The system generates comprehensive daily reports at 9 AM showing the previous day's performance with complete ROI tracking.
Daily Report Includes:
- CLV analysis for all completed games
- Top 10 best betting opportunities by CLV percentage
- Detailed bet tracking with win/loss results and profit calculations
- Performance metrics: win rate, total profit, ROI percentage
- Breakdown by bookmaker and market type
- ML-predicted +EV opportunities for upcoming games
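The performance metrics in the report come down to a few sums over settled bets. Here is a minimal sketch, assuming decimal odds, a `stake`/`odds`/`won` dict per bet (hypothetical field names), and no pushes or voided bets. The sample bets are invented.

```python
def settle(stake: float, decimal_odds: float, won: bool) -> float:
    """Profit for one settled bet: winnings minus stake, or the lost stake."""
    return stake * (decimal_odds - 1) if won else -stake


def performance(bets: list[dict]) -> dict[str, float]:
    """Win rate, total profit and ROI over a list of settled bets."""
    profits = [settle(b["stake"], b["odds"], b["won"]) for b in bets]
    staked = sum(b["stake"] for b in bets)
    return {
        "win_rate_pct": sum(b["won"] for b in bets) / len(bets) * 100,
        "total_profit": sum(profits),
        "roi_pct": sum(profits) / staked * 100,
    }


# Invented settled bets
settled = [
    {"stake": 100, "odds": 2.10, "won": True},
    {"stake": 100, "odds": 1.95, "won": False},
    {"stake": 100, "odds": 3.00, "won": True},
    {"stake": 100, "odds": 1.80, "won": False},
]
report = performance(settled)
print(report)
```

ROI here is total profit divided by total amount staked, which is the usual convention for flat-stake tracking.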
Automation Schedule:
- 2:00 AM - Fetch final game scores
- 3:00 AM - Calculate ROI on tracked bets
- 9:00 AM - Generate daily report with full performance breakdown
- 9:30 AM - Track new opportunities from report
- 10:30 AM - Schedule closing line collection batches
The workflow eliminates manual tracking and provides immediate performance feedback on yesterday's betting opportunities.
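Each entry in the schedule maps naturally onto a launchd job with a `StartCalendarInterval`. The sketch below generates such a property list with the stdlib `plistlib` module. The label and the command are placeholders (assumptions, not the contents of the `launchd/` directory), and a real job would typically use absolute paths in `ProgramArguments`.

```python
import plistlib


def daily_job_plist(label: str, program_args: list[str], hour: int, minute: int) -> bytes:
    """Build a launchd job definition that runs once a day at hour:minute."""
    job = {
        "Label": label,
        "ProgramArguments": program_args,
        "StartCalendarInterval": {"Hour": hour, "Minute": minute},
    }
    return plistlib.dumps(job)  # XML plist by default


# Placeholder label and command for the 9:00 AM report
plist_bytes = daily_job_plist(
    "com.example.clv.daily-report",
    ["poetry", "run", "python", "-m", "scripts.analyze_daily_clv"],
    hour=9,
    minute=0,
)
print(plist_bytes.decode())
```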
- ML model to predict line movement
- Automated daily CLV reports with ROI tracking
- Bet performance tracking and settlement
- Enhanced ML model (XGBoost + Random Forest ensemble)
- Dynamic game-time batch scheduling
- Direct book scraping (Kalshi, Polymarket) for higher frequency data
- Arbitrage opportunity detection
- Kelly Criterion position sizing
- Live odds monitoring with real-time alerts
After collecting data for a few days, you can see which bookmakers consistently offer the best lines and optimal bet timing. The data speaks for itself - some books are beatable, most aren't.
Built as a learning project to explore sports betting analytics and quantitative edge detection. Not financial advice. Bet responsibly.
Stack: Python • TypeScript • React • PostgreSQL • FastAPI • Docker
Concepts: Time-series data • Market efficiency • Statistical edge • Automated systems