sarviinageelen/polymarket-sports-predictability

Polymarket Sports Predictability Analysis

Statistical analysis of favorite win rates across sports prediction markets using a hybrid API architecture.

Python 3.11+ · License: MIT

Overview

This project analyzes sports prediction market efficiency on Polymarket. The primary research question: What is the empirical win rate of favorites across different professional sports?

The analysis processes 10,115 closed betting markets across 10,223 total events in seven sports (ATP, WTA, NBA, NFL, MLB, CFB, CBB), achieving 99% data completeness through a hybrid API integration approach.

Key Results

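| Sport              | Favorite Win Rate | Favorite Wins / Sample Size | Events Analyzed |
|--------------------|-------------------|-----------------------------|-----------------|
| College Basketball | 72.7%             | 1,250 / 1,720               | 1,720           |
| College Football   | 72.9%             | 805 / 1,105                 | 1,105           |
| ATP Tennis         | 69.5%             | 1,234 / 1,776               | 1,776           |
| NBA Basketball     | 67.9%             | 1,350 / 1,988               | 1,988           |
| WTA Tennis         | 66.7%             | 12 / 18                     | 18              |
| NFL Football       | 66.6%             | 414 / 622                   | 622             |
| MLB Baseball       | 56.5%             | 1,385 / 2,450               | 2,450           |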
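
The win rates above follow directly from the win and sample counts. As a quick sanity check, the short sketch below recomputes each rate and a pooled rate over the seven rows shown. The names RESULTS and win_rate_pct are illustrative and are not taken from the project source.

```python
# Favorite wins and sample sizes per sport, copied from the table above.
RESULTS = {
    "College Basketball": (1250, 1720),
    "College Football": (805, 1105),
    "ATP Tennis": (1234, 1776),
    "NBA Basketball": (1350, 1988),
    "WTA Tennis": (12, 18),
    "NFL Football": (414, 622),
    "MLB Baseball": (1385, 2450),
}


def win_rate_pct(wins: int, total: int) -> float:
    """Return the favorite win rate as a percentage, rounded to one decimal."""
    return round(100 * wins / total, 1)


rates = {sport: win_rate_pct(w, n) for sport, (w, n) in RESULTS.items()}

# Pooled rate over the rows above, weighting every market equally.
total_wins = sum(w for w, _ in RESULTS.values())
total_markets = sum(n for _, n in RESULTS.values())
pooled = win_rate_pct(total_wins, total_markets)

for sport, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sport:<20} {rate:5.1f}%")
print(f"{'Pooled (table rows)':<20} {pooled:5.1f}%")
```

Note that the WTA row rests on only 18 markets, so its rate is far less stable than the others.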

Architecture

The system implements a multi-stage data pipeline integrating two Polymarket APIs:

┌──────────────────┐
│   Gamma API      │  Sport-based event filtering via tag IDs
│  (Event Catalog) │  Fetches: event metadata, participants, market structure
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│    CLOB API      │  Token-based pricing enrichment
│  (Order Book)    │  Fetches: closing prices, settlement data, volume
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Analysis Engine  │  Win rate calculation and aggregation
│    (Pandas)      │  Logic: identify favorite → validate winner → compute rates
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Excel Output    │  Actionable insights generation
│  (xlsxwriter)    │  Format: 7-tab workbook with analysis
└──────────────────┘
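
The Analysis Engine stage in the diagram (identify favorite → validate winner → compute rates) can be sketched as follows. This is a minimal illustration, not the project's implementation: the project uses pandas, while this version uses plain Python to stay dependency-free, and the Market record, its field names, and both function names are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Market:
    """Hypothetical two-outcome market record (field names are assumptions)."""
    sport: str
    price_a: float  # closing price of outcome A, read as an implied probability
    price_b: float  # closing price of outcome B
    winner: str     # settled outcome: "A" or "B"


def favorite(market: Market) -> Optional[str]:
    """Return the outcome with the higher closing price, or None on a tie."""
    if market.price_a == market.price_b:
        return None
    return "A" if market.price_a > market.price_b else "B"


def favorite_win_rates(markets: Iterable[Market]) -> Dict[str, float]:
    """Compute the share of markets per sport in which the favorite won."""
    wins: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for market in markets:
        fav = favorite(market)
        # Skip markets with no clear favorite or without a valid settlement.
        if fav is None or market.winner not in ("A", "B"):
            continue
        totals[market.sport] += 1
        wins[market.sport] += int(fav == market.winner)
    return {sport: wins[sport] / totals[sport] for sport in totals}


sample = [
    Market("NBA", 0.70, 0.30, "A"),  # favorite A wins
    Market("NBA", 0.40, 0.60, "A"),  # favorite B loses
    Market("NBA", 0.50, 0.50, "A"),  # tie on price: excluded
    Market("MLB", 0.55, 0.45, "A"),  # favorite A wins
]
print(favorite_win_rates(sample))
```

Excluding ties and unsettled markets keeps the denominator limited to markets where "the favorite won" is a well-defined question.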

Key Features

  • Multi-sport support: ATP, WTA, NBA, NFL, MLB, CFB, CBB
  • Hybrid API architecture: Combines Gamma API (events) + CLOB API (pricing)
  • 99% data completeness: Token ID matching resolves missing price data
  • Async pipeline: Concurrent fetching with aiohttp and rate limiting
  • Error tracking: Comprehensive retry logic with exponential backoff
  • Actionable Excel output: 7-tab workbook with betting insights and ROI analysis
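
The async pipeline and retry features listed above can be illustrated with a small standard-library sketch. It assumes a concurrency cap (an asyncio.Semaphore) as a simple stand-in for rate limiting and takes the fetch coroutine as a parameter, so it does not depend on aiohttp. The function names, parameters, and default values are hypothetical and not taken from the project source.

```python
import asyncio
import random


async def fetch_with_retry(fetch, key, *, semaphore, max_attempts=4, base_delay=0.5):
    """Await fetch(key), retrying failures with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            async with semaphore:  # cap the number of in-flight requests
                return await fetch(key)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted: surface the last error
            delay = base_delay * 2 ** attempt  # 0.5s, 1s, 2s, ... by default
            await asyncio.sleep(delay + random.uniform(0, delay / 2))


async def fetch_all(fetch, keys, *, concurrency=5, **retry_kwargs):
    """Fetch every key concurrently; results come back in input order."""
    semaphore = asyncio.Semaphore(concurrency)
    tasks = [
        fetch_with_retry(fetch, key, semaphore=semaphore, **retry_kwargs)
        for key in keys
    ]
    return await asyncio.gather(*tasks)


# Usage with a stand-in coroutine that fails once per key, then succeeds.
calls = {}


async def flaky_fetch(key):
    calls[key] = calls.get(key, 0) + 1
    if calls[key] < 2:
        raise ConnectionError("transient failure")
    return key.upper()


results = asyncio.run(
    fetch_all(flaky_fetch, ["a", "b", "c"], concurrency=2, base_delay=0.001)
)
print(results)
```

The semaphore is released before each backoff sleep, so a request waiting to retry does not block other requests from proceeding.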

Installation

Prerequisites

  • Python 3.11 or higher
  • pip package manager

Setup

# Clone repository
git clone https://github.com/sarviinageelen/polymarket-sports-predictability.git
cd polymarket-sports-predictability

# Install dependencies
pip install -r requirements.txt

Usage

Step 1: Fetch Sports Metadata

python src/fetch_sports.py

Output: data/fetch_sports.csv (~5 seconds)

Step 2: Fetch Event Data

python src/fetch_events.py

Output: data/fetch_events.csv (60-90 minutes for 10,223 events)

Step 3: Generate Insights

python src/generate_insights.py

Output: outputs/favourite_win_rates.xlsx (7 focused tabs with actionable betting insights)

API Integration

The pipeline uses two Polymarket APIs:

  1. Gamma API (https://gamma-api.polymarket.com/events) - Event discovery via sport tags
  2. CLOB API (https://clob.polymarket.com) - Reliable pricing and settlement data

Gamma API responses are missing 89% of price data. The hybrid approach fills this gap by matching each event to CLOB market data via its condition_id.
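
The condition_id matching described above amounts to a keyed lookup. The sketch below shows the idea on in-memory records. Only condition_id comes from this README; the record shapes, the closing_prices field, and the function names are assumptions for illustration, and real API payloads may be structured differently.

```python
def enrich_with_clob_prices(gamma_events, clob_markets):
    """Fill missing prices on Gamma events from CLOB markets sharing a condition_id."""
    prices_by_condition = {
        market["condition_id"]: market["closing_prices"] for market in clob_markets
    }
    enriched = []
    for event in gamma_events:
        # Keep prices Gamma already provides; otherwise fall back to CLOB data.
        prices = event.get("closing_prices") or prices_by_condition.get(
            event["condition_id"]
        )
        enriched.append({**event, "closing_prices": prices})
    return enriched


def completeness(events):
    """Fraction of events that carry price data (0.0 for an empty list)."""
    if not events:
        return 0.0
    return sum(e["closing_prices"] is not None for e in events) / len(events)


# Hypothetical records: two of three Gamma events lack prices.
gamma_events = [
    {"id": 1, "condition_id": "0xaaa", "closing_prices": None},
    {"id": 2, "condition_id": "0xbbb", "closing_prices": [0.6, 0.4]},
    {"id": 3, "condition_id": "0xccc", "closing_prices": None},
]
clob_markets = [{"condition_id": "0xaaa", "closing_prices": [0.8, 0.2]}]

enriched = enrich_with_clob_prices(gamma_events, clob_markets)
print(f"before: {completeness(gamma_events):.0%}, after: {completeness(enriched):.0%}")
```

Events with no CLOB match keep a value of None, which makes the remaining gaps easy to count.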

Output Structure

data/
├── fetch_events.csv    # 10,223 events with pricing and settlement
└── fetch_sports.csv    # Sports metadata and tag mappings

outputs/
└── favourite_win_rates.xlsx    # Excel workbook with 7 tabs:
    ├── Index                   # Overview + key takeaways
    ├── Quick Reference         # Top actionable strategies
    ├── Sport Guide             # Which sports to bet
    ├── Underdog Opportunities  # ROI by sport/threshold
    ├── Reliable Favorites      # Best teams when favored
    ├── Market Efficiency       # Calibration analysis
    └── Raw Data                # Summary statistics
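
The Market Efficiency tab is described as a calibration analysis. One common way to check calibration, shown here as an assumption rather than as the workbook's actual method, is to group favorites by closing price and compare each group's average price with its realized win rate. The function name and the input format are hypothetical.

```python
from collections import defaultdict


def calibration_table(observations, bin_width_pct=10):
    """Summarize (favorite_price, favorite_won) pairs by price bin.

    Returns {bin_lower_edge_pct: {"mean_price", "win_rate", "n"}}.
    """
    bins = defaultdict(list)
    for price, won in observations:
        # Work in whole percentage points to avoid float-division edge cases,
        # and cap at 99 so a price of 1.00 lands in the top bin.
        pct = min(round(price * 100), 99)
        lower_edge = (pct // bin_width_pct) * bin_width_pct
        bins[lower_edge].append((price, won))

    table = {}
    for lower_edge in sorted(bins):
        rows = bins[lower_edge]
        table[lower_edge] = {
            "mean_price": sum(p for p, _ in rows) / len(rows),
            "win_rate": sum(1 for _, w in rows if w) / len(rows),
            "n": len(rows),
        }
    return table


# Made-up observations for illustration only, not project data.
observations = [
    (0.62, True), (0.68, False), (0.65, True),
    (0.91, True), (0.95, True),
]
for edge, row in calibration_table(observations).items():
    print(f"{edge}-{edge + 10}%: price {row['mean_price']:.2f}, "
          f"won {row['win_rate']:.2f} (n={row['n']})")
```

In a well-calibrated market, each bin's win rate sits close to its mean price; a persistent gap in one direction would suggest favorites are over- or underpriced in that range.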

Project Structure

polymarket-sports-predictability/
├── README.md
├── LICENSE
├── requirements.txt
├── src/
│   ├── fetch_sports.py      # Sports metadata fetcher
│   ├── fetch_events.py      # Event data pipeline
│   └── generate_insights.py # Betting insights analysis
├── tests/
│   ├── test_fetch_sports.py
│   ├── test_generate_chart.py
│   └── test_integration.py
├── data/                    # Generated datasets
└── outputs/                 # Generated Excel workbook

Disclaimer

This project is for educational and research purposes only. The analysis is based on historical market data and should not be construed as investment advice.

  • Past performance does not guarantee future results
  • Users should comply with all applicable laws and Polymarket's terms of service

License

This project is licensed under the MIT License - see the LICENSE file for details.
