A machine learning pipeline that scrapes BBC Weather forecasts for London, trains on two years of historical weather data, and produces bias-corrected 7-day predictions with confidence intervals for temperature, precipitation, wind speed, and humidity.
The model is London-only by design — tighter geographic focus means more precise ERA5 training data, a more accurate bias-correction model, and better probability calibration against Polymarket's UK weather markets.
The model gets measurably better over time. Every forecast run accumulates verified predictions and the % average standard error — tracked against both ERA5 actuals and Polymarket resolutions — falls as the model learns from its own mistakes.
```bash
cd Polymarket-Weather-Pattern-Predictor
python -m streamlit run dashboard.py
```

Click ▶ Run Forecast. Done.
```bash
git clone https://github.com/darraghmoran2025/Polymarket-Weather-Pattern-Predictor.git
cd Polymarket-Weather-Pattern-Predictor
python -m pip install -r requirements.txt
python -m streamlit run dashboard.py
```

Click ▶ Run Forecast and wait ~3 minutes for the first download + train. Every run after that takes seconds.
```
BBC Weather (live scrape — London)
        │
        ▼
Bias-corrected 7-day forecast
  + 90% confidence intervals       ◄── trained on ──►  ERA5 actuals + NWP forecast history
  + High / Medium / Low confidence                      (Open-Meteo, free, no API key)
        │
        ▼
Stored in forecast_store ──► verified against ERA5 after target date passes
        │
        ├──► % Avg Std Error (model vs reality — 100-day backtest)
        │
        └──► % Avg Std Error (model vs Polymarket — per resolved market)
```
- Collect — downloads ERA5 ground-truth weather and historical NWP forecast data from Open-Meteo, building forecast-vs-actual training pairs at multiple lead times (1, 3, 5, 7, 10, 14 days ahead).
- Train — fits regression models to learn systematic forecast bias, fits error distributions, and configures a 10,000-run Monte Carlo simulator.
- Predict — scrapes the live BBC Weather 14-day forecast, applies bias correction, and outputs calibrated predictions for each of the next 7 days.
- Verify — after each target date passes, the real ERA5 temperature is fetched and compared to the stored prediction. These real errors feed back into the model's uncertainty estimates.
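The Collect stage's forecast-vs-actual pairing can be sketched in a few lines. Function names and data shapes here are illustrative only, not the repository's actual API:

```python
# Sketch of building forecast-vs-actual training pairs at multiple lead times.
# The dict layout and function name are assumptions for illustration.
from datetime import date, timedelta

LEAD_TIMES = [1, 3, 5, 7, 10, 14]  # days ahead, as used for training pairs

def build_training_pairs(forecasts, actuals):
    """Pair each archived NWP forecast with the ERA5 actual it targeted."""
    pairs = []
    for f in forecasts:  # f: {"issue_date": date, "lead": int, "temp": float}
        target = f["issue_date"] + timedelta(days=f["lead"])
        if target in actuals and f["lead"] in LEAD_TIMES:
            pairs.append((f["lead"], f["temp"], actuals[target]))
    return pairs

# Toy data: one forecast issued 2024-06-01 for 3 days ahead
forecasts = [{"issue_date": date(2024, 6, 1), "lead": 3, "temp": 18.2}]
actuals = {date(2024, 6, 4): 19.0}
print(build_training_pairs(forecasts, actuals))  # [(3, 18.2, 19.0)]
```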
Requirements: Python 3.10+, internet connection.
```bash
git clone https://github.com/darraghmoran2025/Polymarket-Weather-Pattern-Predictor.git
cd Polymarket-Weather-Pattern-Predictor
python -m pip install -r requirements.txt
```

Windows: always use `python -m pip` instead of `pip`, and `python -m streamlit` instead of `streamlit`.
```bash
python -m streamlit run dashboard.py
```

A browser tab opens at http://localhost:8501. Click ▶ Run Forecast. London is the fixed location — no city selection needed.
| Stage | What happens | Time |
|---|---|---|
| 📥 Download | Fetches ~2 years of ERA5 + NWP data from Open-Meteo | 2–4 min (first run only) |
| 🧠 Train | Trains regression + Monte Carlo model | ~30 sec (first run only) |
| 🌐 Scrape & Predict | Scrapes BBC Weather, applies bias correction | ~5 sec |
After the first run, data and model are saved locally — all subsequent runs complete in seconds.
Metric strip, four variable charts with shaded 90% confidence intervals, and a colour-coded day-by-day summary table.
Polymarket-style probability bars showing the chance of the actual temperature falling in each range for any selected day. Bucket widths widen with lead time to reflect growing uncertainty. Once the probability calibrator is active, these bars are calibrated against real outcomes.
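Since the forecast for a given day is treated as a normal distribution (the Polymarket comparison later uses N(pred_mean, pred_std)), the per-bucket probability behind these bars can be sketched with the normal CDF. Bucket edges and the 2°C width below are illustrative, not the dashboard's actual bucket rule:

```python
import math

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma) via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def bucket_probs(mu, sigma, edges):
    """Probability mass in each [edges[i], edges[i+1]) temperature bucket."""
    return [normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
            for a, b in zip(edges, edges[1:])]

# Illustrative Day-3 forecast: mean 14°C, std 1.5°C, 2°C-wide buckets
probs = bucket_probs(14.0, 1.5, [11, 13, 15, 17])
print([round(p, 3) for p in probs])  # middle bucket carries the most mass
```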
The full model-optimisation workbench — four sections:
| Section | What it does |
|---|---|
| 100-Day Backtest | Runs the trained model against the last 100 days at every lead time (1, 3, 5, 7, 10, 14 days). Reports % avg standard error per lead day — directly comparable to the confidence intervals shown in the 7-Day Forecast tab |
| Loop 1 — Forecast Verification | Live accuracy stats (RMSE, MAE, CI coverage) from verified past predictions — grows automatically with each forecast run |
| Loop 2 — Probability Calibration | Fits an isotonic regression calibrator on verified errors and/or Polymarket resolution data |
| Polymarket Weather Markets | Fetches resolved weather markets, shows Brier score, and computes % avg standard error between Polymarket probabilities and model-implied probabilities |
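The Brier score mentioned above is simple to state: the mean squared gap between a forecast probability and the 0/1 outcome, with 0 being perfect. A minimal sketch on toy numbers, not project data:

```python
def brier_score(forecast_probs, outcomes):
    """Mean squared difference between forecast probability and 0/1 outcome."""
    return sum((p - o) ** 2 for p, o in zip(forecast_probs, outcomes)) / len(outcomes)

# Three toy markets: two confident-and-right, one confident-and-right-ish
print(brier_score([0.9, 0.2, 0.7], [1, 0, 1]))  # lower is better; 0 = perfect
```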
The model has three self-optimisation feedback loops that activate automatically as you use it, plus a retrospective 100-day backtest that gives you an immediate accuracy benchmark.
In the Polymarket Data tab, click ▶ Run 100-Day Backtest. The model runs its bias-corrected prediction against every day in the last 100 days of historical training data, at every lead time (1, 3, 5, 7, 10, 14 days ahead), and compares each prediction to the ERA5 actual.
This matters because the 7-Day Forecast tab shows confidence intervals that widen as lead time increases — Day 1 has a tight band, Day 7 has a much wider one. The backtest verifies that this widening is correct: % Avg Standard Error should grow with lead time. If it doesn't, the model is mis-sizing its uncertainty for some horizons.
The dashboard shows:
- Overall % Avg Standard Error across all lead times — your headline accuracy benchmark
- Per-lead breakdown table — % Avg Std Error, MAE, and RMSE for each of Day 1 / 3 / 5 / 7 / 10 / 14, directly comparable to the confidence intervals in the forecast tab
- SE growth bar chart — should be monotonically increasing; a flat or dipping bar flags a miscalibrated horizon
- Lead=1 prediction vs actual line chart — the cleanest signal of raw model accuracy
- Error distribution histogram — ideally symmetric and centred near 0°C; skew indicates residual bias that the next monthly retrain will reduce
Use the overall % Avg Standard Error as a before/after benchmark: run the backtest, note the number, retrain the model after a month of new data, run it again — it should be lower.
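One plausible reading of "% Avg Standard Error" is RMSE expressed as a percentage of the typical actual value; the exact formula the repo uses may differ, so treat this sketch as an assumption. It does show the property the backtest checks — the metric growing with lead time:

```python
import math

def pct_avg_std_error(pairs):
    """RMSE of (prediction - actual), as a % of the mean |actual|.
    NOTE: this definition is an assumption, not the repository's formula."""
    errs = [p - a for p, a in pairs]
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    mean_abs_actual = sum(abs(a) for _, a in pairs) / len(pairs)
    return 100.0 * rmse / mean_abs_actual

# Toy backtest pairs (prediction, actual): Day-1 errors tighter than Day-7
day1 = [(15.2, 15.0), (14.8, 15.0), (16.1, 16.0)]
day7 = [(17.0, 15.0), (13.5, 15.0), (18.0, 16.0)]
print(pct_avg_std_error(day1) < pct_avg_std_error(day7))  # True
```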
Every time you click ▶ Run Forecast, the pipeline:
- Checks if any past predictions are now due for verification (target date has passed)
- Fetches the real ERA5 temperature for those dates from Open-Meteo
- Stores the actual errors in `data/raw/forecast_store.csv`
Over time, the Loop 1 section fills in with real RMSE, MAE, and 90% CI coverage. These replace the synthetic noise profiles used during initial training — so the model's uncertainty estimates become progressively more grounded in its actual London track record rather than generic ECMWF noise budgets.
CI coverage is the key metric: a well-calibrated 90% interval should contain the actual value ~90% of the time. If it reads 70%, the model is overconfident. If it reads 98%, it is underconfident. Both tell you the calibrator needs more data.
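Coverage itself is a one-liner over the forecast store; the record layout below is illustrative:

```python
def ci_coverage(records):
    """Fraction of verified predictions whose 90% interval contained the actual.
    records: (lo, hi, actual) triples — an assumed layout for illustration."""
    hits = sum(1 for lo, hi, actual in records if lo <= actual <= hi)
    return hits / len(records)

# Five toy verified predictions: three hits, two misses
records = [(12.0, 16.0, 14.1), (10.0, 13.0, 13.5), (15.0, 19.0, 16.0),
           (11.0, 15.0, 12.2), (13.0, 17.0, 18.0)]
print(ci_coverage(records))  # 0.6 — well below 0.9, i.e. overconfident intervals
```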
Once you have ≥10 verified predictions:
- Go to Polymarket Data tab → Loop 2 — Probability Calibration
- Click Update Calibrator from Forecast Store
All probabilities in the Temperature Positions tab are then calibrated using isotonic regression. When the model says "60% chance of 13–15°C", that figure reflects the model's real verified accuracy — not a raw Monte Carlo estimate.
You can also Fetch Polymarket Weather Markets and click Update Calibrator with Polymarket Data to incorporate real market resolution data as a second calibration source.
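The core of isotonic regression is the pool-adjacent-violators (PAV) algorithm: sort by raw probability, then merge neighbouring blocks until the mapping from raw probability to observed frequency is monotone. The project presumably relies on a library implementation; this pure-Python sketch just shows the idea on toy data:

```python
def pav_calibrate(raw_probs, outcomes):
    """Pool-adjacent-violators: fit a monotone map from raw model
    probabilities to observed frequencies (the core of isotonic regression)."""
    pairs = sorted(zip(raw_probs, outcomes))
    blocks = [[p, float(y), 1] for p, y in pairs]  # [prob, mean outcome, weight]
    i = 0
    while i < len(blocks) - 1:
        if blocks[i][1] > blocks[i + 1][1]:        # monotonicity violated
            p, y, w = blocks.pop(i + 1)            # merge into previous block
            bw = blocks[i][2]
            blocks[i][1] = (blocks[i][1] * bw + y * w) / (bw + w)
            blocks[i][2] = bw + w
            i = max(i - 1, 0)                      # re-check earlier blocks
        else:
            i += 1
    return [(b[0], b[1]) for b in blocks]

# Raw model probs vs. whether the event actually happened (toy verified data)
print(pav_calibrate([0.2, 0.4, 0.6, 0.8], [0, 1, 0, 1]))
# [(0.2, 0.0), (0.4, 0.5), (0.8, 1.0)] — the 0.4/0.6 violation got pooled
```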
After fetching Polymarket markets (and having run at least one forecast), the dashboard computes:
- Model-implied probability for each market's temperature threshold, derived from the current forecast's normal distribution N(pred_mean, pred_std)
- Divergence = |Polymarket final probability − Model probability|
- % Average Standard Error = standard error of all divergences, expressed as a percentage
As the calibrator is updated with more verified predictions and Polymarket resolutions, this % standard error figure should fall over time — tracking the convergence between what Polymarket prices and what the model predicts.
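The first two bullets follow directly from the stated N(pred_mean, pred_std) assumption. A sketch — the market, threshold, and forecast numbers below are illustrative:

```python
import math

def model_prob_above(threshold, mu, sigma):
    """P(actual > threshold) under the forecast's N(mu, sigma)."""
    return 1.0 - 0.5 * (1.0 + math.erf((threshold - mu) / (sigma * math.sqrt(2.0))))

# Illustrative market: "London high above 16°C?", final Polymarket price 0.70.
# Illustrative forecast for that day: mean 16.5°C, std 1.2°C.
model_p = model_prob_above(16.0, 16.5, 1.2)
divergence = abs(0.70 - model_p)
print(round(model_p, 3), round(divergence, 3))
```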
The sidebar shows model age in green / amber / red:
| Colour | Age | Action |
|---|---|---|
| 🟢 Green | < 14 days | No action needed |
| 🟡 Amber | 14–30 days | Retrain soon |
| 🔴 Red | > 30 days | Tick Force retrain and run |
When you retrain, the pipeline re-downloads the most recent 2 years of London data from today's date. This ensures the bias-correction model always learns from the latest seasonal patterns.
```bash
python src/data_pipeline.py
```

Runs the full pipeline for London and prints a 7-day table to the terminal.
Options:
| Option | Default | Description |
|---|---|---|
| `--years` | 2 | Years of historical training data to collect |
| `--retrain` | off | Re-download data and retrain from scratch |
| `--collect-only` | off | Only download training data |
| `--train-only` | off | Only train (requires existing CSV) |
```
├── dashboard.py                   # Streamlit dashboard (main entry point)
├── src/
│   ├── data_pipeline.py           # Pipeline + CLI + backtest_100days()
│   ├── bbc_scraper.py             # BBC Weather scraper (London)
│   ├── historical_collector.py    # Open-Meteo ERA5 + NWP collector
│   ├── predictor.py               # Main prediction API
│   ├── forecast_store.py          # Loop 1 — forecast verification store
│   ├── calibrator.py              # Loop 2 — isotonic probability calibrator
│   ├── polymarket_scraper.py      # Polymarket resolved market fetcher
│   ├── data_preprocessing.py      # CSV loading, validation, feature engineering
│   ├── error_analysis.py          # Bias, RMSE, seasonal/lead-time analysis
│   ├── regression_model.py        # Regression model for bias correction
│   ├── monte_carlo.py             # Monte Carlo confidence interval simulator
│   ├── config.py                  # Paths and constants
│   └── utils.py                   # Shared utilities
├── data/
│   └── raw/                       # Training CSV + forecast_store.csv saved here
├── models/                        # Trained model files + calibrator.pkl saved here
├── tests/
│   └── test_components.py         # pytest unit tests
└── requirements.txt
```
| Source | Used for | API key |
|---|---|---|
| BBC Weather | Live London forecasts | Not required |
| Open-Meteo ERA5 | Historical actuals + verification | Not required |
| Open-Meteo Historical Forecast | Past NWP model output | Not required |
| Polymarket Gamma API | Resolved weather market calibration data | Not required |
```bash
pytest tests/
```

- First run: downloading historical data takes 2–4 minutes. Every subsequent run reuses saved files and is near-instant.
- Force retrain: tick in the sidebar Advanced section if you want to re-download data and retrain the model from scratch (e.g. after the model age indicator turns red).
- Forecast length: BBC Weather provides up to 14 days ahead. The dashboard always shows the first 7 days.
- Calibration: the probability calibrator needs ≥10 verified predictions before it activates. Run daily forecasts for 1–2 weeks then click Update Calibrator.
- % Average Standard Error: this is the headline self-improvement metric. Run the 100-day backtest immediately after first training to set a baseline, then compare after each monthly retrain to see it fall.