This system predicts Fantasy Premier League (FPL) player performance using machine learning models trained on historical data. It also selects an optimal 15-player squad using point predictions, following official FPL rules.
The pipeline performs:
- Training position-specific regression models using historical season data.
- Evaluation using standard metrics (MAE, RMSE, R²).
- Inference to predict points for a given season and generate an optimal squad.
- Separate models for each position (GK, DEF, MID, FWD)
- Model selection per position based on validation performance
- Supported models:
- Linear Regression
- Ridge Regression (with cross-validation)
- Lasso Regression (with cross-validation)
- ElasticNet Regression (with cross-validation)
- Player performance metrics:
- Match statistics (minutes, goals, assists, etc.)
- Bonus point indicators (BPS, influence, creativity, threat)
- Form indicators (rolling averages)
- Team and opposition factors:
- Team strength indicators
- Home/away performance
- Opposition difficulty ratings
- Maximizes predicted total points
- Enforces FPL constraints:
- Budget limit (default: £100m)
- Squad composition:
- 2 Goalkeepers
- 5 Defenders
- 5 Midfielders
- 3 Forwards
- Maximum 3 players per team
- Provides detailed team analysis:
- Position-wise breakdown
- Team distribution
- Value distribution
- Python 3.8+
- All dependencies are listed in
requirements.txt
pandas
numpy
scikit-learn
pyarrow
pulp
git clone https://github.com/SanketAdlak/FPLpredictor.git
python -m venv venv
venv\Scripts\activate # On Windows
# OR
source venv/bin/activate # On macOS/Linux
pip install -r requirements.txt
src/
├── data/
│ └── 2021-22/
│ └── 2022-23/
│ └── 2023-24/
│ └── 2024-25/
│ ├── players_raw.csv
│ └── merged_gw.csv
├── models/ # Auto-created during training
├── processed/ # Auto-created feature storage
├── train.py # Train or fine-tune model
├── eval.py # Evaluate model on test season
├── infer.py # Predict points and select squad
├── requirements.txt
└── README.md
python train.py
Saves:
models/*.joblib
(1 per position)models/config.json
processed/features.parquet
- Pretrained models are included in the repository.
- You can skip train.py and directly run eval.py and infer.py
python eval.py
MAE by position:
GK: 1.894
DEF: 2.182
MID: 1.858
FWD: 2.587
ALL: 2.046
MSE by position:
GK: 6.536
DEF: 7.267
MID: 8.156
FWD: 11.766
ALL: 8.053
R² by position:
GK: 0.090
DEF: 0.001
MID: 0.095
FWD: 0.082
ALL: 0.077
python infer.py data/2024-25/merged_gw.csv --players_raw data/2024-25/players_raw.csv --season 2024-25 --out predictions_2024-25.csv
name | team | position | cost_m | predicted_points | Starting |
---|---|---|---|---|---|
Daniel Muñoz | Crystal Palace | DEF | 5.2 | 5.61 | True |
Diogo Dalot Teixeira | Man Utd | DEF | 5.0 | 4.35 | True |
Marc Cucurella | Chelsea | DEF | 5.4 | 4.68 | True |
Nikola Milenković | Nott'm Forest | DEF | 5.1 | 4.78 | True |
Jørgen Strand Larsen | Wolves | FWD | 5.4 | 4.15 | True |
Dean Henderson | Crystal Palace | GK | 4.6 | 4.16 | True |
Bruno Borges Fernandes | Man Utd | MID | 8.6 | 4.77 | True |
Ismaïla Sarr | Crystal Palace | MID | 5.7 | 4.49 | True |
Jarrod Bowen | West Ham | MID | 7.6 | 4.40 | True |
Mohamed Salah | Liverpool | MID | 13.8 | 4.90 | True |
Morgan Rogers | Aston Villa | MID | 5.6 | 4.57 | True |
Jurriën Timber | Arsenal | DEF | 5.6 | 4.29 | False |
Liam Delap | Ipswich | FWD | 5.6 | 3.68 | False |
Yoane Wissa | Brentford | FWD | 6.6 | 3.82 | False |
Matz Sels | Nott'm Forest | GK | 5.1 | 3.97 | False |
This above output is the prediction for Gameweek 32.
Note: For visualizations and plots (e.g., feature importance, prediction vs. actual), you can use the provided data in a Jupyter Notebook.
The notebook environment is ideal for exploring results interactively.
Before running notebook, uncommnet the packages from requirements.txt and install all of them.
Each season folder (e.g., data/2024-25/
) contains:
name | position | team | now_cost | element_type | ... |
---|---|---|---|---|---|
Haaland E. | FWD | MCI | 125 | 4 | ... |
name | round | minutes | goals_scored | assists | ... |
---|---|---|---|---|---|
Haaland E. | 1 | 90 | 1 | 0 | ... |
- Rolling stats from previous games (e.g., 5-game window)
- Match stats: goals, assists, minutes, clean sheets
- Advanced stats: BPS, influence, creativity, threat
- Context features: home/away, opponent strength
-
Max budget: £100.0m
-
Squad size: 15 players
-
Max 3 players per club
-
Valid formation enforced
- 2 GKs, 5 DEFs, 5 MIDs, 3 FWDs
- At least 3 DEF, 2 MID, 1 FWD in starting XI
-
Model Assumptions:
- Past performance indicates future results
- Linear relationships between features
- Independent feature contributions
-
Data Limitations:
- Limited historical data
- Player transfers between seasons
- Team strategy changes
-
External Factors:
- Injuries and suspensions
- Team formation changes
- Weather conditions
- Fixture-aware prediction
- Transfer suggestion engine
- Player form tracking
- Web-based UI