This repository contains the full machine-learning pipeline used in the thesis:
Small-Data Behavioural Aimbot Detection for Online FPS Security: A Server-Side Machine Learning Approach
Luke Pettigrew, TU Dublin, May 2026 [file:40]
The project implements a server-side behavioural aimbot detector for the open‑source FPS AssaultCube v1.2, using a realistically small, hand‑collected dataset and a like‑for‑like comparison of classical and deep models under the same evaluation protocol. [file:40]
The code here covers:
- Parsing AssaultCube demo files into per‑tick telemetry (via a separately maintained parser)
- Feature engineering for behaviour‑only classical models (no raw map coordinates)
- Unified model comparison across Logistic Regression, Random Forest, XGBoost, a Simple Average ensemble, 1D‑CNN, and LSTM
- Final model training and saving
- Three‑match sliding‑window analysis (toggling, full‑cheat, full‑clean)
- Result plots and saved models used in the thesis [file:37][file:38][file:39][file:40]
.
├── README.md # This file
├── .vscode/settings.json
├── data/ # Parsed demos for three-match analysis
│ ├── demo20260309_2147_local_ac_desert3_10min_DM.json
│ ├── demo20260309_2217_local_ac_lainio_10min_DM.json
│ └── ... (additional parsed matches)
├── models/ # Final coordinate‑removed models used in the thesis
│ ├── lr_model.pkl # Logistic Regression (coordinate‑removed)
│ ├── rf_model.pkl # Random Forest
│ ├── xgb_model.pkl # XGBoost
│ ├── cnn_final.pth # 1D‑CNN
│ ├── lstm_final.pth # LSTM
│ ├── lstm_norm.npz # LSTM normalisation mean/std
│ └── scaler.pkl # StandardScaler for classical features
├── legacy models (xyz inclusive)/
│ ├── lr_model.pkl # Legacy models trained with raw x,y,z
│ ├── rf_model.pkl
│ ├── xgb_model.pkl
│ ├── cnn_final.pth
│ ├── lstm_final.pth
│ ├── lstm_norm.npz
│ └── scaler.pkl
├── results/ # Three‑match analysis plots
│ ├── full_cheat_classical.png
│ ├── full_cheat_classical_xyz_inclusive.png
│ ├── full_cheat_deep.png
│ ├── full_clean_classical.png
│ ├── full_clean_classical_before_xyz_removal.png
│ ├── full_clean_deep.png
│ ├── toggling_classical.png
│ ├── toggling_classical_xyz_inclusive.png
│ └── toggling_deep.png
├── scripts/
│ ├── model_comparison.py # Main training / comparison script
│ ├── three_match_analysis.py # Sliding‑window unseen‑match analysis
│ └── utils.py # Parsing helpers, feature engineering, etc.
├── unparsed/ # Raw AssaultCube .dmo demos and parser
│ ├── DmoParser.py
│ ├── demo20260306_2225_local_ac_desert_10min_DM.dmo
│ ├── demo20260306_2245_local_ac_desert3_10min_DM.dmo
│ ├── ... (additional raw demos)
│ └── parseall.sh # UNIX / WSL only helper to batch‑parse demos
├── model_boxplot_f1.png # Cross‑validation boxplot
├── rf_feature_importance.png # Random Forest feature importance (avg over folds)
├── xgb_feature_importance.png
├── toggling_analysis.png # Older/summary toggling plots (if present)
└── toggling_comparison.png # Older/summary comparison plot (if present)
The dataset summary, feature engineering and evaluation protocol match the description in the thesis (106 usable sequences from 29 matches, coordinate‑removed features, 5‑fold match‑level CV). [file:40]
This project uses Python, PyTorch, scikit‑learn, XGBoost, and standard scientific libraries. [file:37][file:39][file:40]
- Clone the repository:
git clone https://github.com/0xLLUUKKEE/aimbot-detector.git
cd aimbot-detector
- Create and activate a virtual environment (recommended):
python -m venv .venv
source .venv/bin/activate # Linux/macOS
# or
.\.venv\Scripts\activate # Windows
- Install dependencies (adjust if your actual file is named differently):
pip install -r requirements.txt
The pipeline operates in two stages:
- Raw .dmo demos recorded from controlled AssaultCube matches live under unparsed/. [file:39][file:40]
- Parsed JSON event files live under data/ and are consumed by the training and analysis scripts. [file:37][file:39][file:40]
- AssaultCube v1.2 free‑for‑all 10‑minute matches.
- Three match types: clean, full‑cheat, and mixed. [file:40]
- Each match is stored as demoYYYYMMDD_HHMM_local_ac_<map>_10min_DM.dmo. DmoParser.py is a modified parser from the separate assaultcube-server-scripts repo; it converts demos into line‑separated Python dict literals. [file:39][file:40]
- Files in data/ are the parsed outputs of DmoParser.py, one position/chat event per line, stored as Python dict literals with fields such as cn, type, x, y, z, yaw, pitch, shooting, and gametime. [file:39][file:40]
- These files are what scripts/model_comparison.py and scripts/three_match_analysis.py actually use. [file:37][file:38][file:39]
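Because each line is described as a Python dict literal, a strict JSON parser may reject it (for example if strings are single-quoted). The following is a minimal sketch of reading such lines, not the repository's own loadevents implementation; the function name parse_event_lines and the sample 'type' value are hypothetical.

```python
import ast


def parse_event_lines(lines):
    """Parse an iterable of text lines, one Python dict literal per line.

    Hypothetical helper for illustration; the repository's loader may differ.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # literal_eval only evaluates literals, so it is safer than eval()
        event = ast.literal_eval(line)
        if not isinstance(event, dict):
            raise ValueError(f"expected a dict literal, got {type(event).__name__}")
        events.append(event)
    return events


# Example with made-up events (field names taken from the list above):
sample = [
    "{'cn': 0, 'type': 'pos', 'yaw': 90.0, 'gametime': 1000}",
    "",
    "{'cn': 1, 'type': 'pos', 'shooting': 1, 'gametime': 1040}",
]
events = parse_event_lines(sample)
```

To read a real file, the same function can be given an open file handle, since iterating over a file yields its lines.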
If you add new .dmo demos under unparsed/, you can parse them into JSON using DmoParser.py. [file:39][file:40]
In the original workflow, parseall.sh was a WSL‑only helper script that looped over .dmo files and called the parser. It is not required for users on Linux/macOS, and is mainly kept as an example of how the original dataset was built. [file:39][file:40]
A typical manual parse command (adjust to your script’s CLI) might look like:
cd unparsed
python DmoParser.py demo20260306_2225_local_ac_desert_10min_DM.dmo > ../data/demo20260306_2225_local_ac_desert_10min_DM.json
If DmoParser.py has a different CLI signature (e.g., input and output arguments), update this section to reflect it.
All feature engineering logic lives in scripts/utils.py. [file:39][file:40]
Key points:
- Per‑tick features (10 dimensions, behaviour‑only):
  - yaw, pitch
  - dx, dy, dz (movement deltas)
  - dyaw, dpitch (wrapped angle deltas)
  - shooting (0/1)
  - minangletoenemy (degrees)
  - aimcorrectiondelta (degrees) [file:39][file:40]
- No raw x,y,z coordinates are passed into the final models to avoid map‑position bias; they are only used internally to derive movement and aim‑correction features. [file:39][file:40]
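The "wrapped angle deltas" above matter because yaw crosses the 0/360 boundary: a turn from 350° to 10° is a 20° movement, not a -340° one. A minimal sketch of one common way to wrap deltas into [-180, 180) follows. It assumes angles in degrees; the function name is hypothetical and the actual code in scripts/utils.py may differ (for example in how the first tick is handled).

```python
import numpy as np


def wrapped_angle_delta(angles_deg):
    """Per-tick angle change in degrees, wrapped into [-180, 180).

    Returns an array one element shorter than the input (assumption:
    no padding for the first tick).
    """
    angles = np.asarray(angles_deg, dtype=float)
    raw = np.diff(angles)
    # Shift by 180, wrap into [0, 360), then shift back
    return (raw + 180.0) % 360.0 - 180.0


# Crossing the 0/360 boundary in both directions
deltas = wrapped_angle_delta([350.0, 10.0, 350.0])
```

Here deltas holds 20 and -20 rather than -340 and 340.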
For classical models, each variable‑length sequence is summarised into a 77‑dimensional vector built from:
- 7 statistics (mean, std, min, max, median, p5, p95) for each per‑tick feature, plus
- 7 behavioural summary features (aim‑change autocorrelations, fraction of time shooting, number of shooting bursts, average speed, max aim‑correction, fraction of time on‑target). [file:39][file:40]
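The statistical part of that vector can be sketched as below: 7 statistics for each of the 10 per-tick features gives 70 values, and the 7 behavioural summaries (not shown) bring the total to 77. This is an illustrative sketch only; the function name and the feature-major ordering of the output are assumptions, not the layout used in scripts/utils.py.

```python
import numpy as np


def summarise_sequence(seq):
    """Summarise a (T, n_features) sequence into 7 statistics per feature.

    Output order (assumed): for each feature, mean, std, min, max,
    median, 5th percentile, 95th percentile.
    """
    seq = np.asarray(seq, dtype=float)
    stats = [
        seq.mean(axis=0),
        seq.std(axis=0),
        seq.min(axis=0),
        seq.max(axis=0),
        np.median(seq, axis=0),
        np.percentile(seq, 5, axis=0),
        np.percentile(seq, 95, axis=0),
    ]
    # Stack to (7, n_features), transpose to (n_features, 7), then flatten
    return np.stack(stats, axis=0).T.reshape(-1)


# Two ticks of 10 made-up feature values
demo_seq = np.arange(20, dtype=float).reshape(2, 10)
demo_vec = summarise_sequence(demo_seq)
```

Because the summary has a fixed length regardless of T, sequences of different durations can be fed to the same classical model.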
Deep models (1D‑CNN and LSTM) consume the raw per‑tick sequences (10 channels), with per‑fold normalisation for training and a global normalisation stored in models/lstm_norm.npz for the final LSTM. [file:37][file:39][file:40]
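A minimal sketch of the two preprocessing steps for the deep models follows: padding or cropping a sequence to a fixed length for the 1D‑CNN, and normalising with a stored mean and standard deviation for the LSTM. Zero-padding at the end, keeping the first ticks when cropping, and the small eps term are all assumptions; the helper names are hypothetical, and the key names inside models/lstm_norm.npz are not documented here, so loading that file is left out.

```python
import numpy as np

FIXEDLEN = 2000  # value stated in this README


def pad_or_crop(seq, fixed_len=FIXEDLEN):
    """Turn a (T, C) sequence into a channels-first (C, fixed_len) array.

    Assumption: zero-pad at the end, or keep only the first fixed_len ticks.
    """
    seq = np.asarray(seq, dtype=np.float32)
    out = np.zeros((seq.shape[1], fixed_len), dtype=np.float32)
    t = min(seq.shape[0], fixed_len)
    out[:, :t] = seq[:t].T
    return out


def normalise(seq, mean, std, eps=1e-8):
    """Standardise each channel with a precomputed mean and std."""
    seq = np.asarray(seq, dtype=np.float32)
    return (seq - mean) / (std + eps)


short = pad_or_crop(np.ones((3, 10)))     # padded with zeros
long = pad_or_crop(np.ones((2500, 10)))   # cropped to the first 2000 ticks
```

For per-fold normalisation, mean and std would be computed from the training fold only, so that validation data does not leak into the statistics.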
The main script for training and evaluating all models is:
python scripts/model_comparison.py
This script:
- Loads and labels sequences
  - Defines a DEMOFILES mapping of demo JSON paths to client numbers and cheater flags (clean, full‑cheat, mixed). [file:37]
  - Uses loadevents, buildplayertimelines, and buildfullsequence from utils.py to turn per‑tick events into per‑player sequences and labels. [file:37][file:39]
- Builds feature representations
  - X_stat: 77‑dimensional statistical features for classical models.
  - X_raw: variable‑length 10‑dimensional sequences for the LSTM.
  - X_cnn: fixed‑length len(FEATURENAMES) × FIXEDLEN tensors for the 1D‑CNN, where FIXEDLEN = 2000. [file:37][file:39]
- Performs 5‑fold match‑level cross‑validation
  - Uses StratifiedKFold on match‑level labels to ensure no match is split between train and validation. [file:37][file:40]
  - Trains:
    - Logistic Regression
    - Random Forest (small grid search over n_estimators and max_depth)
    - XGBoost (small grid search)
    - Simple Average ensemble (mean of the three classical probabilities)
    - 1D‑CNN
    - LSTM [file:37]
- Reports metrics and plots
  - Logs accuracy, precision, recall, F1 per fold and model.
  - Prints a LaTeX‑ready table of mean ± standard deviation across folds.
  - Saves model_boxplot_f1.png as a boxplot of F1 scores across folds.
  - Saves averaged feature importance plots:
    - rf_feature_importance.png
    - xgb_feature_importance.png [file:37]
- Retrains final models on all data and saves them
  - Fits StandardScaler on all statistical features and saves to models/scaler.pkl.
  - Trains and saves:
    - models/lr_model.pkl
    - models/rf_model.pkl
    - models/xgb_model.pkl
    - models/cnn_final.pth
    - models/lstm_final.pth
    - models/lstm_norm.npz (global mean/std for LSTM inputs) [file:37]
These final models are the coordinate‑removed versions that align with the thesis’ final comparison tables. [file:37][file:40]
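The match-level cross-validation used by the script can be sketched as follows: the folds are drawn over matches, and each sequence then follows its match into the train or validation side, so no match contributes to both. This is a sketch under assumptions, not the code in scripts/model_comparison.py: the function name, the seed, and the idea of one stratification label per match are hypothetical.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold


def match_level_folds(seq_match_ids, match_ids, match_labels, n_splits=5, seed=42):
    """Yield (train_idx, val_idx) over sequences, split at match level.

    seq_match_ids: match id of every sequence
    match_ids:     unique match ids
    match_labels:  one stratification label per match (assumed to exist)
    """
    seq_match_ids = np.asarray(seq_match_ids)
    match_ids = np.asarray(match_ids)
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Only the number of matches matters for X, so a dummy array is passed
    dummy = np.zeros(len(match_ids))
    for train_m, val_m in skf.split(dummy, match_labels):
        train_idx = np.flatnonzero(np.isin(seq_match_ids, match_ids[train_m]))
        val_idx = np.flatnonzero(np.isin(seq_match_ids, match_ids[val_m]))
        yield train_idx, val_idx


# Toy example: 10 matches with alternating labels, 2 sequences per match
toy_match_ids = list(range(10))
toy_match_labels = [0, 1] * 5
toy_seq_match_ids = [m for m in toy_match_ids for _ in range(2)]
toy_folds = list(match_level_folds(toy_seq_match_ids, toy_match_ids, toy_match_labels))
```

Splitting at sequence level instead would let sequences from the same match appear on both sides of a fold, which tends to inflate validation scores.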
The unseen‑match analysis used in the thesis (toggling, full‑cheat, full‑clean) is implemented in:
python scripts/three_match_analysis.py
This script:
- Defines the three matches in the MATCHES list: [file:38]
  - One toggling match: data/demo20260510_1926_local_ac_scaffold_10min_DM.json, cn=2
  - One full‑cheat match: data/demo20260322_2136_local_ac_scaffold_10min_DM.json, cn=0
  - One full‑clean match: data/demo20260309_2147_local_ac_desert3_10min_DM.json, cn=0
- Loads all saved models
  - models/scaler.pkl
  - models/lr_model.pkl
  - models/rf_model.pkl
  - models/xgb_model.pkl
  - models/cnn_final.pth
  - models/lstm_final.pth
  - models/lstm_norm.npz [file:38]
- Runs sliding‑window inference
  - Uses 30‑second windows with a 5‑second stride (WINDOW_MS = 30000, STRIDE_MS = 5000). [file:38][file:40]
  - For each window:
    - Classical models: compute statistical features, scale, predict probabilities.
    - Deep models: build raw sequence slices, pad/crop for CNN (FIXEDLEN = 2000), normalise for LSTM using the stored mean/std. [file:38][file:39]
- Saves plots in results/
  - Classical models:
    - results/toggling_classical.png
    - results/full_cheat_classical.png
    - results/full_clean_classical.png
  - Deep models:
    - results/toggling_deep.png
    - results/full_cheat_deep.png
    - results/full_clean_deep.png [file:38]
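The windowing step above can be sketched as follows. The constants come from this README; everything else is an assumption made for illustration: the helper names are hypothetical, gametime is taken to be in milliseconds, and only full windows are kept, which may not match how scripts/three_match_analysis.py treats the end of a match.

```python
WINDOW_MS = 30000  # 30-second window
STRIDE_MS = 5000   # 5-second stride


def window_bounds(start_ms, end_ms, window_ms=WINDOW_MS, stride_ms=STRIDE_MS):
    """Return (lo, hi) pairs for every full window inside [start_ms, end_ms]."""
    bounds = []
    t = start_ms
    while t + window_ms <= end_ms:
        bounds.append((t, t + window_ms))
        t += stride_ms
    return bounds


def slice_events(events, lo, hi):
    """Select events whose gametime falls in the half-open range [lo, hi)."""
    return [e for e in events if lo <= e["gametime"] < hi]


# A 60-second span yields windows starting at 0, 5000, ..., 30000
demo_bounds = window_bounds(0, 60000)
demo_events = [{"gametime": g} for g in range(0, 60000, 1000)]
first_window = slice_events(demo_events, *demo_bounds[0])
```

Each slice would then be passed through the same feature pipeline as a training sequence before being scored by the saved models.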
The repository also keeps legacy XYZ‑inclusive plots for comparison:
results/*_xyz_inclusive.png show how including raw coordinates made the models more aggressive but less defensible, especially via ymed dominance. [file:40]
The folder:
legacy models (xyz inclusive)/
contains models trained with the earlier coordinate‑inclusive feature set (raw x,y,z summaries included). These are retained for reproducibility and ablation comparison, but are not recommended for deployment because feature importance analysis shows they rely heavily on map‑position artefacts (especially ymed). [file:37][file:40]
For any future work or integration, use the models under models/ (coordinate‑removed) instead.
To reproduce the main results from the thesis:
- Ensure all parsed JSON files are present under data/ as listed in DEMOFILES inside scripts/model_comparison.py. [file:37]
- Run model comparison and training:
  python scripts/model_comparison.py
  This will:
  - Reproduce 5‑fold match‑level cross‑validation.
  - Print the cross‑validation summary to stdout and save model_boxplot_f1.png.
  - Regenerate the final models in models/. [file:37][file:40]
- Run three‑match analysis:
  python scripts/three_match_analysis.py
  This will:
  - Generate sliding‑window probability curves for the toggling, full‑cheat, and full‑clean matches.
  - Save plots to results/. [file:38][file:40]
The expected coordinate‑removed cross‑validation summary (Table 5.2 in the thesis) is: [file:40]
| Model | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|
| Logistic Regression | 0.871 | 0.872 | 0.880 | 0.872 |
| Random Forest | 0.855 | 0.900 | 0.800 | 0.842 |
| XGBoost | 0.834 | 0.851 | 0.823 | 0.831 |
| Simple Average | 0.862 | 0.869 | 0.860 | 0.861 |
| 1D‑CNN | 0.513 | 0.106 | 0.180 | 0.133 |
| LSTM | 0.547 | 0.535 | 0.600 | 0.558 |
This project is intended for:
- Academic research and teaching on behavioural anti‑cheat.
- Small‑data ML experimentation in games.
- Reproducible comparison of classical vs deep models on the same FPS dataset. [file:40]
It is not a production‑ready anti‑cheat or an automatic ban system. The detector is best framed as a human‑review triage layer that flags suspicious behaviour for manual investigation. [file:40]
Key limitations:
- Small, controlled dataset (29 matches, 106 sequences, one game, one cheat type).
- Participants and maps are limited.
- The aimbot is a modified open‑source cheat without advanced anti‑detection tricks. [file:40]
If you use this repository or build on this work, please cite the associated thesis:
@thesis{pettigrew2026aimbot,
author = {Luke Pettigrew},
title = {Small-Data Behavioural Aimbot Detection for Online FPS Security:
A Server-Side Machine Learning Approach},
school = {TU Dublin},
year = {2026}
}
- Modified AssaultCube demo parser: see the assaultcube-server-scripts repository referenced in Appendix A of the thesis. [file:40]
- Modified A200K AssaultCube cheat (aimbot‑only) used for controlled data collection: see the AssaultCubeHack repository referenced in the thesis. [file:40]