0xLLUUKKEE/aimbot-detector

Aimbot Detector – AssaultCube Behavioural ML

TODO: clean up this repo

This repository contains the full machine-learning pipeline used in the thesis:

Small-Data Behavioural Aimbot Detection for Online FPS Security: A Server-Side Machine Learning Approach
Luke Pettigrew, TU Dublin, May 2026 [file:40]

The project implements a server-side behavioural aimbot detector for the open‑source FPS AssaultCube v1.2, using a realistically small, hand‑collected dataset and a like‑for‑like comparison of classical and deep models under the same evaluation protocol. [file:40]

The code here covers:

  • Parsing AssaultCube demo files into per‑tick telemetry (via a separately maintained parser)
  • Feature engineering for behaviour‑only classical models (no raw map coordinates)
  • Unified model comparison across Logistic Regression, Random Forest, XGBoost, a Simple Average ensemble, 1D‑CNN, and LSTM
  • Final model training and saving
  • Three‑match sliding‑window analysis (toggling, full‑cheat, full‑clean)
  • Result plots and saved models used in the thesis [file:37][file:38][file:39][file:40]

Repository layout

.
├── README.md                 # This file
├── .vscode/settings.json
├── data/                     # Parsed demos for three-match analysis
│   ├── demo20260309_2147_local_ac_desert3_10min_DM.json
│   ├── demo20260309_2217_local_ac_lainio_10min_DM.json
│   └── ... (additional parsed matches)
├── models/                   # Final coordinate‑removed models used in the thesis
│   ├── lr_model.pkl          # Logistic Regression (coordinate‑removed)
│   ├── rf_model.pkl          # Random Forest
│   ├── xgb_model.pkl         # XGBoost
│   ├── cnn_final.pth         # 1D‑CNN
│   ├── lstm_final.pth        # LSTM
│   ├── lstm_norm.npz         # LSTM normalisation mean/std
│   └── scaler.pkl            # StandardScaler for classical features
├── legacy models (xyz inclusive)/
│   ├── lr_model.pkl          # Legacy models trained with raw x,y,z
│   ├── rf_model.pkl
│   ├── xgb_model.pkl
│   ├── cnn_final.pth
│   ├── lstm_final.pth
│   ├── lstm_norm.npz
│   └── scaler.pkl
├── results/                  # Three‑match analysis plots
│   ├── full_cheat_classical.png
│   ├── full_cheat_classical_xyz_inclusive.png
│   ├── full_cheat_deep.png
│   ├── full_clean_classical.png
│   ├── full_clean_classical_before_xyz_removal.png
│   ├── full_clean_deep.png
│   ├── toggling_classical.png
│   ├── toggling_classical_xyz_inclusive.png
│   └── toggling_deep.png
├── scripts/
│   ├── model_comparison.py   # Main training / comparison script
│   ├── three_match_analysis.py  # Sliding‑window unseen‑match analysis
│   └── utils.py              # Parsing helpers, feature engineering, etc.
├── unparsed/                 # Raw AssaultCube .dmo demos and parser
│   ├── DmoParser.py
│   ├── demo20260306_2225_local_ac_desert_10min_DM.dmo
│   ├── demo20260306_2245_local_ac_desert3_10min_DM.dmo
│   ├── ... (additional raw demos)
│   └── parseall.sh           # UNIX / WSL only helper to batch‑parse demos
├── model_boxplot_f1.png      # Cross‑validation boxplot
├── rf_feature_importance.png # Random Forest feature importance (avg over folds)
├── xgb_feature_importance.png
├── toggling_analysis.png     # Older/summary toggling plots (if present)
└── toggling_comparison.png   # Older/summary comparison plot (if present)

The dataset summary, feature engineering and evaluation protocol match the description in the thesis (106 usable sequences from 29 matches, coordinate‑removed features, 5‑fold match‑level CV). [file:40]


Installation

This project uses Python, PyTorch, scikit‑learn, XGBoost, and standard scientific libraries. [file:37][file:39][file:40]

  1. Clone the repository:

git clone https://github.com/0xLLUUKKEE/aimbot-detector.git
cd aimbot-detector

  2. Create and activate a virtual environment (recommended):

python -m venv .venv
source .venv/bin/activate          # Linux/macOS
# or
.\.venv\Scripts\activate           # Windows

  3. Install dependencies (adjust the filename if your requirements file is named differently):

pip install -r requirements.txt

Data

The pipeline operates in two stages:

  1. Raw .dmo demos recorded from controlled AssaultCube matches live under unparsed/. [file:39][file:40]
  2. Parsed JSON event files live under data/ and are consumed by the training and analysis scripts. [file:37][file:39][file:40]

Raw demos (unparsed/)

  • AssaultCube v1.2 free‑for‑all 10‑minute matches.
  • Three match types: clean, full‑cheat, and mixed. [file:40]
  • Each match is stored as demoYYYYMMDD_HHMM_local_ac_<map>_10min_DM.dmo.
  • DmoParser.py is a modified parser from the separate assaultcube-server-scripts repo; it converts demos into line‑separated Python dict literals. [file:39][file:40]

Parsed events (data/)

  • Files in data/ are the parsed outputs of DmoParser.py, one position/chat event per line, stored as Python dict literals with fields such as cn, type, x, y, z, yaw, pitch, shooting, and gametime. [file:39][file:40]
  • These files are what scripts/model_comparison.py and scripts/three_match_analysis.py actually use. [file:37][file:38][file:39]
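
Because each line is a Python dict literal rather than strict JSON, a reader cannot rely on a JSON parser. The following is a minimal sketch of one way to read such lines safely; the function name parse_event_lines and the sample values (including the 'type' strings) are hypothetical and are not taken from scripts/utils.py.

```python
import ast


def parse_event_lines(lines):
    """Parse an iterable of text lines holding one Python dict literal each."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # literal_eval only accepts literals, so it does not execute code
        event = ast.literal_eval(line)
        if isinstance(event, dict):
            events.append(event)
    return events


# Two invented lines in the documented shape (all values are made up).
sample_lines = [
    "{'cn': 0, 'type': 'pos', 'x': 120.5, 'y': 88.0, 'z': 4.0, "
    "'yaw': 270.0, 'pitch': -3.5, 'shooting': 1, 'gametime': 15000}",
    "",
    "{'cn': 2, 'type': 'pos', 'x': 60.0, 'y': 40.0, 'z': 4.0, "
    "'yaw': 10.0, 'pitch': 0.0, 'shooting': 0, 'gametime': 15040}",
]
events = parse_event_lines(sample_lines)
print(len(events), events[0]["cn"], events[0]["gametime"])
```

For a real file, passing an open file handle (for example `open(path)`) as `lines` would work the same way, since file objects iterate line by line.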

Parsing demos (optional)

If you add new .dmo demos under unparsed/, you can parse them with DmoParser.py into the .json event files that the scripts read from data/. [file:39][file:40]

In the original workflow, parseall.sh was a shell helper (UNIX/WSL only) that looped over the .dmo files and called the parser. It is not required, and is kept mainly as an example of how the original dataset was built. [file:39][file:40]

A typical manual parse command (adjust to your script’s CLI) might look like:

cd unparsed
python DmoParser.py demo20260306_2225_local_ac_desert_10min_DM.dmo > ../data/demo20260306_2225_local_ac_desert_10min_DM.json

If DmoParser.py has a different CLI signature (e.g., input and output arguments), update this section to reflect it.
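
For users without a UNIX shell, the batch step could also be done from Python. The sketch below assumes the same unverified CLI as the manual command above (demo path as the only argument, events written to stdout); plan_parse_jobs and run_parse_jobs are hypothetical helpers, not part of this repository.

```python
import subprocess
import sys
from pathlib import Path


def plan_parse_jobs(unparsed_dir, data_dir):
    """Return (command, output_path) pairs, one per .dmo file found."""
    unparsed_dir, data_dir = Path(unparsed_dir), Path(data_dir)
    jobs = []
    for dmo in sorted(unparsed_dir.glob("*.dmo")):
        # Assumed CLI: `python DmoParser.py <demo>` prints events to stdout.
        cmd = [sys.executable, "DmoParser.py", dmo.name]
        jobs.append((cmd, data_dir / (dmo.stem + ".json")))
    return jobs


def run_parse_jobs(jobs, cwd):
    """Run each job from `cwd`, redirecting parser stdout to the output file."""
    for cmd, out_path in jobs:
        out_path.parent.mkdir(parents=True, exist_ok=True)
        with open(out_path, "w") as out:
            subprocess.run(cmd, cwd=cwd, stdout=out, check=True)
```

From the repository root, `run_parse_jobs(plan_parse_jobs("unparsed", "data"), cwd="unparsed")` would mirror the manual command for every demo, provided the assumed CLI matches the real one.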


Feature representation

All feature engineering logic lives in scripts/utils.py. [file:39][file:40]

Key points:

  • Per‑tick features (10 dimensions, behaviour‑only):
    • yaw, pitch
    • dx, dy, dz (movement deltas)
    • dyaw, dpitch (wrapped angle deltas)
    • shooting (0/1)
    • minangletoenemy (degrees)
    • aimcorrectiondelta (degrees) [file:39][file:40]
  • No raw x,y,z coordinates are passed into the final models to avoid map‑position bias; they are only used internally to derive movement and aim‑correction features. [file:39][file:40]
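
To illustrate what "wrapped" means for dyaw and dpitch: a raw yaw difference of -340 degrees (from 350 to 10) is really a 20-degree turn. The sketch below shows one common way to wrap deltas into [-180, 180) and to derive movement deltas from positions. It is an illustration under assumptions, not the code in scripts/utils.py; the function names and the choice to give the first tick a delta of 0 are hypothetical.

```python
import numpy as np


def wrap_angle_delta(delta_deg):
    """Map angle differences in degrees into the interval [-180, 180)."""
    return (np.asarray(delta_deg, dtype=float) + 180.0) % 360.0 - 180.0


def per_tick_deltas(x, y, z, yaw, pitch):
    """Return dx, dy, dz, dyaw, dpitch; the first tick gets a delta of 0."""
    x, y, z, yaw, pitch = (np.asarray(a, dtype=float) for a in (x, y, z, yaw, pitch))
    # Positions are only differenced here, so no absolute coordinate survives.
    dx = np.diff(x, prepend=x[0])
    dy = np.diff(y, prepend=y[0])
    dz = np.diff(z, prepend=z[0])
    dyaw = wrap_angle_delta(np.diff(yaw, prepend=yaw[0]))
    dpitch = wrap_angle_delta(np.diff(pitch, prepend=pitch[0]))
    return dx, dy, dz, dyaw, dpitch


dx, dy, dz, dyaw, dpitch = per_tick_deltas(
    x=[0.0, 1.0, 3.0], y=[5.0, 5.0, 4.0], z=[1.0, 1.0, 1.0],
    yaw=[350.0, 10.0, 5.0], pitch=[0.0, -2.0, -1.0],
)
print(dyaw)
```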

For classical models, each variable‑length sequence is summarised into a 77‑dimensional vector built from:

  • 7 statistics (mean, std, min, max, median, p5, p95) for each per‑tick feature, plus
  • 7 behavioural summary features (aim‑change autocorrelations, fraction of time shooting, number of shooting bursts, average speed, max aim‑correction, fraction of time on‑target). [file:39][file:40]
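
The dimensionality follows from 10 per-tick features × 7 statistics = 70, plus 7 behavioural summaries = 77. The sketch below computes only the 70 statistical values for a toy sequence. The ordering of the output (all seven statistics for one feature, then the next feature) is an assumption, and the behavioural summaries are deliberately not reproduced because their exact definitions live in scripts/utils.py.

```python
import numpy as np

N_TICK_FEATURES = 10   # per-tick features listed above
N_BEHAVIOURAL = 7      # behavioural summaries (not computed in this sketch)


def summary_stats(seq):
    """Turn a (T, 10) sequence into a (70,) vector of 7 stats per feature."""
    seq = np.asarray(seq, dtype=float)
    stats = [
        seq.mean(axis=0),
        seq.std(axis=0),
        seq.min(axis=0),
        seq.max(axis=0),
        np.median(seq, axis=0),
        np.percentile(seq, 5, axis=0),
        np.percentile(seq, 95, axis=0),
    ]
    # Stacked shape is (7, 10); transpose so each feature's stats are contiguous.
    return np.stack(stats).T.reshape(-1)


rng = np.random.default_rng(0)
seq = rng.normal(size=(500, N_TICK_FEATURES))   # random stand-in data
stat_vec = summary_stats(seq)
total_dims = stat_vec.size + N_BEHAVIOURAL
print(stat_vec.shape, total_dims)
```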

Deep models (1D‑CNN and LSTM) consume the raw per‑tick sequences (10 channels), with per‑fold normalisation for training and a global normalisation stored in models/lstm_norm.npz for the final LSTM. [file:37][file:39][file:40]
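
As a rough illustration of the normalisation step, the sketch below fits a per-feature mean and standard deviation over all ticks of the training sequences, stores them in .npz form, and applies them to a sequence. The key names mean and std, the per-feature (rather than per-sequence) statistics, and the helper names are assumptions; the real layout of models/lstm_norm.npz may differ.

```python
import io

import numpy as np


def fit_norm(sequences):
    """Per-feature mean/std over every tick of every training sequence."""
    all_ticks = np.concatenate(sequences, axis=0)
    mean = all_ticks.mean(axis=0)
    std = all_ticks.std(axis=0)
    std[std == 0] = 1.0   # avoid division by zero for constant features
    return mean, std


def apply_norm(seq, mean, std):
    """Standardise a (T, C) sequence with stored statistics."""
    return (np.asarray(seq, dtype=float) - mean) / std


rng = np.random.default_rng(0)
train = [rng.normal(5.0, 2.0, size=(n, 10)) for n in (300, 450, 120)]
mean, std = fit_norm(train)

# Round-trip through the .npz format (in memory here; a file path also works).
buf = io.BytesIO()
np.savez(buf, mean=mean, std=std)
buf.seek(0)
stored = np.load(buf)
normed = apply_norm(train[0], stored["mean"], stored["std"])
print(normed.shape)
```

In cross-validation, the same fit would be done on the training folds only, so that validation sequences never influence the statistics.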


Model comparison (main entry point)

The main script for training and evaluating all models is:

python scripts/model_comparison.py

This script:

  1. Loads and labels sequences

    • Defines a DEMOFILES mapping of demo JSON paths to client numbers and cheater flags (clean, full‑cheat, mixed). [file:37]
    • Uses loadevents, buildplayertimelines, and buildfullsequence from utils.py to turn per‑tick events into per‑player sequences and labels. [file:37][file:39]
  2. Builds feature representations

    • X_stat: 77‑dimensional statistical features for classical models.
    • X_raw: variable‑length 10‑dimensional sequences for the LSTM.
    • X_cnn: fixed‑length len(FEATURENAMES) × FIXEDLEN tensors for the 1D‑CNN, where FIXEDLEN = 2000. [file:37][file:39]
  3. Performs 5‑fold match‑level cross‑validation

    • Uses StratifiedKFold over matches (stratified on match‑level labels), so that every sequence from a given match lands on the same side of the train/validation split. [file:37][file:40]
    • Trains:
      • Logistic Regression
      • Random Forest (small grid search over n_estimators and max_depth)
      • XGBoost (small grid search)
      • Simple Average ensemble (mean of the three classical probabilities)
      • 1D‑CNN
      • LSTM [file:37]
  4. Reports metrics and plots

    • Logs accuracy, precision, recall, F1 per fold and model.
    • Prints a LaTeX‑ready table of mean ± standard deviation across folds.
    • Saves model_boxplot_f1.png as a boxplot of F1 scores across folds.
    • Saves averaged feature importance plots:
      • rf_feature_importance.png
      • xgb_feature_importance.png [file:37]
  5. Retrains final models on all data and saves them

    • Fits StandardScaler on all statistical features and saves to models/scaler.pkl.
    • Trains and saves:
      • models/lr_model.pkl
      • models/rf_model.pkl
      • models/xgb_model.pkl
      • models/cnn_final.pth
      • models/lstm_final.pth
      • models/lstm_norm.npz (global mean/std for LSTM inputs) [file:37]

These final models are the coordinate‑removed versions that align with the thesis’ final comparison tables. [file:37][file:40]
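
The match-level splitting in step 3 can be sketched as follows: folds are drawn over matches, then expanded to the sequences belonging to those matches. The toy data, the variable names, and the idea of one binary label per match are assumptions for illustration; scripts/model_comparison.py may organise this differently.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 10 matches with one label each,
# and 3 sequences per match (30 sequences in total).
match_ids = np.arange(10)
match_labels = np.array([0, 1] * 5)
seq_match = np.repeat(match_ids, 3)   # match id of every sequence

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
folds = []
for train_m, val_m in skf.split(match_ids, match_labels):
    # Expand match-level indices to sequence-level indices.
    train_idx = np.flatnonzero(np.isin(seq_match, match_ids[train_m]))
    val_idx = np.flatnonzero(np.isin(seq_match, match_ids[val_m]))
    folds.append((train_idx, val_idx))

for train_idx, val_idx in folds:
    shared = set(seq_match[train_idx]) & set(seq_match[val_idx])
    print(len(train_idx), len(val_idx), "shared matches:", len(shared))
```

Splitting sequences directly would let two players from the same match end up on opposite sides of the split, which is the leakage the match-level protocol is meant to avoid.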


Three‑match sliding‑window analysis

The unseen‑match analysis used in the thesis (toggling, full‑cheat, full‑clean) is implemented in:

python scripts/three_match_analysis.py

This script:

  1. Defines the three matches

    In the MATCHES list: [file:38]

    • One toggling match:
      • data/demo20260510_1926_local_ac_scaffold_10min_DM.json, cn=2
    • One full‑cheat match:
      • data/demo20260322_2136_local_ac_scaffold_10min_DM.json, cn=0
    • One full‑clean match:
      • data/demo20260309_2147_local_ac_desert3_10min_DM.json, cn=0
  2. Loads all saved models

    • models/scaler.pkl
    • models/lr_model.pkl
    • models/rf_model.pkl
    • models/xgb_model.pkl
    • models/cnn_final.pth
    • models/lstm_final.pth
    • models/lstm_norm.npz [file:38]
  3. Runs sliding‑window inference

    • Uses 30‑second windows with a 5‑second stride (WINDOW_MS = 30000, STRIDE_MS = 5000). [file:38][file:40]
    • For each window:
      • Classical models: compute statistical features, scale, predict probabilities.
      • Deep models: build raw sequence slices, pad/crop for CNN (FIXEDLEN = 2000), normalise for LSTM using the stored mean/std. [file:38][file:39]
  4. Saves plots in results/

    • Classical models:
      • results/toggling_classical.png
      • results/full_cheat_classical.png
      • results/full_clean_classical.png
    • Deep models:
      • results/toggling_deep.png
      • results/full_cheat_deep.png
      • results/full_clean_deep.png [file:38]
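
The windowing in step 3 can be sketched as below: generate 30-second windows every 5 seconds over gametime, slice the per-tick sequence, and pad or crop each slice to FIXEDLEN for the CNN. The constants come from this README; the helper names, the 40 ms tick spacing of the toy data, zero-padding at the end, and cropping from the start are assumptions rather than the behaviour of scripts/three_match_analysis.py.

```python
import numpy as np

WINDOW_MS = 30000
STRIDE_MS = 5000
FIXEDLEN = 2000


def window_bounds(gametime, window_ms=WINDOW_MS, stride_ms=STRIDE_MS):
    """Yield (start_ms, end_ms) for every full window inside the sequence."""
    start, last = int(gametime[0]), int(gametime[-1])
    while start + window_ms <= last:
        yield start, start + window_ms
        start += stride_ms


def slice_window(seq, gametime, start_ms, end_ms):
    """Select the ticks whose gametime falls in [start_ms, end_ms)."""
    mask = (gametime >= start_ms) & (gametime < end_ms)
    return seq[mask]


def pad_or_crop(window, fixed_len=FIXEDLEN):
    """(T, C) -> (C, fixed_len): crop long windows, zero-pad short ones."""
    out = np.zeros((window.shape[1], fixed_len), dtype=np.float32)
    t = min(len(window), fixed_len)
    out[:, :t] = window[:t].T
    return out


# Toy sequence: 60 s of ticks, one every 40 ms (spacing is made up).
gametime = np.arange(0, 60000, 40)
seq = np.random.default_rng(0).normal(size=(len(gametime), 10))

bounds = list(window_bounds(gametime))
first_window = slice_window(seq, gametime, *bounds[0])
cnn_inputs = [pad_or_crop(slice_window(seq, gametime, s, e)) for s, e in bounds]
print(len(bounds), first_window.shape, cnn_inputs[0].shape)
```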

The repository also keeps legacy XYZ‑inclusive plots for comparison:

  • results/*_xyz_inclusive.png show how including raw coordinates made the models more aggressive but less defensible, largely because the coordinate summary feature ymed dominated feature importance. [file:40]

Legacy XYZ‑inclusive models

The folder:

legacy models (xyz inclusive)/

contains models trained with the earlier coordinate‑inclusive feature set (raw x,y,z summaries included). These are retained for reproducibility and ablation comparison, but are not recommended for deployment because feature importance analysis shows they rely heavily on map‑position artefacts (especially ymed). [file:37][file:40]

For any future work or integration, use the models under models/ (coordinate‑removed) instead.


Reproducing the thesis results

To reproduce the main results from the thesis:

  1. Ensure all parsed JSON files are present under data/ as listed in DEMOFILES inside scripts/model_comparison.py. [file:37]

  2. Run model comparison and training:

python scripts/model_comparison.py

This will:

  • Reproduce 5‑fold match‑level cross‑validation.
  • Print the cross‑validation summary to stdout and save model_boxplot_f1.png.
  • Regenerate the final models in models/. [file:37][file:40]
  3. Run three‑match analysis:

python scripts/three_match_analysis.py

This will:

  • Generate sliding‑window probability curves for the toggling, full‑cheat, and full‑clean matches.
  • Save plots to results/. [file:38][file:40]

The expected coordinate‑removed cross‑validation summary (Table 5.2 in the thesis) is: [file:40]

Model                 Accuracy  Precision  Recall  F1
Logistic Regression   0.871     0.872      0.880   0.872
Random Forest         0.855     0.900      0.800   0.842
XGBoost               0.834     0.851      0.823   0.831
Simple Average        0.862     0.869      0.860   0.861
1D‑CNN                0.513     0.106      0.180   0.133
LSTM                  0.547     0.535      0.600   0.558

Intended use and limitations

This project is intended for:

  • Academic research and teaching on behavioural anti‑cheat.
  • Small‑data ML experimentation in games.
  • Reproducible comparison of classical vs deep models on the same FPS dataset. [file:40]

It is not a production‑ready anti‑cheat or an automatic ban system. The detector is best framed as a human‑review triage layer that flags suspicious behaviour for manual investigation. [file:40]

Key limitations:

  • Small, controlled dataset (29 matches, 106 sequences, one game, one cheat type).
  • Participants and maps are limited.
  • The aimbot is a modified open‑source cheat without advanced anti‑detection tricks. [file:40]

Citation

If you use this repository or build on this work, please cite the associated thesis:

@thesis{pettigrew2026aimbot,
  author  = {Luke Pettigrew},
  title   = {Small-Data Behavioural Aimbot Detection for Online FPS Security:
             A Server-Side Machine Learning Approach},
  school  = {TU Dublin},
  year    = {2026}
}

Acknowledgements

  • Modified AssaultCube demo parser: see the assaultcube-server-scripts repository referenced in Appendix A of the thesis. [file:40]
  • Modified A200K AssaultCube cheat (aimbot‑only) used for controlled data collection: see the AssaultCubeHack repository referenced in the thesis. [file:40]

About

A server‑side behavioural aimbot detector for AssaultCube. Intended as a research/learning repo, not a production anti‑cheat.
