
AIris Security — Machine Learning Module

Hybrid risk scoring engine combining an XGBoost attack predictor, an NLP payload classifier, and CVE-enriched features.

For complete technical coverage of every file under ml/, see ml/ML_MODULE_DOCUMENTATION.md.


📌 Overview

The ML module provides the intelligent core of AIris Security. It is used in two ways:

  1. Embedded in the backend: ml_service.py loads trained models at startup and runs run_hybrid_ml() after every scan
  2. Standalone microservice: inference_api.py exposes a FastAPI endpoint on port 9000 (optional)

Core capabilities:

| Capability | Implementation |
| --- | --- |
| Payload classification | TF-IDF + Logistic Regression (payload_classifier.joblib) |
| Attack type prediction | Multi-class classifier (attack_predictor.joblib) |
| Risk scoring | Hybrid XGBoost + NLP blend, then scanner-evidence boosts |
| CVE context | NVD data enrichment via parse_cve_data.py |

🗂️ Structure

ml/
├── src/
│   ├── __init__.py
│   ├── build_payload_dataset.py    Build labelled payload CSV from raw sources
│   ├── build_attack_dataset.py     Build attack feature dataset from scan fixtures
│   ├── data_ingest.py              Data loading and validation utilities
│   ├── feature_pipeline.py         Feature extraction shared by training + inference
│   ├── parse_cve_data.py           Parse NVD JSON feeds → processed CSV
│   ├── train_payload_classifier.py Train TF-IDF + LogReg payload classifier
│   ├── train_attack_predictor.py   Train multi-class attack predictor
│   ├── inference.py                Inference engine (used by backend ml_service)
│   └── inference_api.py            Optional standalone FastAPI ML microservice
│
├── models/                         Saved models (output of training scripts)
│   ├── payload_classifier.joblib
│   ├── attack_predictor.joblib
│   └── attack_label_encoder.joblib
│
├── data/
│   ├── raw/                        Source data (not committed)
│   │   ├── cve/                    NVD JSON feeds
│   │   └── payloads/               Raw payload text files
│   └── processed/                  Cleaned CSVs ready for training
│       ├── payloads.csv
│       └── attack_features.csv
│
├── tests/
│   ├── test_inference.py           Unit tests for inference.py
│   └── test_real_site_simulation.py  Integration tests against fixture scan data
│
├── notebooks/                      EDA and evaluation notebooks
├── reports/                        Generated metrics / confusion matrices
├── requirements.txt
└── README.md

⚙️ Installation

cd ml
python -m venv .venv

# Windows
.venv\Scripts\Activate.ps1
# Linux / Mac
# source .venv/bin/activate

pip install -r requirements.txt

Key dependencies:

numpy>=1.21
pandas>=1.3
scikit-learn>=1.0
xgboost>=1.5
joblib>=1.1
nltk>=3.6
tqdm>=4.62
matplotlib>=3.4

📦 Data Preparation

1. CVE data

Download NVD JSON feeds and parse them:

# Download (example — 2023 feed)
# https://nvd.nist.gov/vuln/data-feeds

python src/parse_cve_data.py --input data/raw/cve/ --output data/processed/

Output columns: cve_id, description, severity, cvss_score, attack_vector
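
A minimal sketch of the extraction step, assuming the NVD JSON 1.1 feed layout (`CVE_Items[].cve` plus `impact.baseMetricV3.cvssV3`); the real parse_cve_data.py may handle more feed versions and edge cases:

```python
import csv
import json
from pathlib import Path

FIELDS = ["cve_id", "description", "severity", "cvss_score", "attack_vector"]

def extract_rows(feed):
    """Pull the output columns listed above out of a parsed NVD 1.1 feed dict."""
    rows = []
    for item in feed.get("CVE_Items", []):
        cve = item["cve"]
        cvss = item.get("impact", {}).get("baseMetricV3", {}).get("cvssV3", {})
        rows.append({
            "cve_id": cve["CVE_data_meta"]["ID"],
            "description": cve["description"]["description_data"][0]["value"],
            "severity": cvss.get("baseSeverity", "UNKNOWN"),
            "cvss_score": cvss.get("baseScore", 0.0),
            "attack_vector": cvss.get("attackVector", "UNKNOWN"),
        })
    return rows

def parse_feed_file(feed_path, out_csv):
    """Read one downloaded feed file and write its rows to a CSV."""
    rows = extract_rows(json.loads(Path(feed_path).read_text(encoding="utf-8")))
    with open(out_csv, "w", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

CVEs lacking a CVSS v3 block fall back to `UNKNOWN` / `0.0` rather than being dropped, so downstream feature code can decide how to treat them.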

2. Payload dataset

Sources: PayloadsAllTheThings, SecLists, synthetically generated benign samples.

python src/build_payload_dataset.py
# Writes: data/processed/payloads.csv

Sample rows:

| payload | label |
| --- | --- |
| `' OR '1'='1' --` | SQLI |
| `<script>alert(1)</script>` | XSS |
| `../../etc/passwd` | PATH_TRAVERSAL |
| `normal search query` | BENIGN |

3. Attack feature dataset

Built from scan result fixtures:

python src/build_attack_dataset.py
# Writes: data/processed/attack_features.csv

Features: open_port_count, critical_port_flag, nikto_warning_count, ssl_issues_flag, dir_critical_count, cve_avg_severity, ...


🧠 Model Training

Payload Classifier

python src/train_payload_classifier.py
  • Algorithm: TF-IDF vectoriser + Logistic Regression
  • Input: Raw payload strings
  • Output classes: SQLI, XSS, PATH_TRAVERSAL, COMMAND_INJECTION, BENIGN
  • Saved to: models/payload_classifier.joblib
  • Typical accuracy: ~92 % on held-out test set
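
The core recipe can be sketched as a scikit-learn pipeline; the actual training script works on data/processed/payloads.csv and likely adds train/test splitting and evaluation, and the sample data and n-gram settings here are illustrative:

```python
import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

def train_payload_classifier(payloads, labels):
    """Fit a TF-IDF + Logistic Regression pipeline on labelled payload strings."""
    # Character n-grams cope better with obfuscated payloads than word tokens.
    clf = Pipeline([
        ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))),
        ("logreg", LogisticRegression(max_iter=1000)),
    ])
    clf.fit(payloads, labels)
    return clf

# Toy illustration only; the real script trains on the processed CSV.
model = train_payload_classifier(
    ["' OR '1'='1' --", "<script>alert(1)</script>",
     "../../etc/passwd", "normal search query"] * 5,
    ["SQLI", "XSS", "PATH_TRAVERSAL", "BENIGN"] * 5,
)
joblib.dump(model, "payload_classifier.joblib")
```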

Attack Predictor

python src/train_attack_predictor.py
  • Algorithm: Multi-class XGBoost classifier
  • Input: Numerical scan features from feature_pipeline.py
  • Output classes: SQLi, XSS, RCE, Path Traversal, Weak SSL, Open Port, NONE, ...
  • Saved to: models/attack_predictor.joblib, models/attack_label_encoder.joblib

🔮 Inference

Embedded mode (used by backend)

from ml.src.inference import predict_attack

result = predict_attack(findings, scanner_results)
# Returns: {attack_type, risk_score, confidence, predicted_attack_types, cve_context}

Standalone microservice (optional)

uvicorn ml.src.inference_api:app --host 0.0.0.0 --port 9000

Input

{
  "scan": {
    "open_ports": [22, 80, 443],
    "nikto_warnings": 5,
    "ssl_issues": true,
    "dir_critical_count": 2,
    "scanner_text": "Possible SQLi detected in /search?q=",
    "cve_list": [{"severity": 9.8}, {"severity": 7.5}]
  }
}

Output

{
  "predicted_attack": "SQLi",
  "probabilities": {"SQLi": 0.91, "XSS": 0.05, "NONE": 0.04},
  "risk_score": 84,
  "explanation": "High SQL-related signals found."
}

🔀 Hybrid Risk Scoring (Backend Integration)

backend/app/services/ml_service.py implements the full hybrid pipeline:

Scanner findings
      │
      ├── XGBoost attack predictor  →  attack_prediction / confidence
      └── NLP Payload classifier    →  payload confidence score
                  │
            Hybrid risk blend
                  │
         Scanner evidence boosts:
           • SSL weak ciphers    +3 each  (cap +15)
           • Deprecated TLS      +5 flat
           • Exposed dir paths   +2 each  (cap +20)
           • Critical dir finds  +10 flat
                  │
           clamp [0, 100]
                  │
              risk_score
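
The boost-and-clamp stage of the diagram can be sketched in pure Python (function and parameter names are illustrative, not the actual ml_service.py API):

```python
def evidence_boost(weak_ciphers=0, deprecated_tls=False,
                   exposed_paths=0, critical_dir_findings=False):
    """Scanner-evidence boosts with the per-category caps shown above."""
    boost = min(weak_ciphers * 3, 15)             # +3 per weak cipher, cap +15
    boost += 5 if deprecated_tls else 0           # flat +5
    boost += min(exposed_paths * 2, 20)           # +2 per exposed path, cap +20
    boost += 10 if critical_dir_findings else 0   # flat +10
    return boost

def final_risk_score(blended_score, **evidence):
    """Apply evidence boosts to the hybrid blend, then clamp to [0, 100]."""
    return max(0, min(100, round(blended_score + evidence_boost(**evidence))))
```

For example, a hybrid blend of 72 with two weak ciphers and deprecated TLS yields 72 + 6 + 5 = 83, while a high blend plus many exposed paths saturates at 100.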

🎯 Advanced ML Output Features

AIris provides 6 comprehensive ML metrics (1 baseline + 5 enhanced features) for deep security analysis:

1. Risk Score (Baseline)

  • Range: 0-100
  • Purpose: Overall threat severity assessment
  • Calculation: Hybrid blend of ML confidence (60%) + scanner evidence (40%)
  • Output: Integer score with color-coded severity (Critical/High/Medium/Low)
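
A sketch of the 60/40 blend, assuming both signals are normalised to [0, 1]; the severity band cut-offs are illustrative, not the module's exact thresholds:

```python
def baseline_risk_score(ml_confidence, scanner_evidence):
    """Blend ML confidence (60%) with scanner evidence (40%); both in [0, 1]."""
    return round(100 * (0.6 * ml_confidence + 0.4 * scanner_evidence))

def severity_band(score):
    # Assumed band edges for the colour-coded severity labels.
    if score >= 80:
        return "Critical"
    if score >= 60:
        return "High"
    if score >= 40:
        return "Medium"
    return "Low"
```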

2. Severity Distribution ✨ NEW

  • Purpose: Shape of risk - how findings are distributed across severity levels
  • Output Structure:
{
  "critical": 5,
  "high": 8,
  "medium": 12,
  "low": 6,
  "informational": 3,
  "total": 34,
  "percentages": {"critical": 14.7, "high": 23.5, ...},
  "shape": "top-heavy"
}
  • Shape Classifications:
    • top-heavy: ≥50% critical/high findings (urgent action required)
    • balanced: Mixed distribution (steady remediation)
    • low-heavy: ≥50% low/info findings (hardened target)
  • Use Case: Understand risk concentration and prioritize remediation efforts
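
The shape classification follows directly from the thresholds above; a sketch in pure Python (function name hypothetical):

```python
def severity_shape(counts):
    """Classify the distribution shape from per-severity finding counts."""
    total = sum(counts.values())
    if total == 0:
        return "balanced"
    top_share = (counts.get("critical", 0) + counts.get("high", 0)) / total
    low_share = (counts.get("low", 0) + counts.get("informational", 0)) / total
    if top_share >= 0.5:
        return "top-heavy"   # urgent action required
    if low_share >= 0.5:
        return "low-heavy"   # hardened target
    return "balanced"        # steady remediation
```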

3. Attack Surface Score ✨ NEW

  • Range: 0-100
  • Purpose: Structural exposure measurement (independent of attack likelihood)
  • Scoring Breakdown:
    • Port exposure (30 pts): Each open port = +3 points
    • Path exposure (25 pts): Exposed web paths = +2.5 points each
    • Protocol weakness (25 pts): Deprecated TLS + weak ciphers = +5 points each
    • Service visibility (20 pts): Services with CVEs = +4 points each
  • Output: Integer score with exposure level (Minimal/Low/Moderate/High)
  • Use Case: Measure attack surface before implementing security controls
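
The scoring breakdown above can be sketched as four capped components summed to a maximum of 100 (names and the exposure-level cut-offs are assumptions):

```python
def attack_surface_score(open_ports=0, exposed_paths=0,
                         protocol_weaknesses=0, services_with_cves=0):
    """Sum the four capped exposure components described above."""
    score = min(open_ports * 3, 30)             # port exposure (max 30)
    score += min(exposed_paths * 2.5, 25)       # path exposure (max 25)
    score += min(protocol_weaknesses * 5, 25)   # deprecated TLS / weak ciphers (max 25)
    score += min(services_with_cves * 4, 20)    # service visibility (max 20)
    return round(score)

def exposure_level(score):
    # Illustrative band edges for Minimal/Low/Moderate/High.
    if score >= 70:
        return "High"
    if score >= 40:
        return "Moderate"
    if score >= 15:
        return "Low"
    return "Minimal"
```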

4. Threat Category ✨ NEW

  • Purpose: Attack type classification with model uncertainty analysis
  • Output Structure:
{
  "primary": {"attack_type": "SQL Injection", "probability": 0.68},
  "secondary": {"attack_type": "XSS", "probability": 0.22},
  "uncertainty_gap": 0.46,
  "confidence_level": "high"
}
  • Confidence Levels:
    • high: Gap ≥ 0.4 (clear primary threat)
    • medium: Gap 0.2-0.4 (monitor evolving threats)
    • low: Gap < 0.2 (multiple attack vectors likely)
  • Use Case: Understand model certainty and prepare for multiple attack scenarios
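
The uncertainty analysis is just a ranking plus a gap threshold; a self-contained sketch (the real module's output may carry more fields):

```python
def threat_category(probabilities):
    """Rank attack-type probabilities and grade model certainty by the gap."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    primary = ranked[0]
    secondary = ranked[1] if len(ranked) > 1 else (None, 0.0)
    gap = primary[1] - secondary[1]
    if gap >= 0.4:
        level = "high"       # clear primary threat
    elif gap >= 0.2:
        level = "medium"     # monitor evolving threats
    else:
        level = "low"        # multiple attack vectors likely
    return {
        "primary": {"attack_type": primary[0], "probability": primary[1]},
        "secondary": {"attack_type": secondary[0], "probability": secondary[1]},
        "uncertainty_gap": round(gap, 2),
        "confidence_level": level,
    }
```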

5. Exploitability Index ✨ NEW

  • Range: 0-100
  • Purpose: CVSS-inspired ease-of-exploitation metric
  • Scoring Factors:
    • Access Complexity (0-40): Remote service exposure
    • Authentication Bypass (0-30): Auth vulnerability detection
    • Impact Severity (0-30): CVE CVSS scores
  • Output Structure:
{
  "score": 78,
  "level": "high",
  "factors": {
    "access_complexity": 30,
    "authentication_required": 20,
    "impact_score": 28
  }
}
  • Levels: Critical (80+), High (60-79), Medium (40-59), Low (<40)
  • Use Case: Assess immediate exploitation risk and prioritize patching windows
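
Combining the three capped factors and the level bands gives a sketch like the following (how each factor is derived from scan data is not shown here):

```python
def exploitability_index(access_complexity, authentication_bypass, impact_severity):
    """Cap each factor at its maximum, sum, and map the total to a level."""
    score = (min(access_complexity, 40)       # remote service exposure (0-40)
             + min(authentication_bypass, 30) # auth vulnerability detection (0-30)
             + min(impact_severity, 30))      # CVE CVSS scores (0-30)
    if score >= 80:
        level = "critical"
    elif score >= 60:
        level = "high"
    elif score >= 40:
        level = "medium"
    else:
        level = "low"
    return {
        "score": score,
        "level": level,
        "factors": {
            "access_complexity": access_complexity,
            "authentication_required": authentication_bypass,
            "impact_score": impact_severity,
        },
    }
```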

6. Remediation Priorities ✨ NEW

  • Purpose: Prescriptive ranked action list with estimated risk reduction
  • Output Structure:
[
  {
    "priority": 1,
    "category": "Patch/Update",
    "finding_count": 8,
    "severity_breakdown": {"critical": 3, "high": 5},
    "estimated_risk_reduction": 35,
    "actions": [
      "Update Apache to 2.4.59 (CVE-2024-1234)",
      "Patch OpenSSL to 3.0.14 (CVE-2024-5678)"
    ]
  }
]
  • Categories:
    • Patch/Update: CVE-related vulnerabilities requiring software updates
    • Configuration: SSL/TLS settings, headers, server misconfigurations
    • Access Control: Exposed paths, open ports, permission issues
    • Input Validation: SQLi, XSS, and other injection vulnerabilities
  • Priority Calculation:
    • Critical findings = 10 points each
    • High findings = 5 points each
    • Medium findings = 2 points each
    • Boosted by exploitability level (1.3x-1.5x multiplier)
  • Use Case: Create actionable remediation roadmap with estimated impact
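
A sketch of the per-category priority calculation; the exact mapping from exploitability level to the 1.3x-1.5x boost band is an assumption:

```python
def category_priority_score(critical=0, high=0, medium=0, exploitability_level="low"):
    """Weight finding counts, then boost when the target is easy to exploit."""
    base = critical * 10 + high * 5 + medium * 2
    # Assumed mapping of the documented 1.3x-1.5x multiplier band.
    multiplier = {"critical": 1.5, "high": 1.3}.get(exploitability_level, 1.0)
    return round(base * multiplier)
```

Categories would then be sorted by this score descending to produce the ranked priority list.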

🧪 Testing

cd ml
pytest tests/

Tests cover:

  • test_inference.py — unit tests for predict_attack() with fixture data
  • test_real_site_simulation.py — integration test simulating a full scan result

📊 Model Evaluation

After training, evaluation reports are saved to reports/:

  • model_performance.html — interactive metrics (accuracy, F1, ROC)
  • confusion_matrix.png — multi-class confusion matrix

Payload classifier metrics (typical):

| Metric | Value |
| --- | --- |
| Accuracy | ~92 % |
| F1 (macro) | ~0.91 |
| Precision | ~0.93 |
| Recall | ~0.90 |

📜 Dataset Licensing & Ethics

All training data is sourced from public, open-licence repositories:

| Source | Licence | Use |
| --- | --- | --- |
| PayloadsAllTheThings | MIT | Malicious payload samples |
| NVD / NIST | Public domain (US Govt) | CVE severity statistics |
| Synthetic benign samples | N/A (self-generated) | Balance payload dataset |
| Kaggle SQLi/XSS datasets | CC0 / public | Additional payload labels |

No proprietary, private, or personally identifiable data is used. No payloads are executed against real systems. Models detect attacks — they do not generate them.


Last updated: March 2026 · v2.1.0