PhishGuard AI

An explainable phishing detection engine that analyzes URLs and emails in real time using feature engineering and heuristic risk scoring. It works offline and requires no API key.

Created and maintained by Omobolaji Adeyan, a cybersecurity engineer focused on practical Python security tooling, threat detection, and security automation.

PhishGuard was built because most phishing detection tools are either black-box cloud services or depend on expensive ML training pipelines. It runs entirely offline and reports exactly which features caused it to flag an input.

How It Works

Rather than relying only on blocklists, PhishGuard extracts behavioral and structural features from URLs and email content, then applies an explainable, hand-tuned heuristic model. The current weights are informed by common phishing indicators and protected by regression tests; they have not yet been statistically trained or validated.

URL features analyzed:

Domain entropy (randomly generated domains score high)
IP address in URL (a strong phishing indicator; legitimate sites rarely link to a raw IP address)
Suspicious TLDs (.xyz, .tk, .ml, .ga, .click)
Phishing keyword density (verify, suspended, account, secure, etc.)
Subdomain depth, path depth, digit ratio, special character density
Punycode and Unicode hostname indicators, weighted conservatively as context
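
A few of the URL features above can be sketched with the standard library alone. This is a minimal illustration, not the code in features.py: the function names, the returned keys, and the exact TLD set are assumptions taken from the bullet list, and the IP check covers dotted IPv4 hosts only.

```python
import math
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed set, copied from the bullet list above; the project's list may differ.
SUSPICIOUS_TLDS = {"xyz", "tk", "ml", "ga", "click"}
# Dotted-quad IPv4 only; IPv6 and integer-encoded hosts are out of scope here.
IPV4_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def sketch_url_features(url: str) -> dict:
    """Hypothetical feature extraction for illustration only."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".") if host else []
    is_ip = bool(IPV4_HOST.match(host))
    return {
        "host_entropy": round(shannon_entropy(host), 2),
        "has_ip_address": int(is_ip),
        "suspicious_tld": int(
            not is_ip and bool(labels) and labels[-1] in SUSPICIOUS_TLDS
        ),
        "digit_ratio": round(sum(c.isdigit() for c in url) / len(url), 3)
        if url
        else 0.0,
        "has_punycode": int(any(label.startswith("xn--") for label in labels)),
    }


if __name__ == "__main__":
    print(sketch_url_features("http://paypa1-secure-login.xyz/verify"))
```

Entropy is computed over the hostname rather than the full URL so that long but ordinary paths do not inflate the value; whether PhishGuard makes the same choice is not stated here.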

Email features analyzed:

Urgency language (action required, account suspended, verify now)
Link and URL density
ALL CAPS word usage
Attachment mentions
Exclamation mark frequency
Optional SPF, DKIM, and DMARC results from a trusted receiver
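
The content-based email features above can be approximated in a few lines. The sketch below is an assumption-laden illustration rather than the project's implementation: the phrase list reuses only the three examples given above, and the function name and feature keys are hypothetical.

```python
import re

# Hypothetical phrase list built from the examples in the list above.
URGENCY_PHRASES = ("action required", "account suspended", "verify now")
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
WORD_PATTERN = re.compile(r"[A-Za-z]{2,}")


def sketch_email_features(subject: str, body: str) -> dict:
    """Count simple surface indicators in an email's subject and body."""
    text = f"{subject}\n{body}"
    lowered = text.lower()
    words = WORD_PATTERN.findall(text)
    caps_words = [word for word in words if word.isupper()]
    return {
        # Case-insensitive count of each urgency phrase.
        "urgency_hits": sum(lowered.count(p) for p in URGENCY_PHRASES),
        "url_count": len(URL_PATTERN.findall(text)),
        # Share of alphabetic words written entirely in capitals.
        "caps_ratio": round(len(caps_words) / len(words), 3) if words else 0.0,
        "exclamation_count": text.count("!"),
        "mentions_attachment": int("attach" in lowered),
    }


if __name__ == "__main__":
    print(
        sketch_email_features(
            "ACTION REQUIRED: verify now!",
            "Visit http://example.test/login and open the attached file!!",
        )
    )
```

Authentication results (SPF, DKIM, DMARC) are deliberately left out of this sketch because they come from the receiving mail server rather than from the message text.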

Features

Real-time URL and email scoring with probability output
Batch scan a list of URLs from a file
Explainable results — see which features triggered the alert
Three verdict levels: SAFE, SUSPICIOUS, PHISHING
JSON export for integration into SOC workflows
SARIF 2.1.0 export for GitHub Code Scanning and CI security pipelines
Zero dependencies — pure Python standard library
Offline — no data sent anywhere

Try It in One Minute

The one-minute demo compares legitimate and suspicious inputs, displays the explainable feature breakdown, and exports a finding without using live phishing infrastructure.

See Project Evidence for dated benchmark results, release and contribution evidence, a reproducible demonstration, and explicit limits on what the current metrics establish. Watch the 18-second safe demo video.

Installation

Install the verified v0.5.1 wheel directly from GitHub Releases:

python -m pip install \
  https://github.com/omobolajiadeyan/phishguard-ai/releases/download/v0.5.1/phishguard_ai-0.5.1-py3-none-any.whl
phishguard --help

The release also includes a source archive, SHA256SUMS, and signed build provenance. See the v0.5.1 release for downloads and verification details.
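
For readers who prefer to check a downloaded artifact from Python, the following sketch compares a file's SHA-256 digest with an entry in SHA256SUMS. It assumes the file uses the common `sha256sum` layout (hex digest, whitespace, filename); the helper names are hypothetical, and the release notes remain the authoritative verification procedure.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts are not loaded into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text: str) -> dict:
    """Parse '<hex digest>  <filename>' lines (assumed sha256sum format)."""
    entries = {}
    for line in text.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary-mode entries with a leading '*'.
            entries[name.lstrip("*").strip()] = digest.lower()
    return entries


def verify(artifact: Path, sums_file: Path) -> bool:
    """Return True only if the artifact is listed and its digest matches."""
    sums = parse_sums(sums_file.read_text(encoding="utf-8"))
    expected = sums.get(artifact.name)
    return expected is not None and expected == sha256_of(artifact)
```

A call such as `verify(Path("phishguard_ai-0.5.1-py3-none-any.whl"), Path("SHA256SUMS"))` returns False when the file is missing from the list or its digest differs. This checks integrity only; it does not validate the signed build provenance.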

GitHub Action

Use the stable Marketplace release to scan a URL in CI:

- name: Scan URL with PhishGuard AI
  uses: omobolajiadeyan/phishguard-ai@v0.5.1
  with:
    url: https://example.com
    sarif-output: phishguard-results.sarif

See the GitHub Marketplace listing for available inputs and version selection.

For development, install from a clone:

git clone https://github.com/omobolajiadeyan/phishguard-ai.git
cd phishguard-ai
python --version  # Python 3.10+ required
python -m pip install .
python -m unittest discover -s tests -v

Installation provides a phishguard command. Running the source file directly remains supported for development.

Usage

# Analyze a single URL
phishguard url "http://paypa1-secure-login.xyz/verify"

# Analyze with feature breakdown
phishguard url "https://google.com" --verbose

# Analyze an email
phishguard email \
  --subject "URGENT: Your account has been suspended" \
  --body "Click here immediately to verify your account or it will be deleted." \
  --authentication-results "mx.example; spf=fail; dkim=fail; dmarc=fail"

# Batch scan a list of URLs
phishguard batch data/urls.txt

# Use ASCII-only output in legacy terminals or CI logs
python phishguard.py url "https://google.com" --plain
python phishguard.py batch data/urls.txt --no-unicode

# Export results to JSON
phishguard batch data/urls.txt --output results.json

# Export actionable findings to SARIF 2.1.0
phishguard batch data/urls.txt \
  --format sarif \
  --output phishguard.sarif

See the GitHub Code Scanning guide for a copy-ready workflow using GitHub's official SARIF upload action. See the email JSON and SARIF examples for generated SPF, DKIM, and DMARC output and its authentication trust boundary. See the detection model documentation for feature semantics, limitations, and the evidence required for scoring changes.
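
A SARIF export can also be inspected locally before it is uploaded. The sketch below relies only on the generic SARIF 2.1.0 layout (`runs[].results[]` with `ruleId`, `level`, and `message.text`); the function names are hypothetical, and the rule identifiers PhishGuard actually emits are not assumed.

```python
import json
from pathlib import Path


def summarize_sarif(sarif: dict) -> list:
    """Return (rule id, level, message) tuples for every result in a SARIF log."""
    rows = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            rows.append(
                (
                    result.get("ruleId", "<no rule>"),
                    # Results without an explicit level are shown as "warning" here.
                    result.get("level", "warning"),
                    result.get("message", {}).get("text", ""),
                )
            )
    return rows


def summarize_sarif_file(path: str) -> list:
    """Load a SARIF file from disk and summarize its results."""
    return summarize_sarif(json.loads(Path(path).read_text(encoding="utf-8")))


if __name__ == "__main__":
    for rule_id, level, message in summarize_sarif_file("phishguard.sarif"):
        print(f"{level:8} {rule_id}: {message}")
```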

Reproducible Benchmark

Run the public-safe URL regression fixture with:

python tools/evaluate_url_benchmark.py
python tools/evaluate_url_benchmark.py data/public_benchmark_urls.jsonl

The command reports ordered predictions, confusion-matrix counts, precision, recall, and false-positive rate. These are fixture metrics for detecting regressions, not population-level accuracy or calibration estimates. See the benchmark documentation for the synthetic fixture, the licensed URL-Phish-derived slice, sanitization, and reporting rules.
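
For reference, the three reported ratios follow directly from the confusion-matrix counts. The sketch below shows the standard definitions; the function name is hypothetical and the counts in the example are made up, not benchmark results.

```python
def fixture_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute precision, recall, and false-positive rate from raw counts.

    A ratio is None when its denominator is zero, so an empty class is
    reported as undefined rather than as a misleading 0.0.
    """

    def ratio(numerator: int, denominator: int):
        return numerator / denominator if denominator else None

    return {
        "precision": ratio(tp, tp + fp),            # flagged items that were phishing
        "recall": ratio(tp, tp + fn),               # phishing items that were flagged
        "false_positive_rate": ratio(fp, fp + tn),  # benign items that were flagged
    }


if __name__ == "__main__":
    # Illustrative counts only.
    print(fixture_metrics(tp=8, fp=2, tn=18, fn=2))
```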

Example Output

  PHISHGUARD AI
  AI-powered phishing detection

────────────────────────────────────────────────────────────
  URL     : http://paypa1-secure-login.xyz/verify
  Verdict : PHISHING
  Risk    : ████████████████████  94.2%

  Feature breakdown:
    url_length           : 38
    has_ip_address       : 0
    suspicious_tld       : 1   *
    phishing_keywords    : 2   *
    has_https            : 0   *
    url_entropy          : 3.84 *

Architecture

phishguard-ai/
├── phishguard.py    # CLI entrypoint — commands: url, email, batch
├── email_auth.py    # SPF, DKIM, and DMARC result parsing
├── features.py      # Feature extraction (URL + email)
├── model.py         # Weighted scoring model + sigmoid normalisation
├── reporting.py     # Native JSON and SARIF 2.1.0 serialization
├── data/
│   └── urls.txt     # Sample URLs for batch testing
└── README.md
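
The "weighted scoring model + sigmoid normalisation" in model.py can be pictured roughly as follows. Every number in this sketch is invented for illustration: the weights, the bias, and the verdict thresholds are assumptions, not the values PhishGuard ships, and the function names are hypothetical.

```python
import math

# Hypothetical weights; positive values raise risk, negative values lower it.
WEIGHTS = {
    "suspicious_tld": 1.5,
    "phishing_keywords": 0.8,
    "has_ip_address": 2.0,
    "has_https": -0.7,
}
BIAS = -2.0  # Hypothetical baseline so that an empty feature set scores low.


def risk_probability(features: dict) -> float:
    """Weighted sum of features squashed into (0, 1) by the logistic sigmoid."""
    z = BIAS + sum(WEIGHTS.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def verdict(probability: float, suspicious_at: float = 0.4, phishing_at: float = 0.7) -> str:
    """Map a score to one of three verdicts using assumed thresholds."""
    if probability >= phishing_at:
        return "PHISHING"
    if probability >= suspicious_at:
        return "SUSPICIOUS"
    return "SAFE"


def explain(features: dict) -> list:
    """List non-zero per-feature contributions, largest magnitude first."""
    contributions = {
        name: WEIGHTS[name] * value
        for name, value in features.items()
        if name in WEIGHTS and WEIGHTS[name] * value != 0
    }
    return sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    sample = {"suspicious_tld": 1, "phishing_keywords": 2, "has_https": 0}
    score = risk_probability(sample)
    print(verdict(score), round(score, 3), explain(sample))
```

Because each contribution is a weight multiplied by a feature value, the same numbers that produce the score can be shown to the user as the explanation, which is the property the feature breakdown depends on.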

Contributing

Contributions are welcome from security analysts, Python developers, students, researchers, and first-time open-source contributors.

Read CONTRIBUTING.md before starting.
Follow the short first-contribution guide.
Follow the reproducible development workflow.
Pick a scoped task from the good first issue list.
Use Discussions for design questions and detection ideas.
See SUPPORT.md for the right place to ask questions, report bugs, or disclose vulnerabilities.
See ROADMAP.md for current priorities.
Accepted contributors are credited in AUTHORS.md.
Releases and notable changes are recorded in CHANGELOG.md.
Release artifacts follow the documented release process with checksums and signed build provenance.

Project Leadership

Creator and Lead Maintainer: Omobolaji Adeyan
LinkedIn: linkedin.com/in/oeadeyan
Security contact: omobolaji.adeyan@gmail.com

Author

Omobolaji Adeyan, Cybersecurity Engineer (GitHub)

License and Citation

PhishGuard AI is available under the MIT License. The project may be cited using the metadata in CITATION.cff.
