An explainable phishing detection engine that analyzes URLs and emails in real time using feature engineering and heuristic risk scoring. It works offline and requires no API key.
Created and maintained by Omobolaji Adeyan, a cybersecurity engineer focused on practical Python security tooling, threat detection, and security automation.
PhishGuard was built because most phishing detection tools are either black-box cloud services or require expensive ML training pipelines. It runs entirely offline and explains exactly why it flagged something.
Rather than relying only on blocklists, PhishGuard extracts behavioral and structural features from URLs and email content, then applies an explainable, hand-tuned heuristic model. The current weights are informed by common phishing indicators and protected by regression tests; they have not yet been validated as a statistically trained model.
URL features analyzed:
- Domain entropy (randomly generated domains score high)
- IP address in URL (rare in legitimate user-facing links, so a strong indicator)
- Suspicious TLDs (.xyz, .tk, .ml, .ga, .click)
- Phishing keyword density (verify, suspended, account, secure, etc.)
- Subdomain depth, path depth, digit ratio, special character density
- Punycode and Unicode hostname indicators, weighted conservatively as context
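A few of the URL features listed above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual features.py: the function names `shannon_entropy`, `host_is_ip`, and `host_is_punycode` are hypothetical, and the real implementation may compute these signals differently.

```python
import ipaddress
import math
from collections import Counter
from urllib.parse import urlsplit


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def host_is_ip(url: str) -> bool:
    """Return True when the URL host is a literal IPv4 or IPv6 address."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def host_is_punycode(url: str) -> bool:
    """Return True when any hostname label uses the IDNA 'xn--' prefix."""
    host = urlsplit(url).hostname or ""
    return any(label.startswith("xn--") for label in host.split("."))
```

A string of one repeated character has an entropy of 0 bits, while four distinct characters give 2 bits, which is why randomly generated domain names tend to score higher than dictionary-like ones.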
Email features analyzed:
- Urgency language (action required, account suspended, verify now)
- Link and URL density
- ALL CAPS word usage
- Attachment mentions
- Exclamation mark frequency
- Optional SPF, DKIM, and DMARC results from a trusted receiver
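The content-based email features above could be extracted roughly as follows. This is a sketch under stated assumptions: the phrase list, the regular expressions, the feature names, and the `email_features` function are all hypothetical, and the project's own extraction logic may differ. Authentication results are left out because they come from a trusted receiver rather than from the message text.

```python
import re

# Hypothetical phrase list for illustration; the real list may be longer.
URGENCY_PHRASES = ("action required", "account suspended", "verify now")
URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)
WORD_PATTERN = re.compile(r"[A-Za-z]+")


def email_features(subject: str, body: str) -> dict[str, float]:
    """Compute simple, explainable counts from an email subject and body."""
    text = f"{subject}\n{body}"
    lowered = text.lower()
    words = WORD_PATTERN.findall(text)
    # Count only multi-letter words so that "I" or "A" do not inflate the ratio.
    caps_words = [w for w in words if len(w) > 1 and w.isupper()]
    return {
        "urgency_hits": sum(lowered.count(p) for p in URGENCY_PHRASES),
        "url_count": len(URL_PATTERN.findall(text)),
        "caps_word_ratio": len(caps_words) / len(words) if words else 0.0,
        "exclamation_count": text.count("!"),
        "mentions_attachment": int("attach" in lowered),
    }
```

Each value maps to one named signal, so a report can show exactly which indicators contributed to a verdict.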
Key features:
- Real-time URL and email scoring with probability output
- Batch scan a list of URLs from a file
- Explainable results — see which features triggered the alert
- Three verdict levels: SAFE, SUSPICIOUS, PHISHING
- JSON export for integration into SOC workflows
- SARIF 2.1.0 export for GitHub Code Scanning and CI security pipelines
- Zero dependencies — pure Python standard library
- Offline — no data sent anywhere
The one-minute demo compares legitimate and suspicious inputs, displays the explainable feature breakdown, and exports a finding without using live phishing infrastructure.
See Project Evidence for dated benchmark results, release and contribution evidence, a reproducible demonstration, and explicit limits on what the current metrics establish. Watch the 18-second safe demo video.
Install the verified v0.5.1 wheel directly from GitHub Releases:
python -m pip install \
https://github.com/omobolajiadeyan/phishguard-ai/releases/download/v0.5.1/phishguard_ai-0.5.1-py3-none-any.whl
phishguard --help
The release also includes a source archive, SHA256SUMS, and signed build provenance. See the v0.5.1 release for downloads and verification details.
Use the stable Marketplace release to scan a URL in CI:
- name: Scan URL with PhishGuard AI
  uses: omobolajiadeyan/phishguard-ai@v0.5.1
  with:
    url: https://example.com
    sarif-output: phishguard-results.sarif
See the GitHub Marketplace listing for available inputs and version selection.
For development, install from a clone:
git clone https://github.com/omobolajiadeyan/phishguard-ai.git
cd phishguard-ai
python --version # Python 3.10+ required
python -m pip install .
python -m unittest discover -s tests -v
Installation provides a phishguard command. Running the source file directly remains supported for development.
# Analyze a single URL
phishguard url "http://paypa1-secure-login.xyz/verify"
# Analyze with feature breakdown
phishguard url "https://google.com" --verbose
# Analyze an email
phishguard email \
--subject "URGENT: Your account has been suspended" \
--body "Click here immediately to verify your account or it will be deleted." \
--authentication-results "mx.example; spf=fail; dkim=fail; dmarc=fail"
# Batch scan a list of URLs
phishguard batch data/urls.txt
# Use ASCII-only output in legacy terminals or CI logs
python phishguard.py url "https://google.com" --plain
python phishguard.py batch data/urls.txt --no-unicode
# Export results to JSON
phishguard batch data/urls.txt --output results.json
# Export actionable findings to SARIF 2.1.0
phishguard batch data/urls.txt \
--format sarif \
--output phishguard.sarif
See the GitHub Code Scanning guide for a copy-ready workflow using GitHub's official SARIF upload action. See the email JSON and SARIF examples for generated SPF, DKIM, and DMARC output and its authentication trust boundary. See the detection model documentation for feature semantics, limitations, and the evidence required for scoring changes.
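For downstream tooling, a SARIF 2.1.0 log can be read with the standard json module. The sketch below relies only on fields defined by the SARIF standard (runs, results, ruleId, level, message.text). The sample log, its rule ID, and its message are invented for illustration and do not represent PhishGuard's actual output.

```python
import json


def summarize_sarif(sarif_text: str) -> list[tuple[str, str, str]]:
    """Return (ruleId, level, message text) for every result in a SARIF log."""
    log = json.loads(sarif_text)
    findings = []
    for run in log.get("runs", []):
        for result in run.get("results", []):
            findings.append((
                result.get("ruleId", ""),
                # A missing level is treated as "warning" in this sketch.
                result.get("level", "warning"),
                result.get("message", {}).get("text", ""),
            ))
    return findings


# Hypothetical minimal log; the tool name, rule ID, and message are made up.
SAMPLE = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool"}},
        "results": [{
            "ruleId": "EXAMPLE001",
            "level": "error",
            "message": {"text": "Suspicious URL"},
        }],
    }],
})

for rule_id, level, message in summarize_sarif(SAMPLE):
    print(f"{level}: {rule_id}: {message}")
```

To summarize a real export, pass the contents of the generated file, for example the text read from phishguard.sarif.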
Run the public-safe URL regression fixture with:
python tools/evaluate_url_benchmark.py
python tools/evaluate_url_benchmark.py data/public_benchmark_urls.jsonl
The command reports ordered predictions, confusion-matrix counts, precision, recall, and false-positive rate. These are fixture metrics for detecting regressions, not population-level accuracy or calibration estimates. See the benchmark documentation for the synthetic fixture, the licensed URL-Phish-derived slice, sanitization, and reporting rules.
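For reference, the reported metrics follow from the confusion-matrix counts using their standard definitions. The helper below is a hypothetical sketch and is not taken from tools/evaluate_url_benchmark.py. It treats "phishing" as the positive class, and the counts in the usage line are invented.

```python
def fixture_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Derive precision, recall, and false-positive rate from raw counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return 0.0 when a class is absent rather than dividing by zero.
        return numerator / denominator if denominator else 0.0

    return {
        "precision": ratio(tp, tp + fp),            # flagged URLs that were phishing
        "recall": ratio(tp, tp + fn),               # phishing URLs that were flagged
        "false_positive_rate": ratio(fp, fp + tn),  # benign URLs that were flagged
    }


# Invented counts: 8 true positives, 2 false positives,
# 18 true negatives, 2 false negatives.
print(fixture_metrics(tp=8, fp=2, tn=18, fn=2))
```

On a small fixture, a single changed prediction moves these ratios noticeably, which is what makes them useful for catching regressions and unsuitable as population-level estimates.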
PHISHGUARD AI
AI-powered phishing detection
────────────────────────────────────────────────────────────
URL : http://paypa1-secure-login.xyz/verify
Verdict : PHISHING
Risk : ████████████████████ 94.2%
Feature breakdown:
url_length : 38
has_ip_address : 0
suspicious_tld : 1 *
phishing_keywords : 2 *
has_https : 0 *
url_entropy : 3.84 *
phishguard-ai/
├── phishguard.py # CLI entrypoint — commands: url, email, batch
├── email_auth.py # SPF, DKIM, and DMARC result parsing
├── features.py # Feature extraction (URL + email)
├── model.py # Weighted scoring model + sigmoid normalisation
├── reporting.py # Native JSON and SARIF 2.1.0 serialization
├── data/
│ └── urls.txt # Sample URLs for batch testing
└── README.md
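The weighted scoring and sigmoid normalisation noted for model.py can be illustrated with a small sketch. Every number here is an assumption: the weights, the bias, and the verdict thresholds are hypothetical values chosen for the example, and the function names are not the project's API. The point is only to show how a hand-tuned linear score becomes a probability and a verdict while staying explainable.

```python
import math

# Hypothetical weights, bias, and thresholds chosen only for illustration.
WEIGHTS = {
    "has_ip_address": 2.0,
    "suspicious_tld": 1.5,
    "phishing_keywords": 0.8,
    "has_https": -0.5,
}
BIAS = -2.0


def contributions(features: dict[str, float]) -> list[tuple[str, float]]:
    """Return non-zero per-feature contributions, largest magnitude first."""
    items = [(name, WEIGHTS.get(name, 0.0) * value) for name, value in features.items()]
    return sorted((i for i in items if i[1] != 0.0), key=lambda i: abs(i[1]), reverse=True)


def risk_probability(features: dict[str, float]) -> float:
    """Squash the weighted sum into the range (0, 1) with a sigmoid."""
    z = BIAS + sum(value for _, value in contributions(features))
    return 1.0 / (1.0 + math.exp(-z))


def verdict(probability: float, suspicious_at: float = 0.4, phishing_at: float = 0.7) -> str:
    """Map a probability onto the three verdict levels."""
    if probability >= phishing_at:
        return "PHISHING"
    if probability >= suspicious_at:
        return "SUSPICIOUS"
    return "SAFE"


example = {"suspicious_tld": 1, "phishing_keywords": 2, "has_ip_address": 1}
p = risk_probability(example)
print(verdict(p), contributions(example))
```

Because the score is a plain weighted sum, the per-feature contributions double as the explanation: the features with the largest contributions are the ones that triggered the alert.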
Contributions are welcome from security analysts, Python developers, students, researchers, and first-time open-source contributors.
- Read CONTRIBUTING.md before starting.
- Follow the short first-contribution guide.
- Follow the reproducible development workflow.
- Pick a scoped task from the good first issue list.
- Use Discussions for design questions and detection ideas.
- See SUPPORT.md for the right place to ask questions, report bugs, or disclose vulnerabilities.
- See ROADMAP.md for current priorities.
- Accepted contributors are credited in AUTHORS.md.
- Releases and notable changes are recorded in CHANGELOG.md.
- Release artifacts follow the documented release process with checksums and signed build provenance.
- Creator and Lead Maintainer: Omobolaji Adeyan
- LinkedIn: linkedin.com/in/oeadeyan
- Security contact: omobolaji.adeyan@gmail.com
PhishGuard AI is available under the MIT License. The project may be cited using the metadata in CITATION.cff.