
namelessweakl1ng/cybersage-hackathon


Zero-Day Sandbox

Zero-Day Sandbox is a desktop phishing analysis tool built for the CyberSage hackathon. It combines browser automation, URL intelligence, HTML heuristics, domain reputation checks, optional VirusTotal enrichment, and Google Gemini reasoning to explain whether a site looks safe, suspicious, or malicious.

What The App Does

  • Scans a user-provided URL in a headless browser.
  • Captures a screenshot and extracts HTML signals.
  • Checks domain age through RDAP and flags newly registered domains.
  • Scores suspicious URL patterns such as typosquatting, IP-based URLs, long paths, and deceptive keywords.
  • Reviews SSL certificate age and metadata.
  • Optionally queries VirusTotal if a key is configured.
  • Sends the screenshot plus structured security context to Gemini for multi-step reasoning.
  • Combines all signals into a final 0-100 risk score.
  • Shows explainable highlights, one-click safety explanations, recent scan history, and downloadable HTML reports.
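As a rough illustration of the URL pattern checks, the sketch below flags IP-based hosts, long paths, and deceptive keywords. The function name, keyword list, and length threshold are assumptions made for this example and are not taken from reputation.py; typosquatting detection is left out because it needs a list of known brands.

```python
import ipaddress
from urllib.parse import urlparse

# Hypothetical keyword list and threshold, chosen only for illustration.
DECEPTIVE_KEYWORDS = ("login", "verify", "secure", "account", "update")
MAX_PATH_LENGTH = 60


def url_flags(url: str) -> list[str]:
    """Return short flags for suspicious URL patterns (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    flags = []

    # IP-based URL: the host parses as an IPv4 or IPv6 address.
    try:
        ipaddress.ip_address(host)
        flags.append("ip_host")
    except ValueError:
        pass

    # Unusually long path.
    if len(parsed.path) > MAX_PATH_LENGTH:
        flags.append("long_path")

    # Deceptive keywords anywhere in the host or path.
    haystack = (host + parsed.path).lower()
    if any(word in haystack for word in DECEPTIVE_KEYWORDS):
        flags.append("deceptive_keyword")

    return flags
```

Returning named flags instead of a bare number keeps each contribution explainable, which fits the app's goal of showing why a URL looks risky.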

Why Gemini Matters Here

Gemini is not used as a simple chat add-on. It is part of the actual detection pipeline:

  1. Gemini receives the live screenshot.
  2. Gemini receives structured signals from URL analysis, domain age, HTML extraction, SSL metadata, and VirusTotal.
  3. Gemini returns:
    • extracted red flags
    • a classification
    • a confidence score
    • beginner-friendly reasoning
    • user safety advice
  4. The app uses Gemini output as one component of the final risk score and as the explanation layer in the UI.

That gives the project a meaningful Google Gemini integration for hackathon judging.
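One way to consume the items Gemini returns is to parse them into a small typed record. The sketch below assumes Gemini is asked to reply with a JSON object; the field names, the 0.0 to 1.0 confidence scale, and the fallback behaviour are hypothetical and may differ from what ai.py does.

```python
import json
from dataclasses import dataclass, field

VALID_LABELS = ("safe", "suspicious", "malicious")


@dataclass
class GeminiVerdict:
    # Field names are hypothetical; they mirror the five items listed above.
    classification: str
    confidence: float  # assumed scale: 0.0 to 1.0
    red_flags: list[str] = field(default_factory=list)
    reasoning: str = ""
    advice: str = ""


def parse_verdict(raw: str) -> GeminiVerdict:
    """Parse a JSON reply, falling back to a neutral verdict on bad input."""
    try:
        data = json.loads(raw)
        if not isinstance(data, dict):
            raise ValueError("expected a JSON object")
        label = str(data.get("classification", "suspicious")).lower()
        confidence = float(data.get("confidence", 0.0))
    except (ValueError, TypeError):
        return GeminiVerdict("suspicious", 0.0)

    if label not in VALID_LABELS:
        label = "suspicious"
    flags = data.get("red_flags", [])
    if not isinstance(flags, list):
        flags = []
    return GeminiVerdict(
        classification=label,
        confidence=min(1.0, max(0.0, confidence)),
        red_flags=[str(item) for item in flags],
        reasoning=str(data.get("reasoning", "")),
        advice=str(data.get("advice", "")),
    )
```

Falling back to a zero-confidence "suspicious" verdict means a malformed model reply cannot silently mark a page as safe.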

Project Structure

.
├── app.py
├── requirements.txt
├── .gitignore
├── SUBMISSION.md
├── README.md
├── CODE_ARCHITECTURE.md
├── .env.example
└── zero_day_sandbox/
    ├── __init__.py
    ├── __main__.py
    ├── config.py
    ├── utils.py
    ├── reputation.py
    ├── html_analysis.py
    ├── ai.py
    ├── risk.py
    ├── reporting.py
    ├── history.py
    ├── scanner.py
    └── ui.py

Runtime-generated files such as .env, scan_history.json, reports/, and screenshots/ are created locally as needed and are intentionally left out of the clean submission bundle.

Module Overview

  • app.py: tiny launcher for the desktop app.
  • zero_day_sandbox/config.py: shared constants and environment loading.
  • zero_day_sandbox/utils.py: general-purpose helpers for parsing, validation, and text cleanup.
  • zero_day_sandbox/reputation.py: domain age lookup, SSL inspection, URL scoring, and VirusTotal lookup.
  • zero_day_sandbox/html_analysis.py: HTML feature extraction.
  • zero_day_sandbox/ai.py: Gemini client, prompt construction, and follow-up Q&A.
  • zero_day_sandbox/risk.py: weighted risk scoring and final verdict rules.
  • zero_day_sandbox/reporting.py: screenshot annotation, HTML report generation, and result formatting.
  • zero_day_sandbox/history.py: recent-scan persistence.
  • zero_day_sandbox/scanner.py: end-to-end scan orchestration.
  • zero_day_sandbox/ui.py: the CustomTkinter desktop interface.
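To make the idea of weighted risk scoring concrete, here is a minimal sketch of how per-signal scores could be combined into a 0-100 value and mapped to a verdict. The weights, signal names, and verdict thresholds are invented for this example; the actual rules live in risk.py and may be quite different.

```python
from typing import Optional

# Hypothetical weights; they sum to 1.0 so the result stays within 0-100.
WEIGHTS = {
    "url": 0.20,
    "domain_age": 0.15,
    "html": 0.15,
    "ssl": 0.10,
    "virustotal": 0.15,
    "gemini": 0.25,
}


def combine_scores(signals: dict[str, Optional[float]]) -> int:
    """Weighted average of per-signal scores (each 0-100).

    Signals that are missing or None (for example VirusTotal without a key)
    are skipped and the remaining weights are renormalized.
    """
    present = {
        name: score
        for name, score in signals.items()
        if name in WEIGHTS and score is not None
    }
    total_weight = sum(WEIGHTS[name] for name in present)
    if total_weight == 0:
        return 0
    score = sum(WEIGHTS[name] * value for name, value in present.items()) / total_weight
    return round(min(100.0, max(0.0, score)))


def verdict(score: int) -> str:
    """Map a 0-100 score to a label using hypothetical thresholds."""
    if score >= 70:
        return "malicious"
    if score >= 40:
        return "suspicious"
    return "safe"
```

Renormalizing over the signals that are present keeps the score comparable whether or not the optional VirusTotal lookup ran.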

Setup

1. Create a virtual environment

python3 -m venv .venv

2. Install dependencies

./.venv/bin/pip install -r requirements.txt
./.venv/bin/playwright install chromium

3. Create your environment file

Copy .env.example to .env and add your keys:

GEMINI_API_KEY="your_google_gemini_key"
VIRUSTOTAL_API_KEY="optional_virustotal_key"

GEMINI_API_KEY is required.

VIRUSTOTAL_API_KEY is optional but recommended.
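The required/optional split could be enforced at startup along these lines. This is a sketch, not the code in config.py: the function name is hypothetical, and it assumes the values from .env have already been loaded into the process environment.

```python
import os
from typing import Mapping, Optional


def load_keys(env: Mapping[str, str] = os.environ) -> tuple[str, Optional[str]]:
    """Return (gemini_key, virustotal_key_or_None) from the environment."""
    gemini = env.get("GEMINI_API_KEY", "").strip()
    if not gemini:
        # The Gemini key is required, so fail early with a clear message.
        raise RuntimeError("GEMINI_API_KEY is required; add it to .env")

    # The VirusTotal key is optional; an empty value is treated as absent.
    virustotal = env.get("VIRUSTOTAL_API_KEY", "").strip() or None
    return gemini, virustotal
```

Passing the environment as a parameter makes the check easy to exercise with a plain dictionary.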

4. Run the app

./.venv/bin/python app.py

You can also run:

./.venv/bin/python -m zero_day_sandbox

Typical Scan Flow

  1. Enter a URL.
  2. The app validates the URL and prepends https:// if the scheme is missing.
  3. The scanner checks domain age, SSL metadata, URL heuristics, and VirusTotal.
  4. Playwright captures the page and screenshot.
  5. HTML signals are extracted.
  6. Gemini analyzes the screenshot plus structured context.
  7. The risk engine combines all signals into a single score.
  8. The UI shows the verdict, explanations, screenshot, advice, and exportable report.
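Step 2 of the flow above can be sketched as follows. The function name and the exact validation rules are assumptions for illustration; the helper in utils.py may accept or reject different inputs.

```python
from urllib.parse import urlparse


def normalize_url(raw: str) -> str:
    """Trim the input, add https:// when no scheme is given, and validate."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("URL is empty")

    # Add a default scheme when the user typed only a host or host/path.
    if "://" not in candidate:
        candidate = "https://" + candidate

    parsed = urlparse(candidate)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"Not a scannable web URL: {raw!r}")
    return candidate
```

For example, `normalize_url("example.com/login")` returns `"https://example.com/login"`, while an `ftp://` address is rejected.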

Key Features

  • Domain reputation checks with new-domain flagging
  • URL risk scoring
  • HTML phishing signal extraction
  • Gemini multi-step reasoning
  • Explainable verdicts and one-click answers
  • Screenshot annotation
  • Recent scan history
  • Downloadable HTML reports
  • Dark/light mode

Demo Tips

  • Show a domain that looks newly registered or typosquatted to demonstrate multi-signal detection.
  • Use the quick explanation buttons after a scan to show Gemini’s UX value.
  • Export the HTML report to emphasize product polish and real-world usability.
  • Highlight that the app does not rely on a single signal like SSL age alone.

Notes

  • Domain-age lookup uses RDAP, so internet access is needed for live reputation data.
  • VirusTotal enrichment only runs when a key is configured.
  • The UI stores the latest scan history locally in scan_history.json.
  • Reports are saved in the reports/ folder.
  • The screenshots/ folder is treated as a temporary workspace and is cleaned automatically after each scan and again when the app exits.
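For readers curious how a domain age can be derived from RDAP data, the sketch below reads the registration event from an already-fetched RDAP response, which lists lifecycle events under an "events" array with "eventAction" and "eventDate" fields. The function names and the 30-day "new domain" threshold are assumptions for this example, not values taken from reputation.py.

```python
from datetime import datetime, timezone
from typing import Optional

NEW_DOMAIN_DAYS = 30  # hypothetical threshold for "newly registered"


def domain_age_days(rdap: dict, now: Optional[datetime] = None) -> Optional[int]:
    """Return the domain age in days, or None if no registration date is found."""
    now = now or datetime.now(timezone.utc)
    for event in rdap.get("events", []):
        if event.get("eventAction") != "registration":
            continue
        # Older Python versions do not parse a trailing "Z", so rewrite it.
        stamp = str(event.get("eventDate", "")).replace("Z", "+00:00")
        try:
            registered = datetime.fromisoformat(stamp)
        except ValueError:
            return None
        if registered.tzinfo is None:
            registered = registered.replace(tzinfo=timezone.utc)
        return (now - registered).days
    return None


def is_new_domain(age_days: Optional[int]) -> bool:
    """Flag a domain as new only when its age is known and below the threshold."""
    return age_days is not None and age_days < NEW_DOMAIN_DAYS
```

An unknown age is not flagged as new here; whether a missing registration date should itself raise the risk score is a separate design decision.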

Deep Technical Guide

See CODE_ARCHITECTURE.md for a full module-by-module and function-by-function explanation.

For a paste-ready hackathon description and demo outline, see SUBMISSION.md.
