Zero-Day Sandbox is a desktop phishing analysis tool built for the CyberSage hackathon. It combines browser automation, URL intelligence, HTML heuristics, domain reputation checks, optional VirusTotal enrichment, and Google Gemini reasoning to explain whether a site looks safe, suspicious, or malicious.
- Scans a user-provided URL in a headless browser.
- Captures a screenshot and extracts HTML signals.
- Checks domain age through RDAP and flags newly registered domains.
- Scores suspicious URL patterns such as typosquatting, IP-based URLs, long paths, and deceptive keywords.
- Reviews SSL certificate age and metadata.
- Optionally queries VirusTotal if a key is configured.
- Sends the screenshot plus structured security context to Gemini for multi-step reasoning.
- Combines all signals into a final 0-100 risk score.
- Shows explainable highlights, one-click safety explanations, recent scan history, and downloadable HTML reports.
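The URL pattern checks listed above (typosquatting, IP-based URLs, long paths, deceptive keywords) can be sketched with the standard library alone. This is a minimal illustration, not the project's actual scoring code: the brand list, keyword list, path-length limit, similarity threshold, and point values are all assumptions chosen for the example, and `score_url` is a hypothetical name.

```python
import ipaddress
from difflib import SequenceMatcher
from urllib.parse import urlparse

# Illustrative values only; the real lists and weights are not given in this README.
KNOWN_BRANDS = ("paypal", "google", "microsoft", "apple")
DECEPTIVE_KEYWORDS = ("login", "verify", "secure", "update", "account")
MAX_PATH_LENGTH = 60


def is_ip_host(host: str) -> bool:
    """Return True when the hostname is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def looks_typosquatted(host: str) -> bool:
    """Return True when a hostname label is close to, but not equal to, a known brand."""
    for label in host.lower().split("."):
        for brand in KNOWN_BRANDS:
            similarity = SequenceMatcher(None, label, brand).ratio()
            if label != brand and similarity >= 0.8:
                return True
    return False


def score_url(url: str) -> tuple[int, list[str]]:
    """Score a URL from 0 to 100 and list the reasons behind the score."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    score = 0
    reasons: list[str] = []

    if is_ip_host(host):
        score += 30
        reasons.append("host is a raw IP address")
    elif looks_typosquatted(host):
        score += 35
        reasons.append("host resembles a known brand")

    if len(parsed.path) > MAX_PATH_LENGTH:
        score += 15
        reasons.append("unusually long path")

    text = (host + parsed.path).lower()
    hits = [kw for kw in DECEPTIVE_KEYWORDS if kw in text]
    if hits:
        score += 20
        reasons.append("deceptive keywords: " + ", ".join(hits))

    return min(score, 100), reasons
```

Returning the reasons alongside the number keeps the score explainable, which matches the explainable highlights the app shows in its UI.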
Gemini is not used as a simple chat add-on. It is part of the actual detection pipeline:
- Gemini receives the live screenshot.
- Gemini receives structured signals from URL analysis, domain age, HTML extraction, SSL metadata, and VirusTotal.
- Gemini returns:
  - extracted red flags
  - a classification
  - a confidence score
  - beginner-friendly reasoning
  - user safety advice
- The app uses Gemini output as one component of the final risk score and as the explanation layer in the UI.
That gives the project a meaningful Google Gemini integration for hackathon judging.
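Because the Gemini output feeds the risk score, it helps to validate the reply before trusting it. The sketch below assumes the model is asked to answer with a JSON object; the field names (`classification`, `confidence`, `red_flags`, `reasoning`, `advice`), the 0.0 to 1.0 confidence range, and the fallback to "suspicious" are assumptions for illustration, not the schema used in `ai.py`.

```python
import json
from dataclasses import dataclass, field

ALLOWED_CLASSES = {"safe", "suspicious", "malicious"}


@dataclass
class GeminiVerdict:
    classification: str
    confidence: float
    red_flags: list[str] = field(default_factory=list)
    reasoning: str = ""
    advice: str = ""


def parse_gemini_reply(raw_text: str) -> GeminiVerdict:
    """Extract and validate the first JSON object found in a model reply."""
    start = raw_text.find("{")
    end = raw_text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    data = json.loads(raw_text[start : end + 1])

    # Unknown labels fall back to the cautious middle class (assumed policy).
    classification = str(data.get("classification", "")).strip().lower()
    if classification not in ALLOWED_CLASSES:
        classification = "suspicious"

    # Clamp confidence into the assumed 0.0-1.0 range.
    try:
        confidence = float(data.get("confidence", 0.0))
    except (TypeError, ValueError):
        confidence = 0.0
    confidence = max(0.0, min(1.0, confidence))

    flags = data.get("red_flags") or []
    red_flags = [str(item) for item in flags] if isinstance(flags, list) else []

    return GeminiVerdict(
        classification=classification,
        confidence=confidence,
        red_flags=red_flags,
        reasoning=str(data.get("reasoning", "")),
        advice=str(data.get("advice", "")),
    )
```

Searching for the outermost braces tolerates replies that wrap the JSON in extra prose, at the cost of rejecting replies that contain no object at all.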
.
├── app.py
├── requirements.txt
├── .gitignore
├── SUBMISSION.md
├── README.md
├── CODE_ARCHITECTURE.md
├── .env.example
└── zero_day_sandbox/
    ├── __init__.py
    ├── __main__.py
    ├── config.py
    ├── utils.py
    ├── reputation.py
    ├── html_analysis.py
    ├── ai.py
    ├── risk.py
    ├── reporting.py
    ├── history.py
    ├── scanner.py
    └── ui.py
Runtime-generated files such as .env, scan_history.json, reports/, and screenshots/ are created locally as needed and are intentionally left out of the clean submission bundle.
- app.py: tiny launcher for the desktop app.
- zero_day_sandbox/config.py: shared constants and environment loading.
- zero_day_sandbox/utils.py: general-purpose helpers for parsing, validation, and text cleanup.
- zero_day_sandbox/reputation.py: domain age lookup, SSL inspection, URL scoring, and VirusTotal lookup.
- zero_day_sandbox/html_analysis.py: HTML feature extraction.
- zero_day_sandbox/ai.py: Gemini client, prompt construction, and follow-up Q&A.
- zero_day_sandbox/risk.py: weighted risk scoring and final verdict rules.
- zero_day_sandbox/reporting.py: screenshot annotation, HTML report generation, and result formatting.
- zero_day_sandbox/history.py: recent-scan persistence.
- zero_day_sandbox/scanner.py: end-to-end scan orchestration.
- zero_day_sandbox/ui.py: the CustomTkinter desktop interface.
python3 -m venv .venv
./.venv/bin/pip install -r requirements.txt
./.venv/bin/playwright install chromium
Copy .env.example to .env and add your keys:
GEMINI_API_KEY="your_google_gemini_key"
VIRUSTOTAL_API_KEY="optional_virustotal_key"
GEMINI_API_KEY is required.
VIRUSTOTAL_API_KEY is optional but recommended.
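The required/optional split between the two keys could be enforced at startup roughly as follows. This is a sketch under stated assumptions: it reads variables that are already present in the process environment (how `.env` gets loaded is not described here), and `load_api_keys` and `ConfigError` are hypothetical names rather than functions from `config.py`.

```python
import os
from typing import Mapping, Optional


class ConfigError(RuntimeError):
    """Raised when a required setting is missing."""


def load_api_keys(env: Optional[Mapping[str, str]] = None) -> dict:
    """Return the API keys, failing fast if the required Gemini key is absent."""
    source = os.environ if env is None else env

    gemini_key = (source.get("GEMINI_API_KEY") or "").strip()
    if not gemini_key:
        raise ConfigError("GEMINI_API_KEY is required; add it to your .env file")

    # An empty or missing VirusTotal key simply disables that enrichment step.
    virustotal_key = (source.get("VIRUSTOTAL_API_KEY") or "").strip() or None

    return {"gemini_api_key": gemini_key, "virustotal_api_key": virustotal_key}
```

Accepting an explicit mapping makes the function easy to exercise without touching the real environment, for example `load_api_keys({"GEMINI_API_KEY": "abc"})`.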
./.venv/bin/python app.py
You can also run:
./.venv/bin/python -m zero_day_sandbox
- Enter a URL.
- The app validates it and normalizes missing https://.
- The scanner checks domain age, SSL metadata, URL heuristics, and VirusTotal.
- Playwright captures the page and screenshot.
- HTML signals are extracted.
- Gemini analyzes the screenshot plus structured context.
- The risk engine combines all signals into a single score.
- The UI shows the verdict, explanations, screenshot, advice, and exportable report.
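The validation and normalization step at the start of this flow might look like the sketch below. The exact rules the app applies are not documented in this README, so the checks here (non-empty input, no embedded whitespace, only http or https, a non-empty host) are assumptions, and `normalize_url` is a hypothetical name.

```python
from urllib.parse import urlparse


def normalize_url(raw: str) -> str:
    """Trim the input, add https:// when no scheme is given, and reject unusable URLs."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("URL is empty")
    if any(ch.isspace() for ch in candidate):
        raise ValueError("URL must not contain whitespace")

    # Default to HTTPS when the user typed a bare host such as example.com.
    if "://" not in candidate:
        candidate = "https://" + candidate

    parsed = urlparse(candidate)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("unsupported scheme: " + parsed.scheme)
    if not parsed.hostname:
        raise ValueError("URL has no host")

    return candidate
```

For instance, `normalize_url("example.com/login")` returns `"https://example.com/login"`, while an `ftp://` address raises `ValueError` before any scanning starts.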
- Domain reputation checks with new-domain flagging
- URL risk scoring
- HTML phishing signal extraction
- Gemini multi-step reasoning
- Explainable verdicts and one-click answers
- Screenshot annotation
- Recent scan history
- Downloadable HTML reports
- Dark/light mode
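One way to get both a single URL risk score and an explainable verdict is a weighted average that also reports each signal's contribution. The sketch below is an illustration only: the signal names, weights, and verdict thresholds are assumptions, not the values used in `risk.py`. Weights are renormalized over the signals that are present, so an optional source such as VirusTotal can be absent without deflating the score.

```python
# Hypothetical weights per signal; each signal is itself a 0-100 sub-score.
WEIGHTS = {
    "url": 20,
    "domain": 15,
    "html": 15,
    "ssl": 10,
    "virustotal": 15,
    "gemini": 25,
}


def combine_signals(signals: dict[str, float]) -> tuple[int, str, dict[str, float]]:
    """Return (score 0-100, verdict, per-signal contribution) for the given sub-scores."""
    present = {name: value for name, value in signals.items() if name in WEIGHTS}
    total_weight = sum(WEIGHTS[name] for name in present)
    if total_weight == 0:
        raise ValueError("no recognized signals supplied")

    contributions = {}
    for name, value in present.items():
        clamped = max(0.0, min(100.0, float(value)))
        # Share of the final score attributable to this signal.
        contributions[name] = WEIGHTS[name] * clamped / total_weight

    score = int(round(sum(contributions.values())))

    # Assumed cut-offs for the three verdicts.
    if score < 35:
        verdict = "safe"
    elif score < 70:
        verdict = "suspicious"
    else:
        verdict = "malicious"
    return score, verdict, contributions
```

The contribution map is what makes the verdict explainable: the UI can sort it and show which signals pushed the score up the most.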
- Scan a recently registered or typosquatted domain to demonstrate multi-signal detection.
- Use the quick explanation buttons after a scan to show Gemini’s UX value.
- Export the HTML report to emphasize product polish and real-world usability.
- Highlight that the app does not rely on any single signal, such as SSL certificate age.
- Domain-age lookup uses RDAP, so internet access is needed for live reputation data.
- VirusTotal enrichment only runs when a key is configured.
- The UI stores the latest scan history locally in scan_history.json.
- Reports are saved in the reports/ folder.
- The screenshots/ folder is treated as a temporary workspace and is auto-cleaned after scans and again when the app exits.
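For the RDAP-based domain-age check mentioned in the notes above, the registration date is found in the `events` array of an RDAP domain response, as an entry whose `eventAction` is `registration`. The sketch below only parses an already-fetched response and does no network access; the 30-day "newly registered" threshold and the function names are assumptions for illustration.

```python
from datetime import datetime, timezone
from typing import Optional

NEW_DOMAIN_DAYS = 30  # assumed threshold for flagging a domain as newly registered


def domain_age_days(rdap_response: dict, now: Optional[datetime] = None) -> Optional[int]:
    """Return the domain age in whole days, or None if no registration event exists."""
    now = now or datetime.now(timezone.utc)
    for event in rdap_response.get("events", []):
        if event.get("eventAction") != "registration":
            continue
        raw_date = event.get("eventDate", "")
        try:
            # Older Python versions do not accept a trailing "Z" in fromisoformat.
            registered = datetime.fromisoformat(raw_date.replace("Z", "+00:00"))
        except ValueError:
            return None
        if registered.tzinfo is None:
            registered = registered.replace(tzinfo=timezone.utc)
        return (now - registered).days
    return None


def is_newly_registered(rdap_response: dict, now: Optional[datetime] = None) -> bool:
    """Flag domains younger than NEW_DOMAIN_DAYS; unknown age is not flagged here."""
    age = domain_age_days(rdap_response, now)
    return age is not None and age < NEW_DOMAIN_DAYS
```

Whether an unknown age should count as risky is a policy decision; this sketch leaves it unflagged so that a failed lookup alone does not raise the score.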
See CODE_ARCHITECTURE.md for a full module-by-module and function-by-function explanation.
For a paste-ready hackathon description and demo outline, see SUBMISSION.md.