Phishing URL Detector

A phishing URL detection application that uses machine learning, built with the Starlette framework.

Features

- URL Analysis: Advanced phishing detection using machine learning
- Feature Extraction: Comprehensive URL feature analysis including:
  - Address bar-based features
  - Domain-based features
  - Content-based features
- Modern API Framework: Built with Starlette for high performance and async support
- API Documentation: Automatic OpenAPI/Swagger documentation
- Internationalization: Multi-language support (English and Spanish)
- Web Interface: Clean and intuitive UI for URL analysis
- Real-time Analysis: Immediate feedback on URL legitimacy
- Detailed Reports: Comprehensive feature analysis for each URL check

Prerequisites

Python 3.10+
pip (Python package manager)

Installation

Clone the repository:

git clone <repository-url>
cd phishing-url-detector

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Copy the environment template:

cp .env.example .env

Configuration

Configure your .env file with appropriate values:

# Server
HTTP_SCHEMA=http
HOST=localhost
PROD=False
PORT=8000

# API Spec
OPENAPI_TITLE=Phishing URL Detection API
OPENAPI_DESCRIPTION=A phishing URL detection API using machine learning.
OPENAPI_VERSION=0.0.1
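
To illustrate how these values might fit together, here is a minimal sketch that parses `KEY=VALUE` lines and derives the server's base URL from `HTTP_SCHEMA`, `HOST`, and `PORT`. The helper names `parse_env` and `base_url` are hypothetical and are not taken from the project; the application's own settings loader may work differently.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def base_url(env: dict[str, str]) -> str:
    """Build the base URL from HTTP_SCHEMA, HOST, and PORT (hypothetical helper)."""
    schema = env.get("HTTP_SCHEMA", "http")
    host = env.get("HOST", "localhost")
    port = int(env.get("PORT", "8000"))
    return f"{schema}://{host}:{port}"


# Sample mirroring the server section of the .env file shown above.
SAMPLE = """
# Server
HTTP_SCHEMA=http
HOST=localhost
PROD=False
PORT=8000
"""

if __name__ == "__main__":
    print(base_url(parse_env(SAMPLE)))
```

Note that every parsed value is a string, so a flag such as `PROD=False` would still need an explicit conversion before being used as a boolean.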

Running the Application

Start the development server:

python main.py

Assuming the default configuration, the application will be available at:

Web Interface: http://localhost:8000
API Documentation: http://localhost:8000/docs

Previews

Home

Interactive web interface for URL analysis with real-time results.

The interface lets you select your preferred language.

API Documentation

Comprehensive API documentation with Swagger UI.

Project Structure

├── core/             # Core functionality
├── data/             # Data files
├── dtos/             # Data Transfer Objects
├── extractors/       # URL feature extractors
├── lib/              # Libraries and utilities
├── locales/          # Translation files
├── middlewares/      # Middleware components
├── models/           # ML models and data structures
├── notebooks/        # Jupyter notebooks for ML training
├── routers/          # API routes
├── services/         # Business logic
├── static/           # Static files
├── templates/        # HTML templates
├── tests/            # Test suite
└── utils/            # Utility functions

API Endpoints

POST /predict
- Analyzes a URL for phishing characteristics
- Request body: {"url": "https://example.com"}
- Response: Prediction results with detailed feature analysis
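
As a hedged illustration of calling this endpoint, the sketch below builds the `POST /predict` request with the Python standard library. It assumes the server runs at the default `http://localhost:8000` and returns a JSON body; the function names are hypothetical, and the exact response fields are not documented here, so the result is returned as a plain dictionary.

```python
import json
import urllib.request


def build_predict_request(
    url_to_check: str, base_url: str = "http://localhost:8000"
) -> urllib.request.Request:
    """Build a POST /predict request carrying {"url": ...} as JSON."""
    payload = json.dumps({"url": url_to_check}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/predict",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(
    url_to_check: str,
    base_url: str = "http://localhost:8000",
    timeout: float = 10.0,
) -> dict:
    """Send the request and decode the response, assuming it is JSON."""
    request = build_predict_request(url_to_check, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Requires the development server to be running locally.
    print(predict("https://example.com"))
```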

Development

Machine Learning Model

The model is trained using various URL features, such as:

URL length
Domain characteristics
Content analysis

Training notebooks are available in the notebooks/ directory.

URL Feature Extraction

Features are extracted using the URLFeaturesExtractor class, which analyzes:

Address bar features
Domain-based features
Content-based features
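
To give a sense of what address bar features can look like, here is a minimal standalone sketch. It does not reproduce the `URLFeaturesExtractor` class; the function `address_bar_features` and the feature names are assumptions chosen for illustration, based on signals commonly used in phishing detection.

```python
import ipaddress
from urllib.parse import urlsplit


def _is_ip_address(host: str) -> bool:
    """Return True if the host is a literal IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def address_bar_features(url: str) -> dict[str, int]:
    """Compute a few illustrative address bar features (hypothetical set)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "has_ip_host": int(_is_ip_address(host)),   # raw IP instead of a domain
        "has_at_symbol": int("@" in url),           # "@" anywhere in the URL
        "has_hyphen_in_host": int("-" in host),     # hyphenated host name
        "host_dot_count": host.count("."),          # rough proxy for subdomain depth
        "uses_https": int(parts.scheme == "https"),
    }


if __name__ == "__main__":
    print(address_bar_features("http://192.168.0.1/login"))
```

Domain-based and content-based features would typically need network access (for example DNS or page fetches), which is why this sketch is limited to what can be read from the URL string itself.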

Internationalization

Supports multiple languages through JSON locale files:

English (en.json)
Spanish (es.json)
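
One way such locale files could be consumed is sketched below: load the requested language and fall back to English when a file or key is missing. The functions `load_locale` and `translate`, and the flat key/value layout of the JSON files, are assumptions for illustration and may not match the project's actual loader.

```python
import json
from pathlib import Path


def load_locale(lang: str, locales_dir: Path, default: str = "en") -> dict:
    """Load <lang>.json from locales_dir, falling back to the default language."""
    path = locales_dir / f"{lang}.json"
    if not path.is_file():
        path = locales_dir / f"{default}.json"
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def translate(key: str, messages: dict, fallback: dict) -> str:
    """Look up a key, then try the fallback messages, then return the key itself."""
    return messages.get(key, fallback.get(key, key))


if __name__ == "__main__":
    locales = Path("locales")
    english = load_locale("en", locales)
    spanish = load_locale("es", locales)
    # "title" is a hypothetical key; use one that exists in the locale files.
    print(translate("title", spanish, english))
```

If the language code comes from user input, it should be checked against the list of supported languages before being used to build a file path.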

Testing

Run the test suite:

pytest

Coverage reports are automatically generated through GitHub Actions.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Support

If you find this project useful, give it a ⭐ on GitHub!

daireto/phishing-url-detector
