Overview

CHEP is a comprehensive security assessment toolkit designed for authorized penetration testing and vulnerability scanning. The application performs automated security testing of web applications and source code, identifying common vulnerabilities including XSS, SQL injection, SSRF, CSRF, SSTI, path traversal, command injection, XXE, IDOR, CORS misconfigurations, security header issues, and hardcoded secrets. It provides a command-line interface with terminal-based reporting and maintains a SQLite database to track scan history and findings.

Status: Fully functional CLI tool ready for authorized security testing

Recent Changes (2025-09-30)

Completed full implementation of all core modules
Fixed critical bug in StaticCodeAnalyzer that caused duplicate findings when scanning directories
Added Wayback Machine integration for discovering historical endpoints and URLs
Implemented advanced regex-based endpoint discovery for comprehensive crawling
Created POC generator supporting Python, Java, and Bash exploit scripts
Built self-updating vulnerability pattern system that fetches latest patterns from online sources (tested with 64 XSS and 45 SQLi patterns)
Integrated optional AI-enhanced analysis using local models (no API keys required)
Added comprehensive vulnerability detection patterns for web and code analysis
Implemented URL crawler with configurable depth
Added support for PHP static analysis alongside Python, JavaScript, and TypeScript
Created interactive menu system via run_chep.py for easy access to commands
All commands tested and verified working: scan, analyze, list, show, update

User Preferences

Preferred communication style: Simple, everyday language.

System Architecture

Application Architecture

Pattern: Command-line modular security scanner with SQLite persistence

The application follows a modular architecture where each security testing capability is encapsulated in its own analyzer module. The main entry point (chep.py) orchestrates these modules through a CLI interface, while scan results are persisted to a local SQLite database for historical tracking and reporting.

Rationale: This modular approach allows for independent development and testing of each security scanning capability while maintaining a unified interface for users.

Core Components

1. CLI Interface (`chep.py`, `run_chep.py`)

Purpose: Primary user interaction point with argument parsing
Features: Authorization warnings, command routing, scan orchestration
Design: Uses argparse for command handling with colorama for terminal output formatting

2. Database Layer (`database.py`)

Technology: SQLite with direct SQL queries
Schema: Three primary tables:
- scans: Tracks scan metadata (target, type, timestamp, status)
- vulnerabilities: Stores discovered security issues with severity ratings
- headers: Records security header analysis results
Rationale: SQLite chosen for zero-configuration local persistence without external database requirements
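
As a rough illustration of the three-table layout described above, here is a minimal sketch using the built-in sqlite3 module. Only the table names and their general purpose come from this document; every column name, the `init_db` and `record_scan` helpers, and the default status value are assumptions made for the example and may differ from what `database.py` actually does.

```python
import sqlite3

# Hypothetical column layout: only the table names (scans, vulnerabilities,
# headers) and their purpose are taken from the description above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS scans (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    target    TEXT NOT NULL,
    scan_type TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    status    TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS vulnerabilities (
    id        INTEGER PRIMARY KEY AUTOINCREMENT,
    scan_id   INTEGER NOT NULL REFERENCES scans(id),
    vuln_type TEXT NOT NULL,
    severity  TEXT NOT NULL,
    evidence  TEXT
);
CREATE TABLE IF NOT EXISTS headers (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    scan_id        INTEGER NOT NULL REFERENCES scans(id),
    name           TEXT NOT NULL,
    present        INTEGER NOT NULL,
    recommendation TEXT
);
"""


def init_db(path=":memory:"):
    """Open (or create) the database and make sure all tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def record_scan(conn, target, scan_type, timestamp, status="completed"):
    """Insert one scan row with a parameterized query and return its id."""
    cur = conn.execute(
        "INSERT INTO scans (target, scan_type, timestamp, status) "
        "VALUES (?, ?, ?, ?)",
        (target, scan_type, timestamp, status),
    )
    conn.commit()
    return cur.lastrowid
```

Passing `":memory:"` keeps the example self-contained; the tool itself stores its data in a file on the local filesystem.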

3. Security Analysis Modules

Header Analyzer (header_analyzer.py)

Evaluates HTTP security headers against best practices
Checks for HSTS, CSP, X-Frame-Options, and other protective headers
Provides remediation recommendations for missing headers
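
The header check described above can be sketched roughly as follows. The header names are standard HTTP response headers, but the recommendation strings, the `RECOMMENDED_HEADERS` table, and the `find_missing_headers` function are illustrative assumptions, not the contents of `header_analyzer.py`.

```python
# Standard security headers mapped to example remediation advice.
# The advice text and this table's name are assumptions for illustration.
RECOMMENDED_HEADERS = {
    "Strict-Transport-Security": "Add HSTS, e.g. max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "Define a CSP that restricts script sources",
    "X-Frame-Options": "Set to DENY or SAMEORIGIN to limit framing",
    "X-Content-Type-Options": "Set to nosniff",
}


def find_missing_headers(response_headers):
    """Return {header: advice} for every recommended header that is absent.

    Header names are compared case-insensitively, as HTTP requires.
    """
    present = {name.lower() for name in response_headers}
    return {
        name: advice
        for name, advice in RECOMMENDED_HEADERS.items()
        if name.lower() not in present
    }
```

The function accepts any mapping of header names to values, so it could be fed the `headers` attribute of a `requests` response.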

Vulnerability Scanner (vulnerability_scanner.py)

Pattern-based detection for common web vulnerabilities
Scans for: XSS, SQL injection, SSRF, CSRF, SSTI, path traversal, command injection, XXE, CORS misconfigurations, IDOR
Includes URL crawler with configurable depth for multi-page scanning
Uses regex patterns to identify potential attack vectors in responses
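
To make the idea of regex-based response scanning concrete, here is a small sketch. The two patterns and the names `RESPONSE_PATTERNS` and `scan_response` are hypothetical examples chosen for illustration; the real pattern set lives in vulnerability_patterns.json and is likely much larger.

```python
import re

# Hypothetical signatures: each maps a finding label to a regex that
# suggests the issue when it appears in a response body.
RESPONSE_PATTERNS = {
    "SQL injection (error disclosure)": re.compile(
        r"you have an error in your sql syntax|unclosed quotation mark",
        re.IGNORECASE,
    ),
    "Path traversal (file disclosure)": re.compile(r"root:.*:0:0:"),
}


def scan_response(body):
    """Return one finding per pattern that matches the response body."""
    findings = []
    for name, pattern in RESPONSE_PATTERNS.items():
        match = pattern.search(body)
        if match:
            findings.append({"type": name, "evidence": match.group(0)})
    return findings
```

A match only indicates a potential issue; the evidence string is kept so a reviewer can confirm or dismiss the finding by hand.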

Static Code Analyzer (static_analyzer.py)

Source code security analysis for Python, JavaScript, TypeScript, and PHP
Detects: SQL injection patterns, command injection, hardcoded secrets, insecure deserialization, XSS vectors, file inclusion vulnerabilities
Supports both single-file and directory scanning with pattern matching
Fixed: Findings are now collected in local variables for each file, so results no longer duplicate when scanning multiple files
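
The duplicate-findings fix noted above can be pictured with the following sketch, in which each file gets its own local findings list instead of sharing one across calls. The secret-detection regex, the extension list, and both function names are assumptions for illustration and are not taken from `static_analyzer.py`.

```python
import os
import re

# Hypothetical pattern for one of the listed checks (hardcoded secrets).
SECRET_PATTERN = re.compile(
    r"(password|api_key|secret)\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE
)
SUPPORTED_EXTENSIONS = (".py", ".js", ".ts", ".php")


def analyze_file(path):
    """Scan one file and return its findings."""
    findings = []  # local list: nothing carries over from earlier files
    with open(path, encoding="utf-8", errors="ignore") as handle:
        for line_number, line in enumerate(handle, start=1):
            if SECRET_PATTERN.search(line):
                findings.append(
                    {"file": path, "line": line_number, "type": "hardcoded secret"}
                )
    return findings


def analyze_directory(root):
    """Walk a directory tree and combine the per-file findings."""
    all_findings = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in sorted(filenames):
            if filename.endswith(SUPPORTED_EXTENSIONS):
                all_findings.extend(analyze_file(os.path.join(dirpath, filename)))
    return all_findings
```

Because `analyze_file` returns a fresh list on every call, the directory scan can simply extend its own accumulator without earlier results being counted again.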

Reconnaissance Module (reconnaissance.py)

Information gathering capabilities
DNS resolution, server fingerprinting, technology detection
Checks for robots.txt, sitemaps, and subdomain hints

Wayback Machine Scanner (wayback_scanner.py)

Integrates with Internet Archive's Wayback Machine API
Discovers historical endpoints and URLs that may no longer be visible
No API key required - uses public Wayback CDX API
Filters and deduplicates discovered endpoints for efficient scanning
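
A minimal sketch of how a CDX query could be built and its JSON response reduced to a deduplicated URL list is shown below. The query parameters (`url`, `output`, `fl`, `collapse`, `limit`) are part of the public CDX server API; the function names, the default limit, and the choice to request only the `original` field are assumptions and may not match `wayback_scanner.py`.

```python
import json
from urllib.parse import urlencode

CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain, limit=500):
    """Build a CDX URL that lists archived URLs under a domain.

    The default limit of 500 is an arbitrary choice for this sketch.
    """
    params = {
        "url": f"{domain}/*",   # everything under the domain
        "output": "json",       # JSON rows instead of plain text
        "fl": "original",       # only return the original URL field
        "collapse": "urlkey",   # ask the server to collapse repeats
        "limit": str(limit),
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_response(text):
    """Turn a CDX JSON response into a sorted, deduplicated URL list."""
    rows = json.loads(text) if text.strip() else []
    # In JSON output the first row holds the field names, so skip it.
    urls = [row[0] for row in rows[1:]]
    return sorted(set(urls))
```

The response text would typically come from `requests.get(build_cdx_query(domain)).text`; the parsing step is kept separate so it can be exercised without network access.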

Endpoint Discovery Module (endpoint_discovery.py)

Advanced regex-based endpoint extraction from JavaScript and HTML
Discovers API endpoints, hidden URLs, and parameters
Extracts endpoints from inline JavaScript, external scripts, and HTML attributes
Provides comprehensive endpoint mapping for thorough testing
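
The sketch below shows one way regex-based endpoint extraction could work on a page containing both HTML attributes and inline JavaScript. Both patterns and the `discover_endpoints` function are simplified assumptions for illustration; the patterns in `endpoint_discovery.py` are presumably more extensive.

```python
import re

# Quoted strings that look like absolute paths, e.g. "/api/v1/users?id=1".
QUOTED_PATH = re.compile(r"""["'](/[A-Za-z0-9_\-./]+(?:\?[^"'\s]*)?)["']""")
# Values of href, src, and action attributes (fragments excluded).
HTML_ATTR = re.compile(
    r"""(?:href|src|action)\s*=\s*["']([^"'#]+)["']""", re.IGNORECASE
)


def discover_endpoints(content):
    """Return a sorted, deduplicated list of endpoint candidates."""
    found = set(QUOTED_PATH.findall(content))
    found.update(HTML_ATTR.findall(content))
    return sorted(found)
```

Collecting matches in a set means an endpoint referenced by both an attribute and a script is reported once.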

POC Generator (poc_generator.py)

Automatically generates proof-of-concept exploit scripts
Supports Python, Java, and Bash script formats
Generates POCs for XSS, SQLi, SSRF, CSRF, Command Injection, Path Traversal, SSTI, XXE, IDOR, and CORS
Saves scripts to pocs/ directory for immediate testing

Pattern Updater (pattern_updater.py)

Self-updating vulnerability detection patterns
Fetches latest XSS and SQL injection patterns from online sources
Maintains version control and update timestamps
Stores patterns in vulnerability_patterns.json
Can force updates or check automatically based on age
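
The age-based update check could look something like the following sketch. The seven-day threshold, the `last_updated` field name, and the `needs_update` function are all assumptions; this document does not specify how vulnerability_patterns.json records its timestamp or how old patterns must be before a refresh.

```python
import json
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=7)  # assumed threshold, not taken from the tool


def needs_update(pattern_file_text, now=None, force=False):
    """Decide whether the pattern file should be refreshed.

    Returns True when forced, when the file is unreadable, or when the
    hypothetical "last_updated" timestamp is older than MAX_AGE.
    """
    if force:
        return True
    now = now or datetime.now(timezone.utc)
    try:
        data = json.loads(pattern_file_text)
        last_updated = datetime.fromisoformat(data["last_updated"])
    except (ValueError, KeyError, TypeError):
        return True  # missing or corrupt metadata: fetch fresh patterns
    if last_updated.tzinfo is None:
        # Treat timestamps without an offset as UTC.
        last_updated = last_updated.replace(tzinfo=timezone.utc)
    return now - last_updated > MAX_AGE
```

Treating an unreadable file as stale means a damaged pattern store repairs itself on the next run.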

AI Enhancer (ai_enhancer.py)

Optional AI-powered vulnerability analysis
Supports local analysis without API keys (Hugging Face integration available)
Provides confidence scoring and context analysis for findings
Detects technologies and security misconfigurations
Enhances pattern-based detection with intelligent analysis

4. Reporting System (`report_generator.py`)

Format: Terminal-based colored output with optional JSON/text export
Content: Severity-based vulnerability categorization, detailed findings with evidence
Design: Structured reporting with scan metadata and actionable recommendations
Export Options: Terminal display, JSON file, plain text file
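
Severity-based grouping and JSON export can be sketched as follows. The severity levels, their ordering, the finding dictionary shape, and both function names are assumptions for illustration rather than the actual structure used by `report_generator.py`.

```python
import json

# Assumed severity scale, most to least severe.
SEVERITY_ORDER = ["critical", "high", "medium", "low", "info"]


def group_by_severity(findings):
    """Group findings by severity, most severe first, skipping empty levels."""
    grouped = {level: [] for level in SEVERITY_ORDER}
    for finding in findings:
        level = finding.get("severity", "info").lower()
        grouped.setdefault(level, []).append(finding)
    return {level: items for level, items in grouped.items() if items}


def to_json_report(scan_meta, findings):
    """Serialize scan metadata plus grouped findings as a JSON string."""
    report = {"scan": scan_meta, "findings": group_by_severity(findings)}
    return json.dumps(report, indent=2)
```

The same grouped structure could drive the colored terminal output, with the JSON and plain-text exports being alternative renderings of it.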

5. CLI Commands

scan: Scan URLs for vulnerabilities with multiple options:
- -a, --all: Perform all security checks
- -w, --wayback: Use Wayback Machine for historical endpoint discovery
- -e, --endpoints: Advanced regex-based endpoint discovery
- --ai: Enable AI-enhanced vulnerability analysis
- --generate-poc: Generate POC scripts in Python/Java/Bash
- -H, --headers: Analyze security headers
- -v, --vulnerabilities: Scan for vulnerabilities
- -r, --recon: Perform reconnaissance
- -d, --depth N: Set crawl depth
analyze: Perform static code analysis on files or directories
list: Display all previous scans from database
show: View detailed results from specific scan ID (supports --generate-poc)
update: Update vulnerability patterns from online sources (supports --force)
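
The command layout listed above maps naturally onto argparse subparsers. The sketch below mirrors the documented commands and flags; the positional argument names (`url`, `path`, `scan_id`), the default crawl depth of 1, and the `build_parser` function are assumptions, so the real `chep.py` may differ in those details.

```python
import argparse


def build_parser():
    """Build a parser with the scan, analyze, list, show, and update commands."""
    parser = argparse.ArgumentParser(prog="chep")
    sub = parser.add_subparsers(dest="command", required=True)

    scan = sub.add_parser("scan", help="Scan a URL for vulnerabilities")
    scan.add_argument("url")  # positional name is an assumption
    scan.add_argument("-a", "--all", action="store_true")
    scan.add_argument("-w", "--wayback", action="store_true")
    scan.add_argument("-e", "--endpoints", action="store_true")
    scan.add_argument("--ai", action="store_true")
    scan.add_argument("--generate-poc", action="store_true")
    scan.add_argument("-H", "--headers", action="store_true")
    scan.add_argument("-v", "--vulnerabilities", action="store_true")
    scan.add_argument("-r", "--recon", action="store_true")
    # Default depth of 1 is assumed; the document does not state one.
    scan.add_argument("-d", "--depth", type=int, default=1, metavar="N")

    analyze = sub.add_parser("analyze", help="Static analysis of files or directories")
    analyze.add_argument("path")  # positional name is an assumption

    sub.add_parser("list", help="List previous scans")

    show = sub.add_parser("show", help="Show results for one scan")
    show.add_argument("scan_id", type=int)  # positional name is an assumption
    show.add_argument("--generate-poc", action="store_true")

    update = sub.add_parser("update", help="Update vulnerability patterns")
    update.add_argument("--force", action="store_true")
    return parser
```

With this layout, `build_parser().parse_args(["scan", "https://example.com", "-H", "-d", "3"])` yields a namespace whose `command` is `"scan"`, `headers` is `True`, and `depth` is `3`.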

Security Design Decisions

Authorization Flow

Approach: Interactive authorization confirmation before any scanning
Implementation: Terminal prompt requiring explicit "yes" response
Rationale: Legal compliance and ethical hacking best practices enforcement
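
A confirmation gate of this kind can be sketched in a few lines. The prompt wording, the `confirm_authorization` name, the injectable `ask` parameter, and the case-insensitive comparison are assumptions; the document only states that an explicit "yes" is required.

```python
def confirm_authorization(target, ask=input):
    """Return True only if the user types an explicit "yes".

    The `ask` parameter defaults to input() and exists so the prompt
    can be replaced in automated tests.
    """
    print(f"WARNING: only scan systems you are authorized to test. Target: {target}")
    answer = ask("Do you have explicit authorization to scan this target? (yes/no): ")
    # Anything other than the full word "yes" (ignoring case and
    # surrounding whitespace) is treated as a refusal.
    return answer.strip().lower() == "yes"
```

Rejecting abbreviations such as "y" keeps the confirmation deliberate rather than reflexive.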

Detection Methods

Pattern-based scanning: Regex patterns for vulnerability detection
Pros: Fast execution, no risk of damaging the target through active exploitation
Cons: May miss context-specific vulnerabilities, requires pattern maintenance
Alternative considered: Active exploitation testing (rejected due to safety concerns)

Data Storage

Local SQLite database for scan history
Pros: No external dependencies, portable, simple queries
Cons: Not suitable for multi-user environments or distributed scanning
Alternative considered: JSON flat files (rejected for querying capabilities)

Technology Stack

Language: Python 3
HTTP Client: requests library for web scanning
Database: SQLite3 (built-in)
Terminal UI: colorama for cross-platform colored output
Pattern Matching: Python re module for regex-based detection

External Dependencies

Python Libraries

Core Dependencies

requests: HTTP client for web scanning, Wayback API integration, and pattern updates
beautifulsoup4: HTML parsing for web crawling, link extraction, and endpoint discovery
lxml: XML/HTML parser used by BeautifulSoup
colorama: Cross-platform terminal color formatting for user interface
sqlite3: Built-in Python module for local database operations

Optional Dependencies (for AI features)

transformers: Hugging Face transformers library for AI-enhanced analysis (optional)
torch: PyTorch backend for ML models (optional)

Standard Library

argparse: Command-line argument parsing
re: Regular expression pattern matching for vulnerability detection
socket: DNS resolution and network operations
urllib.parse: URL parsing and manipulation
json: Data serialization for reports
datetime: Timestamp generation
subprocess: Referenced in detection patterns only, not actively used by the tool's own code
os: File system operations for static analysis

Database

SQLite: Embedded relational database

No external server required
Database file: chep.db (local filesystem)
Direct SQL query execution without ORM

External Services

Internet Archive Wayback Machine API

Optional: Historical endpoint discovery feature (-w, --wayback)
Public API, no authentication required
Endpoint: http://web.archive.org/cdx/search/cdx

Vulnerability Pattern Sources

Optional: Self-updating pattern system (update command)
Fetches XSS and SQLi patterns from online sources
Uses GitHub raw content and public vulnerability databases
No authentication required

Note: All external services are optional. Core scanning does not depend on any third-party service, and static code analysis works completely offline.

File System Dependencies

Read access to source code directories for static analysis
Write access for SQLite database file creation/updates
Optional file output for saved reports
