Smart Contract Vulnerability Analyzer

Automated vulnerability detection for Ethereum smart contracts written in Solidity. A multi-engine pipeline (static analysis, IR analysis, call graph analysis, cross-contract analysis, taint tracking, symbolic verification, and LLM reasoning) finds security issues in codebases from Code4rena audit contests.

Setup

# Install Rust (if needed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Build
cargo build --release

# Set your API key (required for LLM stages)
cp .env.example .env
# Edit .env with your ANTHROPIC_API_KEY

Usage

# Single contest analysis
cargo run -- --contest dataset/train/contracts/2024-01-curves

# Batch mode (all contests)
cargo run -- --batch

# Dry run (engines only, no LLM calls)
cargo run -- --contest dataset/train/contracts/2024-01-curves --dry-run

# Score predictions against labels
cargo run -- --batch --dry-run --score

# Custom output directory
cargo run -- --batch --output-dir /tmp/results

# Report formats: legacy (default), json, html
cargo run -- --batch --dry-run --output-format json
cargo run -- --batch --dry-run --output-format html --output-dir /tmp/reports

CLI Options

Flag	Description
`--contest <PATH>`	Contest directory to analyze
`--batch`	Run all contests in dataset
`--dataset <PATH>`	Dataset directory (default: `dataset/train`)
`--dry-run`	Skip LLM calls, run engines only
`--score`	Score predictions against labels
`--output-dir <PATH>`	Write per-contest results to directory
`--output-format <FMT>`	Output format: `legacy` (default), `json`, `html`
`--min-confidence <F>`	Minimum confidence threshold (0.0–1.0)
`--confidence-weights <PATH>`	Custom confidence weights JSON
`--no-cache`	Bypass analysis cache
`--clear-cache`	Clear cache and exit
`--verbose`	DEBUG-level tracing
`--quiet`	Errors only
`--log-format <FMT>`	`text` or `json`
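
The effect of `--min-confidence` can be pictured as a filter over findings. The following is a minimal sketch, not the tool's implementation: it assumes each finding is a dict with a `confidence` field (as in the tool's JSON output) and that the cut-off is inclusive, which the option description does not state. `filter_by_confidence` is a hypothetical helper name.

```python
def filter_by_confidence(findings, min_confidence):
    """Keep findings whose confidence is at or above the threshold.

    Assumption: the comparison is inclusive; the real tool may differ.
    """
    if not 0.0 <= min_confidence <= 1.0:
        raise ValueError("min_confidence must be between 0.0 and 1.0")
    # Findings without a confidence value are treated as 0.0 (an assumption).
    return [f for f in findings if f.get("confidence", 0.0) >= min_confidence]


if __name__ == "__main__":
    findings = [
        {"vulnerability_type": "Reentrancy", "confidence": 0.85},
        {"vulnerability_type": "Access control", "confidence": 0.40},
    ]
    for finding in filter_by_confidence(findings, 0.5):
        print(finding["vulnerability_type"], finding["confidence"])
```

The same filter can be applied to a saved report when a stricter threshold is wanted after the fact, without re-running the pipeline.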

Pipeline Stages

1. Scope filtering — reads scope.txt, restricts analysis to in-scope .sol files
2. Static analysis triage — filters ~80 detectors, drops noise slugs, classifies findings
3. IR analysis — contract structure, inheritance, state variables, function patterns
4. Call graph analysis — inter-function call relationships, external call patterns
5. Cross-contract analysis — multi-contract interactions, shared state
6. Taint analysis — tracks tainted data flow from sources (msg.sender, msg.value) to sinks (selfdestruct, delegatecall, storage writes)
7. Symbolic verification — validates/kills candidates using symbolic execution data
8. LLM analysis — per-file and project-level prompts with full context from prior stages
9. Post-processing — deduplication, confidence scoring, taxonomy normalization
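
The taint analysis stage listed above can be illustrated as reachability from sources to sinks over a data-flow graph. This is a simplified sketch under stated assumptions, not the analyzer's engine: the `flows` mapping, the node names in the demo, and the use of `"sstore"` to stand in for storage writes are all hypothetical.

```python
from collections import deque

# Sources and sinks named in the stage description.
SOURCES = {"msg.sender", "msg.value"}
# "sstore" is a hypothetical label standing in for storage writes.
SINKS = {"selfdestruct", "delegatecall", "sstore"}


def tainted_sinks(flows):
    """Return the sinks reachable from any taint source.

    `flows` maps a value or operation name to the names its data flows into.
    """
    seen = set(SOURCES)
    queue = deque(SOURCES)
    while queue:
        node = queue.popleft()
        for successor in flows.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return sorted(seen & SINKS)


if __name__ == "__main__":
    flows = {
        "msg.sender": ["target"],
        "target": ["delegatecall"],
        "msg.value": ["amount"],
        "amount": ["balance_update"],
    }
    print(tainted_sinks(flows))  # ['delegatecall']
```

A breadth-first walk is enough here because the sketch only asks whether a sink is reachable; a real engine would also need to model sanitizers and per-function context.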

flowchart LR
    A[main.rs]
    L[loader/<br/>scope, SA, IR,<br/>symbolic data]
    E[engines/<br/>SA, IR, call graph,<br/>cross-contract,<br/>taint, symbolic]
    LLM[LLM<br/>per-file +<br/>project pass]
    P[postprocess/<br/>parse, validate,<br/>dedup, confidence]
    R[report/<br/>legacy/json/html]
    S[score.rs]

    A -->|1| L
    L -->|2| E
    E -->|3| LLM
    LLM -->|4| P
    P -->|5| R
    R -->|6| S
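
The deduplication step in the `postprocess/` box can be sketched as follows. The duplicate criterion used here (same file, same vulnerability type, overlapping line ranges, keep the highest confidence) is an assumption for illustration; the source does not describe the actual rule, and `deduplicate` is a hypothetical name.

```python
def overlaps(a, b):
    """True when the two findings' line ranges share at least one line."""
    return a["start_line"] <= b["end_line"] and b["start_line"] <= a["end_line"]


def deduplicate(findings):
    """Drop lower-confidence findings that duplicate an already kept one.

    Assumed duplicate rule: same file, same vulnerability type,
    overlapping line ranges.
    """
    kept = []
    # Visit the most confident findings first so they win ties.
    for f in sorted(findings, key=lambda f: f["confidence"], reverse=True):
        is_duplicate = any(
            k["file"] == f["file"]
            and k["vulnerability_type"] == f["vulnerability_type"]
            and overlaps(k, f)
            for k in kept
        )
        if not is_duplicate:
            kept.append(f)
    return kept


if __name__ == "__main__":
    findings = [
        {"file": "contracts/Curves.sol", "vulnerability_type": "Reentrancy",
         "confidence": 0.60, "start_line": 40, "end_line": 50},
        {"file": "contracts/Curves.sol", "vulnerability_type": "Reentrancy",
         "confidence": 0.85, "start_line": 42, "end_line": 55},
    ]
    for f in deduplicate(findings):
        print(f["file"], f["start_line"], f["end_line"], f["confidence"])
```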

Dataset

Training data: 10 Code4rena audit contests with labeled findings in dataset/train/labels.json.

Each contest provides pre-computed analysis data:

IR (ir.json) — contracts, functions, state variables, inheritance
Static analysis (static_analysis.json) — pattern-based detector findings
Symbolic execution (symbolic_execution.json) — execution paths and constraints (optional, not all contests have this)
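
A minimal sketch of reading these files for one contest, treating the symbolic execution data as optional. It assumes the three files sit directly inside the contest directory and that the first two are always present; neither is stated above, and `load_contest_data` is a hypothetical helper. The JSON is loaded as-is, with no assumptions about its schema.

```python
import json
from pathlib import Path

# (key, filename, required) for each pre-computed input.
INPUTS = [
    ("ir", "ir.json", True),
    ("static_analysis", "static_analysis.json", True),
    ("symbolic_execution", "symbolic_execution.json", False),
]


def load_contest_data(contest_dir):
    """Load the pre-computed JSON files; missing optional files become None."""
    contest_dir = Path(contest_dir)
    data = {}
    for key, filename, required in INPUTS:
        path = contest_dir / filename
        if path.exists():
            data[key] = json.loads(path.read_text(encoding="utf-8"))
        elif required:
            raise FileNotFoundError(path)
        else:
            data[key] = None
    return data


if __name__ == "__main__":
    import tempfile

    # Demo on a throwaway directory with placeholder contents,
    # so the sketch runs without the real dataset.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "ir.json").write_text("{}", encoding="utf-8")
        Path(tmp, "static_analysis.json").write_text("[]", encoding="utf-8")
        data = load_contest_data(tmp)
        print(sorted(key for key, value in data.items() if value is not None))
```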

Output Format

[
  {
    "contest": "2024-01-curves",
    "file": "contracts/Curves.sol",
    "vulnerability_type": "Reentrancy",
    "severity": "High",
    "description": "The sellCurvesToken function transfers ETH before updating state...",
    "confidence": 0.85,
    "source": "sa_reentrancy",
    "start_line": 42,
    "end_line": 55
  }
]
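
A report in this shape can be consumed with a few lines of Python. The sketch below assumes the top level is a JSON array of objects with exactly the fields shown above; `summarize` is a hypothetical helper, and the sample string simply repeats the example finding.

```python
import json

SAMPLE = """
[
  {
    "contest": "2024-01-curves",
    "file": "contracts/Curves.sol",
    "vulnerability_type": "Reentrancy",
    "severity": "High",
    "description": "The sellCurvesToken function transfers ETH before updating state...",
    "confidence": 0.85,
    "source": "sa_reentrancy",
    "start_line": 42,
    "end_line": 55
  }
]
"""


def summarize(report_json):
    """Return one human-readable line per finding in the report."""
    lines = []
    for f in json.loads(report_json):
        location = f'{f["file"]}:{f["start_line"]}-{f["end_line"]}'
        lines.append(
            f'{location} [{f["severity"]}] {f["vulnerability_type"]} '
            f'({f["confidence"]:.2f})'
        )
    return lines


if __name__ == "__main__":
    for line in summarize(SAMPLE):
        print(line)  # contracts/Curves.sol:42-55 [High] Reentrancy (0.85)
```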

Scoring

# Score engine predictions against training labels
cargo run -- --batch --dry-run --score

# Score with LLM predictions
cargo run -- --batch --score

Scoring reports precision, recall, and F1 for each contest and in aggregate, with breakdowns by severity and by detector plus a confusion matrix.
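
For reference, the three metrics follow the standard definitions: precision is the share of predictions that match a label, recall is the share of labels that were predicted, and F1 is their harmonic mean. The sketch below computes them over sets of finding keys. How the tool decides that a prediction matches a label is not described here, so the `(contest, file, vulnerability_type)` key is purely an assumption for illustration.

```python
def precision_recall_f1(predicted, labeled):
    """Compute precision, recall, and F1 over two collections of finding keys."""
    predicted, labeled = set(predicted), set(labeled)
    true_positives = len(predicted & labeled)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(labeled) if labeled else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical keys: (contest, file, vulnerability_type).
    predicted = [
        ("2024-01-curves", "contracts/Curves.sol", "Reentrancy"),
        ("2024-01-curves", "contracts/Curves.sol", "Access control"),
        ("2024-01-curves", "contracts/Other.sol", "Reentrancy"),
    ]
    labeled = [
        ("2024-01-curves", "contracts/Curves.sol", "Reentrancy"),
        ("2024-01-curves", "contracts/Curves.sol", "Access control"),
        ("2024-01-curves", "contracts/Curves.sol", "Denial of service"),
        ("2024-01-curves", "contracts/Other.sol", "Access control"),
    ]
    p, r, f1 = precision_recall_f1(predicted, labeled)
    print(f"{p:.2f} {r:.2f} {f1:.2f}")  # 0.67 0.50 0.57
```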
