Dnreikronos/vulnerability-analyzis

Smart Contract Vulnerability Analyzer

Automated vulnerability detection for Ethereum smart contracts (Solidity). A multi-engine pipeline (static analysis, IR analysis, call graph, taint tracking, symbolic verification, and LLM reasoning) finds security issues in contracts from Code4rena audit contests.

Setup

# Install Rust (if needed)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Build
cargo build --release

# Set your API key (required for LLM stages)
cp .env.example .env
# Edit .env with your ANTHROPIC_API_KEY

Usage

# Single contest analysis
cargo run -- --contest dataset/train/contracts/2024-01-curves

# Batch mode (all contests)
cargo run -- --batch

# Dry run (engines only, no LLM calls)
cargo run -- --contest dataset/train/contracts/2024-01-curves --dry-run

# Score predictions against labels
cargo run -- --batch --dry-run --score

# Custom output directory
cargo run -- --batch --output-dir /tmp/results

# Report formats: legacy (default), json, html
cargo run -- --batch --dry-run --output-format json
cargo run -- --batch --dry-run --output-format html --output-dir /tmp/reports

CLI Options

Flag                         Description
--contest <PATH>             Contest directory to analyze
--batch                      Run all contests in dataset
--dataset <PATH>             Dataset directory (default: dataset/train)
--dry-run                    Skip LLM calls, run engines only
--score                      Score predictions against labels
--output-dir <PATH>          Write per-contest results to directory
--output-format <FMT>        Output format: legacy, json, html
--min-confidence <F>         Minimum confidence threshold (0.0–1.0)
--confidence-weights <PATH>  Custom confidence weights JSON
--no-cache                   Bypass analysis cache
--clear-cache                Clear cache and exit
--verbose                    DEBUG-level tracing
--quiet                      Errors only
--log-format <FMT>           text or json

Pipeline Stages

  1. Scope filtering — reads scope.txt, restricts analysis to in-scope .sol files
  2. Static analysis triage — filters ~80 detectors, drops noise slugs, classifies findings
  3. IR analysis — contract structure, inheritance, state variables, function patterns
  4. Call graph analysis — inter-function call relationships, external call patterns
  5. Cross-contract analysis — multi-contract interactions, shared state
  6. Taint analysis — tracks tainted data flow from sources (msg.sender, msg.value) to sinks (selfdestruct, delegatecall, storage writes)
  7. Symbolic verification — validates/kills candidates using symbolic execution data
  8. LLM analysis — per-file and project-level prompts with full context from prior stages
  9. Post-processing — deduplication, confidence scoring, taxonomy normalization
flowchart LR
    A[main.rs]
    L[loader/<br/>scope, SA, IR,<br/>symbolic data]
    E[engines/<br/>SA, IR, call graph,<br/>cross-contract,<br/>taint, symbolic]
    LLM[LLM<br/>per-file +<br/>project pass]
    P[postprocess/<br/>parse, validate,<br/>dedup, confidence]
    R[report/<br/>legacy/json/html]
    S[score.rs]

    A -->|1| L
    L -->|2| E
    E -->|3| LLM
    LLM -->|4| P
    P -->|5| R
    R -->|6| S
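
The taint stage (step 6 in the list above) can be pictured as forward propagation from sources to sinks. The following is a minimal sketch, not the analyzer's actual implementation: the `Stmt` type and the `tainted_sinks` function are hypothetical, the program is assumed to be straight-line (no branches or loops), and reassignment never clears taint.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified statement form; not the analyzer's real IR.
enum Stmt<'a> {
    /// `dst` is computed from the listed operands.
    Assign { dst: &'a str, operands: Vec<&'a str> },
    /// A sensitive operation (e.g. delegatecall) that consumes `arg`.
    Sink { name: &'a str, arg: &'a str },
}

/// Returns the names of sinks reached by tainted data, in program order.
fn tainted_sinks<'a>(sources: &[&'a str], stmts: &[Stmt<'a>]) -> Vec<&'a str> {
    // Start with the taint sources themselves (e.g. msg.sender, msg.value).
    let mut tainted: HashSet<&'a str> = sources.iter().copied().collect();
    let mut hits = Vec::new();
    for stmt in stmts {
        match stmt {
            // A value becomes tainted if any operand is tainted.
            Stmt::Assign { dst, operands } => {
                if operands.iter().any(|op| tainted.contains(op)) {
                    tainted.insert(*dst);
                }
            }
            // Report a sink only when its argument carries taint.
            Stmt::Sink { name, arg } => {
                if tainted.contains(arg) {
                    hits.push(*name);
                }
            }
        }
    }
    hits
}

fn main() {
    let stmts = vec![
        Stmt::Assign { dst: "target", operands: vec!["msg.sender"] },
        Stmt::Assign { dst: "fee", operands: vec!["BASE_FEE"] },
        Stmt::Sink { name: "delegatecall", arg: "target" },
        Stmt::Sink { name: "storage_write", arg: "fee" },
    ];
    let hits = tainted_sinks(&["msg.sender", "msg.value"], &stmts);
    println!("{:?}", hits);
}
```

Here only the `delegatecall` sink is reported, because `fee` is derived from an untainted constant. A real analysis would also need to handle control flow and calls between functions.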

Dataset

Training data: 10 Code4rena audit contests with labeled findings in dataset/train/labels.json.

Each contest provides pre-computed analysis data:

  • IR (ir.json) — contracts, functions, state variables, inheritance
  • Static analysis (static_analysis.json) — pattern-based detector findings
  • Symbolic execution (symbolic_execution.json) — execution paths and constraints (optional, not all contests have this)
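
Because the symbolic execution file is optional, a loader has to tolerate its absence. The sketch below probes a contest directory for the three files named above. It assumes the JSON files sit directly inside the contest directory, which this README does not confirm, and `ContestInputs` and `probe_inputs` are hypothetical names.

```rust
use std::path::Path;

/// Which pre-computed inputs were found for one contest.
#[derive(Debug, PartialEq)]
struct ContestInputs {
    has_ir: bool,
    has_static_analysis: bool,
    has_symbolic: bool,
}

/// Assumption: the JSON files live directly inside `contest_dir`.
fn probe_inputs(contest_dir: &Path) -> ContestInputs {
    ContestInputs {
        has_ir: contest_dir.join("ir.json").is_file(),
        has_static_analysis: contest_dir.join("static_analysis.json").is_file(),
        has_symbolic: contest_dir.join("symbolic_execution.json").is_file(),
    }
}

fn main() {
    let inputs = probe_inputs(Path::new("dataset/train/contracts/2024-01-curves"));
    if !inputs.has_symbolic {
        // Not an error: some contests ship without symbolic execution data.
        println!("symbolic execution data not present (optional)");
    }
    println!("{:?}", inputs);
}
```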

Output Format

[
  {
    "contest": "2024-01-curves",
    "file": "contracts/Curves.sol",
    "vulnerability_type": "Reentrancy",
    "severity": "High",
    "description": "The sellCurvesToken function transfers ETH before updating state...",
    "confidence": 0.85,
    "source": "sa_reentrancy",
    "start_line": 42,
    "end_line": 55
  }
]
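
For consumers of this output, one entry maps naturally onto a plain struct. The sketch below mirrors the JSON fields and applies a confidence cutoff in the spirit of `--min-confidence`. The `Finding` type, the `filter_by_confidence` function, and the inclusive comparison are assumptions made for illustration; the analyzer's own types and threshold semantics may differ.

```rust
/// Mirrors the fields of one entry in the JSON output (hypothetical type).
#[allow(dead_code)] // not every field is read in this short example
#[derive(Debug, Clone, PartialEq)]
struct Finding {
    contest: String,
    file: String,
    vulnerability_type: String,
    severity: String,
    description: String,
    confidence: f64,
    source: String,
    start_line: u32,
    end_line: u32,
}

/// Keeps findings whose confidence is at least `min_confidence`.
/// Assumption: the threshold is inclusive.
fn filter_by_confidence(findings: Vec<Finding>, min_confidence: f64) -> Vec<Finding> {
    findings
        .into_iter()
        .filter(|f| f.confidence >= min_confidence)
        .collect()
}

/// Example helper: the sample finding shown above, with a chosen confidence.
fn sample(confidence: f64) -> Finding {
    Finding {
        contest: "2024-01-curves".to_string(),
        file: "contracts/Curves.sol".to_string(),
        vulnerability_type: "Reentrancy".to_string(),
        severity: "High".to_string(),
        description: "The sellCurvesToken function transfers ETH before updating state...".to_string(),
        confidence,
        source: "sa_reentrancy".to_string(),
        start_line: 42,
        end_line: 55,
    }
}

fn main() {
    let kept = filter_by_confidence(vec![sample(0.85), sample(0.40)], 0.5);
    for f in &kept {
        println!("{} {}:{}-{} ({})", f.severity, f.file, f.start_line, f.end_line, f.confidence);
    }
}
```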

Scoring

# Score engine predictions against training labels
cargo run -- --batch --dry-run --score

# Score with LLM predictions
cargo run -- --batch --score

Scoring reports precision, recall, and F1 for each contest and in aggregate, with breakdowns per severity and per detector plus a confusion matrix.
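
For reference, the three metrics follow the standard definitions over true positives, false positives, and false negatives. The sketch below computes them from raw counts. How the scorer matches a prediction to a label is not described here, so the counts are taken as given, and returning 0.0 for an empty denominator is an assumed convention. `Metrics` and `metrics` are hypothetical names.

```rust
/// Precision, recall, and F1 for one contest or for the aggregate.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Metrics {
    precision: f64,
    recall: f64,
    f1: f64,
}

/// Safe division; assumed convention: 0.0 when the denominator is zero.
fn ratio(num: usize, den: usize) -> f64 {
    if den == 0 {
        0.0
    } else {
        num as f64 / den as f64
    }
}

/// Standard definitions:
///   precision = TP / (TP + FP)
///   recall    = TP / (TP + FN)
///   F1        = 2 * precision * recall / (precision + recall)
fn metrics(true_pos: usize, false_pos: usize, false_neg: usize) -> Metrics {
    let precision = ratio(true_pos, true_pos + false_pos);
    let recall = ratio(true_pos, true_pos + false_neg);
    let f1 = if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    };
    Metrics { precision, recall, f1 }
}

fn main() {
    // 3 correct predictions, 1 spurious prediction, 2 missed labels.
    let m = metrics(3, 1, 2);
    println!("precision={:.3} recall={:.3} f1={:.3}", m.precision, m.recall, m.f1);
}
```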
