May the Feedback Be with You! Unlocking the Power of Feedback-Driven Deep Learning Framework Fuzzing via LLMs

Introduction

FUEL (Feedback-driven fUzzing for dEep Learning frameworks via LLMs) is a fuzzing tool for detecting bugs in mainstream deep learning (DL) frameworks such as PyTorch and TensorFlow. FUEL pairs a generation LLM with an analysis LLM so that the feedback collected during the fuzzing loop is fully used to generate high-quality test cases that expose potential bugs in DL frameworks. FUEL also features a feedback-aware simulated annealing algorithm and a program self-repair strategy, which improve model diversity and validity, respectively.
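
The interplay between the two LLMs can be pictured roughly as follows. This is a minimal sketch of a feedback-driven loop and not FUEL's actual implementation: `generate`, `execute`, and `analyze` are hypothetical stand-ins for the generation LLM, the test executor, and the analysis LLM, and the toy functions at the bottom only exist to make the sketch runnable.

```python
from typing import Callable, List


def fuzz_loop(
    generate: Callable[[str], str],
    execute: Callable[[str], str],
    analyze: Callable[[str, str], str],
    max_round: int,
) -> List[str]:
    """Run a feedback-driven loop: each round's analysis guides the next generation."""
    guidance = ""  # no feedback exists before the first round
    history: List[str] = []
    for _ in range(max_round):
        test_case = generate(guidance)           # generation step
        feedback = execute(test_case)            # e.g. coverage, exception log
        guidance = analyze(test_case, feedback)  # analysis step summarizes feedback
        history.append(test_case)
    return history


# Toy stand-ins (assumptions for illustration only).
def toy_generate(guidance: str) -> str:
    return f"model(hint={guidance!r})"


def toy_execute(test_case: str) -> str:
    return "ok"


def toy_analyze(test_case: str, feedback: str) -> str:
    return f"last run: {feedback}"


history = fuzz_loop(toy_generate, toy_execute, toy_analyze, max_round=3)
```

The point of the structure is that `guidance` is the only state carried between rounds, so whatever the analysis step distills from the feedback directly shapes the next test case.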

Why FUEL?

Core Advantages

  • Intelligent Code Generation: Leverages Large Language Models to generate complex and effective deep learning model test cases
  • Feedback-Driven: Smart feedback mechanism based on code coverage, bug reports, and exception logs to continuously optimize test generation strategies via LLMs
  • Program Self-Repair: Automatically distinguishes between framework bugs and invalid test cases, then intelligently repairs invalid models using LLM-guided analysis
  • Heuristic Search: Integrates heuristic algorithms like Feedback-Aware Simulated Annealing (FASA) for intelligent API operator selection
  • Differential Testing: Supports multiple differential testing modes (hardware differences, compiler differences, etc.)
  • Efficient Detection: Successfully discovered 104 new bugs, with 93 confirmed and 49 fixed
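
To give a feel for heuristic operator selection, here is a sketch of plain simulated annealing over a list of operator names. It is not the FASA algorithm itself, whose details are not described in this README: the Metropolis acceptance rule, the cooling schedule, and the `toy_scores` feedback values are all assumptions made for illustration.

```python
import math
import random
from typing import Callable, Sequence


def accept(current_score: float, candidate_score: float,
           temperature: float, rng: random.Random) -> bool:
    """Metropolis rule: always accept improvements, sometimes accept regressions."""
    if candidate_score >= current_score:
        return True
    if temperature <= 0:
        return False
    # The worse the candidate and the lower the temperature, the less likely it is kept.
    return rng.random() < math.exp((candidate_score - current_score) / temperature)


def anneal_select(operators: Sequence[str], score: Callable[[str], float],
                  rounds: int, t0: float = 1.0, cooling: float = 0.95,
                  seed: int = 0) -> str:
    """Return the best-scoring operator seen while annealing."""
    rng = random.Random(seed)
    current = rng.choice(operators)
    best = current
    temperature = t0
    for _ in range(rounds):
        candidate = rng.choice(operators)
        if accept(score(current), score(candidate), temperature, rng):
            current = candidate
        if score(current) > score(best):
            best = current
        temperature *= cooling  # geometric cooling schedule
    return best


# Hypothetical feedback scores (e.g. coverage gain) per operator.
toy_scores = {"torch.add": 0.1, "torch.matmul": 0.9, "torch.nn.functional.relu": 0.4}
best = anneal_select(list(toy_scores), toy_scores.get, rounds=200)
```

Early on, the high temperature lets the search wander to lower-scoring operators, which favors diversity; as the temperature decays, the search settles on operators with better feedback.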

Key Features

  • Support for PyTorch and TensorFlow framework testing
  • Multiple differential testing modes (CPU/CUDA hardware differences, compiler differences)
  • Intelligent operator selection and combination
  • Real-time code coverage feedback
  • Exception detection and bug report generation
  • Configurable LLM backends (local models/API services)
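
The idea behind differential testing is to run the same computation through two backends and flag inputs on which the results disagree beyond a tolerance. The sketch below uses two plain Python functions in place of real backends (in FUEL these would be, for example, CPU vs. CUDA or eager vs. compiled execution); the function names, the tolerances, and the deliberately buggy sigmoid are illustrative assumptions.

```python
import math
from typing import Callable, Iterable, List


def differential_test(reference: Callable[[float], float],
                      candidate: Callable[[float], float],
                      inputs: Iterable[float],
                      rel_tol: float = 1e-6,
                      abs_tol: float = 1e-9) -> List[float]:
    """Return the inputs on which the two implementations disagree."""
    mismatches = []
    for x in inputs:
        expected = reference(x)
        actual = candidate(x)
        if not math.isclose(expected, actual, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches.append(x)
    return mismatches


def reference_sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def buggy_sigmoid(x: float) -> float:
    # Deliberately wrong for negative inputs, to simulate a backend bug.
    return 1.0 / (1.0 + math.exp(-abs(x)))


mismatches = differential_test(reference_sigmoid, buggy_sigmoid, [-2.0, 0.0, 3.0])
```

A tolerance is needed because legitimate backends rarely agree bit for bit on floating-point results; only disagreements larger than the tolerance are worth reporting.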

Project Structure

FUEL/
├── config/                  # Configuration files
│   ├── als_prompt/          # Analysis prompt configurations
│   ├── gen_prompt/          # Generation prompt configurations
│   ├── heuristic.yaml       # Heuristic algorithm configuration
│   └── model/               # LLM model configuration
├── data/                    # Data files
│   ├── pytorch_apis.txt     # PyTorch API list
│   └── tensorflow_apis.txt  # TensorFlow API list
├── fuel/                    # Core source code
│   ├── difftesting/         # Differential testing module
│   ├── exec/                # Code execution module
│   ├── feedback/            # Feedback mechanism module
│   ├── guidance/            # Heuristic search module
│   └── utils/               # Utility classes
├── experiments/             # Experiment and evaluation scripts
└── results/                 # Test result outputs

Experiment Setup

Hardware environment

Important

General test-bed requirements

  • OS: Ubuntu >= 20.04
  • CPU: x86/x64
  • GPU: CUDA-capable GPU (V100, A6000, A100, etc.)
  • Memory: 128GB of GPU memory (if you run a 72B local model with vLLM)
  • Storage: at least 100GB available
  • Network: a stable connection to GitHub and the LLM API service

Software requirement

You need a DeepSeek API key to invoke the DeepSeek API service. To change the LLM settings, modify the configuration in ./config/model.yaml.

Quick Start

Clone the repository

git clone https://github.com/NJU-iSE/FUEL.git
cd FUEL

Install dependencies

First, install the necessary Python dependencies. We strongly recommend using uv to manage the Python environment. Run the commands below.

# install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
# sync the dependencies at the root directory
uv sync
# activate the environment
source .venv/bin/activate

Install PyTorch nightly version

We fuzz the nightly builds of the systems under test (SUTs) in order to detect new bugs.

Here we use CUDA 12.6 as an example. Install the nightly build that matches your CUDA version; the corresponding commands are available at https://pytorch.org/

UV_HTTP_TIMEOUT=180 uv pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu126

Create an API key

In our experiments, we use the DeepSeek API to invoke the LLM service. The DeepSeek API is compatible with the OpenAI interface.

In the command below, replace [YOUR_API_KEY] with your own DeepSeek API key.

key="[YOUR_API_KEY]"
echo "$key" > ./config/deepseek-key.txt
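
Before starting a long fuzzing run, it can be worth checking that the key file is in place. The snippet below is a small pre-flight check written for this README, not part of FUEL: `load_api_key` is a hypothetical helper, and how FUEL itself reads ./config/deepseek-key.txt is not shown here.

```python
from pathlib import Path


def load_api_key(path: str = "config/deepseek-key.txt") -> str:
    """Read the key file and reject an empty or placeholder key."""
    key = Path(path).read_text(encoding="utf-8").strip()
    if not key or key == "[YOUR_API_KEY]":
        raise ValueError(f"{path} does not contain a usable API key")
    return key
```

Calling `load_api_key()` from the repository root either returns the key with the trailing newline written by `echo` stripped, or raises an error before any fuzzing round is spent.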

Start fuzzing

Warning

The fuzzing process is time-consuming and may run for many hours to discover meaningful bugs.

python -m fuel.fuzz --lib pytorch run_fuzz \
                    --max_round 1000 \
                    --heuristic FASA \
                    --diff_type cpu_compiler

Parameter Description:

  • --lib: Target deep learning library (pytorch or tensorflow)
  • --max_round: Maximum number of testing rounds
  • --heuristic: Heuristic algorithm (FASA, Random, or None)
  • --diff_type: Differential testing type (hardware, cpu_compiler, cuda_compiler)
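
When launching runs from a script, the documented values can be validated before the command is started. The helper below is a sketch under assumptions: `build_fuzz_command` and `ALLOWED` are hypothetical names, the accepted values are copied from the list above, and passing the literal string `None` for `--heuristic` is a guess about how the CLI expects that option.

```python
import shlex
from typing import List

# Accepted values as listed in the parameter description above.
ALLOWED = {
    "lib": {"pytorch", "tensorflow"},
    "heuristic": {"FASA", "Random", "None"},
    "diff_type": {"hardware", "cpu_compiler", "cuda_compiler"},
}


def build_fuzz_command(lib: str, max_round: int,
                       heuristic: str, diff_type: str) -> List[str]:
    """Assemble the documented fuzzing command after validating each option."""
    for name, value in (("lib", lib), ("heuristic", heuristic), ("diff_type", diff_type)):
        if value not in ALLOWED[name]:
            raise ValueError(f"--{name} must be one of {sorted(ALLOWED[name])}, got {value!r}")
    if max_round <= 0:
        raise ValueError("--max_round must be a positive integer")
    return [
        "python", "-m", "fuel.fuzz", "--lib", lib, "run_fuzz",
        "--max_round", str(max_round),
        "--heuristic", heuristic,
        "--diff_type", diff_type,
    ]


cmd = build_fuzz_command("pytorch", 1000, "FASA", "cpu_compiler")
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues.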

Because the fuzzing experiment is time-consuming, we suggest checking the results after about 20 hours.

Check results

The generated models are saved in results/fuel/pytorch. The detected bugs are listed in outputs/bug_reports.txt.

Advanced Usage

Warning

These advanced features are not fully tested and are prone to instability. We will continue improving our artifact.

Using Local LLM Models

python -m fuel.fuzz --lib pytorch run_fuzz \
                    --use_local_gen \
                    --max_round 1000 \
                    --heuristic FASA

Custom Operator Selection

python -m fuel.fuzz --lib pytorch run_fuzz \
                    --op_set data/custom_operators.txt \
                    --op_nums 8 \
                    --max_round 1000

Code Coverage Analysis

bash coverage.sh

Bug finding (Real-world Contribution)

So far, FUEL has detected 104 previously unknown bugs, of which 93 have been confirmed and 49 have been fixed. 14 of the detected bugs were labeled high-priority, one was labeled utmost priority, and 14 have been assigned CVE IDs. The evidence can be found in the Google Sheet.

Contact

Acknowledgement

We thank NNSmith, TitanFuzz, and WhiteFox for their admirable open-source spirit, which has largely inspired this work.

About

This repo is the artifact of FUEL
