beanduan22/VistaFuzz

Harnessing LLMs for Document-Guided Fuzzing of OpenCV Library (VistaFuzz)


Documentation-guided fuzzing for OpenCV-Python. Reproducible artifact with Docker, scripts, and standardized API.


Overview

VistaFuzz is a documentation-guided fuzzer for OpenCV-Python. It parses API documentation, standardizes API constraints, and generates valid/diverse inputs to exercise OpenCV operations at scale.

This repository provides:

  • Runnable fuzzing entry: main.py
  • Docker setup for a consistent environment and code coverage
  • Standardized API metadata: API_info.py used by the fuzzer

Repository Layout

VistaFuzz/
├─ OpenCV-Testing/
│ ├─ API/
│ │ └─ OpenCV_API_filtered_subset.json
│ ├─ main/
│ │ └─ main.py
│ ├─ tool/
│ │ ├─ API_info.py
│ │ ├─ load_json_file.py
│ │ ├─ mutation.py
│ │ ├─ mutation_1.py
│ │ ├─ mutation_rules.py
│ │ ├─ opencv_args_seed_generator.py
│ │ ├─ oracle.py
│ │ ├─ parser_from_str2funcandinfo.py
│ │ ├─ temporary.py
│ │ └─ test.py
│ ├─ API_info.py
│ ├─ BugLinks.csv
│ ├─ Dockerfile
│ └─ main.py
└─ README.md

Quickstart (Docker)

Recommended for reproducibility.

  1. Build the Docker image (run this from the directory that contains the Dockerfile; in the layout above, that is OpenCV-Testing/)
docker build -t opencv-coverage .
  2. Run the container from the repository root (this mounts OpenCV-Testing at /app)
  • Linux/macOS
docker run -it --rm -v "$PWD/OpenCV-Testing:/app" --name opencv_coverage_container_1 opencv-coverage
  • Windows PowerShell
docker run -it --rm -v "${PWD}\OpenCV-Testing:/app" --name opencv_coverage_container_1 opencv-coverage
  3. Inside the container: run the fuzzer
cd /app
python3 main.py

Tip: Use ls -l to confirm the volume is mounted; logs are written to the current working directory.


Generate Coverage (gcovr)

  1. Enter the OpenCV build directory (inside the container)
cd /usr/local/src/opencv/build/
  2. Install gcovr
pip install gcovr
  3. Generate an HTML coverage report
gcovr -r /usr/local/src/opencv --html --html-details -o coverage_report.html
  4. Copy the report back to the host (from a second terminal on the host, while the container is still running; the --rm flag deletes the container on exit)
  • Find the container ID:
docker ps
  • Copy the report:
docker cp <container_id>:/usr/local/src/opencv/build/coverage_report.html .
  • Open the report:

macOS

open ./coverage_report.html

Windows (PowerShell)

start ./coverage_report.html

Linux

xdg-open ./coverage_report.html
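
For a quick numeric check alongside the HTML report, gcovr can also write a machine-readable summary through its --json-summary option. The sketch below ranks files by line coverage. It assumes the summary contains a files list whose entries carry filename and line_percent keys; verify these key names against the gcovr version installed in your container. The helper names are illustrative and not part of this repository.

```python
import json


def load_summary(path):
    """Read a JSON summary file written by gcovr."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def least_covered(summary, limit=5):
    """Return (filename, line_percent) pairs for the least-covered files."""
    files = summary.get("files", [])
    # Sort ascending so the weakest coverage comes first.
    ranked = sorted(files, key=lambda entry: entry.get("line_percent", 0.0))
    return [(entry["filename"], entry.get("line_percent", 0.0))
            for entry in ranked[:limit]]


if __name__ == "__main__":
    # Made-up sample standing in for a real summary file.
    sample = {
        "files": [
            {"filename": "a.cpp", "line_percent": 80.0},
            {"filename": "b.cpp", "line_percent": 10.0},
            {"filename": "c.cpp", "line_percent": 45.5},
        ]
    }
    for name, percent in least_covered(sample, limit=2):
        print(f"{percent:6.1f}%  {name}")
```

To use it on real data, pass the result of load_summary(...) for your summary file to least_covered.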

Standardized API

VistaFuzz consumes standardized API metadata to guide input generation:

OpenCV-Testing/API_info.py

This file describes each API's name, parameters, types/constraints, etc., which the fuzzer uses to synthesize valid and diverse inputs.
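
As a rough illustration of how per-parameter constraints can drive input synthesis, here is a minimal sketch. The record layout (type, min, max, odd, choices) is a hypothetical schema invented for this example and is not the actual format of API_info.py; the numeric ranges and border mode codes are illustrative as well.

```python
import random

# Hypothetical constraint records; the real API_info.py schema may differ.
MEDIAN_BLUR_SPEC = {
    "name": "cv2.medianBlur",
    "params": {"ksize": {"type": "int", "min": 3, "max": 31, "odd": True}},
}
BILATERAL_SPEC = {
    "name": "cv2.bilateralFilter",
    "params": {
        "d": {"type": "int", "min": 1, "max": 15},
        "sigmaColor": {"type": "float", "min": 0.0, "max": 200.0},
        "sigmaSpace": {"type": "float", "min": 0.0, "max": 200.0},
        # Illustrative subset of border mode codes.
        "borderType": {"type": "enum", "choices": [0, 1, 2, 4]},
    },
}


def sample_param(constraint, rng):
    """Draw one value that satisfies a single parameter constraint."""
    kind = constraint["type"]
    if kind == "int":
        value = rng.randint(constraint["min"], constraint["max"])
        if constraint.get("odd") and value % 2 == 0:
            # Step to a neighbouring odd value that stays inside the range.
            value = value + 1 if value < constraint["max"] else value - 1
        return value
    if kind == "float":
        return rng.uniform(constraint["min"], constraint["max"])
    if kind == "enum":
        return rng.choice(constraint["choices"])
    raise ValueError(f"unsupported constraint type: {kind}")


def generate_args(spec, seed=None):
    """Build one argument dict for the API described by spec."""
    rng = random.Random(seed)
    return {name: sample_param(constraint, rng)
            for name, constraint in spec["params"].items()}
```

Calling generate_args(MEDIAN_BLUR_SPEC, seed=0) yields a dict whose ksize is an odd integer between 3 and 31; fixing the seed makes a failing input reproducible.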


Optional: Run without Docker

Not recommended: native builds can diverge from the container. If you still choose a local run, mirror the container toolchain and keep the following in mind:

  • Expect small variations in coverage or numerics across machines due to BLAS/hardware differences.

Run a Small Subset of APIs

If you only want to run a subset of the API dataset, you can limit how many APIs main.py processes:

  • Preferred (if supported by your main.py): pass a limit flag, e.g. --max-apis N.

    cd /app
    python3 main.py --max-apis 50
  • Otherwise (quick code tweak): at the top of OpenCV-Testing/main/main.py, add a limit and slice the loaded list. For example:

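    import os

    # Read the limit from the environment; 0 means no limit.
    MAX_APIS = int(os.getenv("VISTAFUZZ_MAX_APIS", "0"))

    # load_json_file is assumed to be imported already in main.py.
    apis = load_json_file('API/OpenCV_API_filtered_subset.json')
    if MAX_APIS > 0:
        apis = apis[:MAX_APIS]  # assumes the loaded data is a list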

    Then run with an environment variable:

    VISTAFUZZ_MAX_APIS=50 python3 main.py
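
The --max-apis flag above is only available if your main.py implements it. As a sketch of how such a flag could be wired in with argparse, falling back to the VISTAFUZZ_MAX_APIS environment variable: parse_limit and apply_limit are hypothetical helpers written for this example, not functions from the repository.

```python
import argparse
import os


def parse_limit(argv=None, env=None):
    """Resolve the API limit: CLI flag first, then environment, else 0."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Limit how many APIs are fuzzed")
    parser.add_argument("--max-apis", type=int, default=None,
                        help="process at most N APIs (0 = no limit)")
    # parse_known_args ignores any other options the script may accept.
    args, _unknown = parser.parse_known_args(argv)
    if args.max_apis is not None:
        return max(args.max_apis, 0)
    return max(int(env.get("VISTAFUZZ_MAX_APIS", "0")), 0)


def apply_limit(apis, limit):
    """Keep the first `limit` entries; a limit of 0 keeps everything."""
    return apis[:limit] if limit > 0 else apis
```

With this in place, apis = apply_limit(apis, parse_limit()) would honor either form of the limit, with the command-line flag taking precedence.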

Paper ↔ Artifact Mapping

A lightweight map from paper items to where they live in this repo.

Paper item | Where in repo / quick check
-----------|----------------------------
Method pipeline | OpenCV-Testing/main.py, OpenCV-Testing/tool/ (running main.py exercises the full pipeline)
Standardized API dataset | OpenCV-Testing/API/OpenCV_API_filtered_subset.json, OpenCV-Testing/tool/API_info.py
Input generation & mutations | OpenCV-Testing/tool/opencv_args_seed_generator.py, OpenCV-Testing/tool/mutation*.py
Oracles (Crash/NaN/Exception) | OpenCV-Testing/tool/oracle.py, OpenCV-Testing/tool/test.py
Scale & bug list | OpenCV-Testing/API/OpenCV_API_filtered_subset.json (API count), OpenCV-Testing/BugLinks.csv
Coverage reproduction | Dockerfile and README section Generate Coverage (gcovr)
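
To make the oracle categories in the table concrete, here is a minimal sketch of an in-process check for the Exception and NaN cases. It is an illustration under simple assumptions and not the logic in oracle.py: results are treated as plain floats or nested lists/tuples, and run_oracle is a hypothetical name. Hard crashes such as segmentation faults terminate the interpreter, so detecting them requires running each test case in a separate process, which is not shown here.

```python
import math


def contains_nan(value):
    """Recursively check floats and nested lists/tuples for NaN."""
    if isinstance(value, float):
        return math.isnan(value)
    if isinstance(value, (list, tuple)):
        return any(contains_nan(item) for item in value)
    return False


def run_oracle(func, *args, **kwargs):
    """Call func and classify the outcome as 'exception', 'nan', or 'ok'."""
    try:
        result = func(*args, **kwargs)
    except Exception as exc:  # a Python-level error raised by the call
        return "exception", repr(exc)
    if contains_nan(result):
        return "nan", None
    return "ok", None
```

For NumPy array results, the NaN check would typically be replaced by numpy.isnan(result).any().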

Expected Outputs

  • Running python3 main.py prints which APIs and test cases were executed, along with runtime logs.

  • Running gcovr produces an HTML coverage report at:

    /usr/local/src/opencv/build/coverage_report.html

    Copy it to your host and open it in a browser. The report includes line-by-line coverage with source highlighting.

Note: Coverage values depend on runtime and the number of generated test cases; the primary verification goal is successful report generation and navigable source coverage.


Cite This Paper/Artifact

@INPROCEEDINGS{VistaFuzz,
  author={Duan, Bin and Mahmud, Tarek and Che, Meiru and Yan, Yan and Dong, Naipeng and Kim, Dan Dongseong and Yang, Guowei},
  booktitle={2025 IEEE International Conference on Software Maintenance and Evolution (ICSME)}, 
  title={Harnessing LLMs for Document-Guided Fuzzing of OpenCV Library}, 
  year={2025},
  pages={73-84},
  keywords={Computer vision;Software maintenance;Large language models;Fuzzing;Testing;OpenCV Libraries},
  doi={10.1109/ICSME64153.2025.00017}}

License

This project is released under the MIT License.
