# [ICCV 2025] D³QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection
Created by Yanran Zhang, Bingyao Yu, Yu Zheng, Wenzhao Zheng, Yueqi Duan, Lei Chen, Jie Zhou, Jiwen Lu
- Introduction
- News 🔥
- Quick Start
- Setup 🔧
- Dataset
- Training
- Evaluation
- Pretrained Models
- Acknowledgments
- Citation
- Contact
## Introduction

D³QE is a detection method designed to identify images generated by visual autoregressive (AR) models. The core idea is to exploit the discrete distribution discrepancies and quantization error patterns produced by tokenized autoregressive generation. Key highlights:
- Integrates dynamic codebook frequency statistics into a transformer attention module.
- Fuses semantic image features with latent representations of the quantization error.
- Demonstrates strong detection accuracy, cross-model generalization, and robustness to common real-world perturbations.
This repo contains the code, dataset, and scripts used in the paper to facilitate reproducible experiments.
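For intuition, below is a minimal, purely illustrative sketch (not the repository's code) of the quantization-error signal: encode features, snap them to the nearest VQ codebook entry, and keep the residual.

```python
# Illustrative sketch only: quantization error w.r.t. a VQ codebook.
# Real images land "between" codebook entries, while AR-generated images
# were decoded from the codebook itself, so their residuals look different.
import torch

def vq_quantization_error(latents: torch.Tensor, codebook: torch.Tensor):
    """latents: (N, d) encoder features; codebook: (K, d) VQ entries."""
    dists = torch.cdist(latents, codebook)   # (N, K) pairwise L2 distances
    idx = dists.argmin(dim=1)                # nearest codebook index per feature
    quantized = codebook[idx]                # (N, d) quantized features
    error = latents - quantized              # residual: the quantization error
    return idx, error

# Toy usage with random tensors
idx, err = vq_quantization_error(torch.randn(16, 8), torch.randn(32, 8))
print(idx.shape, err.shape)  # torch.Size([16]) torch.Size([16, 8])
```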
## News 🔥

- 🆕 2025-10-09 — Our code is released.
- 🆕 2025-10-08 — arXiv preprint released.
- 🆕 2025-07-23 — Accepted to ICCV 2025! 🔥
## Quick Start

### Setup 🔧

- Clone the repository:

```bash
git clone https://github.com/Zhangyr2022/D3QE
cd D3QE
```

- Create the environment and install dependencies:

```bash
conda create -n D3QE python=3.11 -y
conda activate D3QE
pip install -r requirements.txt
# If you have GPU(s), ensure CUDA and PyTorch are installed correctly for your
# environment (see the quick check at the end of this section).
```

- Download the dataset (see Dataset below) and place it under `./data/ARForensics` (or a path you prefer). Download the pretrained LlamaGen VQ-VAE model `vq_ds16_c2i.pt` from LlamaGen and place it under `./pretrained`.

- Train a model:

```bash
bash train.sh
```

- Evaluate:

```bash
bash eval.sh
```
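Before training, you can confirm that PyTorch sees your GPU with standard calls (nothing repo-specific):

```python
import torch

print(torch.__version__)             # installed PyTorch version
print(torch.cuda.is_available())     # True if a CUDA device is usable
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```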
## Dataset

We provide the ARForensics benchmark — the first large-scale dataset specifically for visual autoregressive model detection. It covers 7 autoregressive models with diverse token/scale architectures: LlamaGen, VAR, Infinity, Janus-Pro, RAR, Switti, and Open-MAGVIT2.
Splits:
- Training: 100k LlamaGen images + 100k ImageNet images
- Validation: 10k LlamaGen images + 10k ImageNet images
- Test: balanced test set with 6k samples per model
Download: The ARForensics dataset is available at: 🤗 HuggingFace | 🤖 ModelScope.
Folder structure (expected):

```
ARForensics/
├─ train/
│  ├─ 0_real/
│  └─ 1_fake/
├─ val/
│  ├─ 0_real/
│  └─ 1_fake/
└─ test/
   ├─ Infinity/
   │  ├─ 0_real/
   │  └─ 1_fake/
   ├─ Janus_Pro/
   │  ├─ ..
   ├─ RAR/
   ├─ Switti/
   ├─ VAR/
   ├─ LlamaGen/
   └─ Open_MAGVIT2/
```
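Once the data is in place, a quick sanity check of the layout above (a sketch; assumes the default `./data/ARForensics` root):

```python
# Count files per split/class to confirm the download matches the layout above.
from pathlib import Path

root = Path("./data/ARForensics")
for split in ("train", "val"):
    for cls in ("0_real", "1_fake"):
        n = sum(1 for _ in (root / split / cls).iterdir())
        print(f"{split}/{cls}: {n} files")
for model_dir in sorted(p for p in (root / "test").iterdir() if p.is_dir()):
    for cls in ("0_real", "1_fake"):
        n = sum(1 for _ in (model_dir / cls).iterdir())
        print(f"test/{model_dir.name}/{cls}: {n} files")
```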
## Training

A provided training script `train.sh` wraps the typical training pipeline. You can tweak the hyper-parameters directly in the script or by editing the training config file used by the codebase. We train the model on a single GPU by default (24 GB of GPU memory is recommended).
Example:

```bash
bash train.sh
# or run the training entrypoint directly, e.g.
python train.py \
    --name D3QE_rerun \
    --dataroot /path/to/your/dataset \
    --detect_method D3QE \
    --blur_prob 0.1 \
    --blur_sig 0.0,3.0 \
    --jpg_prob 0.1 \
    --jpg_method cv2,pil \
    --jpg_qual 30,100
```
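The `--blur_*` and `--jpg_*` flags control train-time robustness augmentations. As a rough sketch of what such flags typically do (an assumed CNNDetection-style recipe, not this repository's exact code): blur and JPEG-recompress images with some probability.

```python
# Hedged sketch: probabilistic Gaussian blur + JPEG recompression, the common
# robustness recipe the flags above suggest (assumption, not the repo's code).
import io
import random
from PIL import Image, ImageFilter

def augment(img: Image.Image,
            blur_prob=0.1, blur_sig=(0.0, 3.0),   # cf. --blur_prob / --blur_sig
            jpg_prob=0.1, jpg_qual=(30, 100)):    # cf. --jpg_prob / --jpg_qual
    if random.random() < blur_prob:
        # Gaussian blur with sigma sampled uniformly from the given range
        img = img.filter(ImageFilter.GaussianBlur(radius=random.uniform(*blur_sig)))
    if random.random() < jpg_prob:
        # Re-encode as JPEG at a random quality, then decode back
        buf = io.BytesIO()
        img.convert("RGB").save(buf, format="JPEG", quality=random.randint(*jpg_qual))
        img = Image.open(io.BytesIO(buf.getvalue())).convert("RGB")
    return img
```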
## Evaluation

`eval.py` exposes many options to evaluate detection performance and robustness.

```
usage: eval.py [-h] [--rz_interp RZ_INTERP] [--batch_size BATCH_SIZE]
               [--loadSize LOADSIZE] [--CropSize CROPSIZE] [--no_crop]
               [--no_resize] [--no_flip] [--robust_all]
               [--detect_method DETECT_METHOD] [--dataroot DATAROOT]
               [--sub_dir SUB_DIR] [--model_path MODEL_PATH]
```
Key flags:

- `--batch_size` (default: 64)
- `--loadSize` / `--CropSize` for image preprocessing (defaults: 256 / 224)
- `--robust_all` to evaluate model robustness across different noises/attacks
- `--sub_dir` list of subfolders in the test set (defaults to the 7 AR models)
- `--model_path` path to your trained model checkpoint
Example (evaluate D³QE): there's an `eval.sh` with default settings you can adapt.

```bash
bash eval.sh
# or run evaluation directly
python eval.py \
    --model_path /your/model/path \
    --detect_method D3QE \
    --batch_size 1 \
    --dataroot /path/to/your/testset \
    --sub_dir '["Infinity","Janus_Pro","RAR","Switti","VAR","LlamaGen","Open_MAGVIT2"]'
```

## Pretrained Models

Pretrained model checkpoints are uploaded at: 🤗 Hugging Face
## Acknowledgments

This codebase builds on and borrows design patterns from several open-source projects. Thanks to the authors of those projects for making their code and models available.
## Citation

If you use this repository or dataset in your research, please cite our paper:
```bibtex
@inproceedings{zhang2025d3qe,
  title={D3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection},
  author={Zhang, Yanran and Yu, Bingyao and Zheng, Yu and Zheng, Wenzhao and Duan, Yueqi and Chen, Lei and Zhou, Jie and Lu, Jiwen},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={16292--16301},
  year={2025}
}
```

## Contact

For questions, issues, or reproducibility requests, please open an issue on this repository (PRs and issues are welcome) or contact the authors at zhangyr21@mails.tsinghua.edu.cn.


