Copyright Research Project

This repository contains code for training Denoising Diffusion Probabilistic Models (DDPMs) on CIFAR-10 and evaluating different attacks on these models, as part of our upcoming work on copyright protection for diffusion models.

Overview

The codebase supports:

  1. Training standard diffusion models
  2. Training models on different dataset splits
  3. Running different attacks (MIA, DRA) and storing results
  4. Analyzing attack results

Requirements

conda env create -f environment.yml

Dataset Preparation

Our experiments focus on CIFAR-10. For the CP-$k$ algorithm, we support training a proposal model $p$ on the entire dataset and training smaller models $q_1$ and $q_2$ on halves of the dataset. You can select which subset to train on using the --set_index flag. Differentially private (DP) models are trained on the entire dataset.
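To make the subset selection concrete, here is a hedged sketch of how the halves for $q_1$ and $q_2$ could be derived from index 50,000-image CIFAR-10 training indices. The function name, seed, and split rule below are illustrative assumptions, not the repository's actual implementation:

```python
import numpy as np

def cifar10_subset_indices(set_index, n=50000, seed=0):
    """Illustrative split: set_index 1 and 2 give disjoint halves,
    set_index 3 gives the entire training set."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    if set_index == 1:
        return perm[: n // 2]
    if set_index == 2:
        return perm[n // 2 :]
    return np.arange(n)  # set_index == 3: full dataset

half1 = cifar10_subset_indices(1)
half2 = cifar10_subset_indices(2)
```

Because the two halves come from one permutation, they are disjoint and together cover the whole training set.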

No manual download or preparation of CIFAR-10 is necessary; the code downloads the dataset automatically on first run.

Training Diffusion Models

Basic Training

To train a general conditional diffusion model on the entirety of CIFAR-10, run the following command.

python main.py --train --set_index=3

Training on Specific Dataset Splits

To train on the dataset subsets used for the safe models, use --set_index=1 or --set_index=2. To train with differential privacy, set --privacy=True. Other flags can be set via the command line or through the main.py file, depending on your preference.

Model Evaluation

To evaluate a trained model's FID, use the following command.

python main.py --eval --logdir=<enter_logdir_here>

To evaluate using the CP-$k$ mechanism, modify the arguments either via the command line or through the main.py file.
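For reference, FID is the Fréchet distance between Gaussian fits of feature statistics from real and generated images. A minimal sketch of that distance, assuming means and covariances have already been computed (the function name here is illustrative, not the repository's code):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 (sigma1 sigma2)^{1/2})."""
    diff = mu1 - mu2
    covmean = sqrtm(sigma1 @ sigma2).real  # drop tiny imaginary parts
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical statistics give a distance of zero; lower FID means the generated distribution's statistics are closer to the real data's.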

Running and Evaluating Attacks

At a high level, the generator.py file runs attacks and the analysis.py file analyzes the results. We use this structure because attacks are computationally expensive, while the CP-$k$ mechanism requires no separate attack runs: we store the log probabilities during the attack and apply CP-$k$ thresholding when necessary during analysis, which gives the analysis process greater flexibility.
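Because log probabilities are stored at attack time, CP-$k$ style filtering can be applied afterwards during analysis. Below is a hedged sketch of what such post-hoc thresholding might look like; the function name, array layout, and acceptance rule are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

def cp_k_accept(logp_p, logp_q1, logp_q2, k):
    """Illustrative CP-k style filter: accept a sample only when the
    log-probability gap between the proposal model p and the less likely
    of the two safe models q1, q2 stays at or below the threshold k."""
    gap = logp_p - np.minimum(logp_q1, logp_q2)
    return gap <= k

# Stored log probabilities for three hypothetical samples.
logp_p = np.array([-10.0, -8.0, -12.0])
logp_q1 = np.array([-10.5, -15.0, -12.2])
logp_q2 = np.array([-10.2, -14.0, -12.1])
mask = cp_k_accept(logp_p, logp_q1, logp_q2, k=1.0)
```

The middle sample is rejected because the proposal model assigns it far more probability than either safe model, which is the behavior a copyright filter targets.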

Using generator.py

The generator.py file contains the main function that runs attacks and saves their results.

python generator.py --config=<path_to_config_file> --seed=<seed>

Using analysis.py

The analysis.py file contains the main function that analyzes attack results and saves plots. Example analysis configs for the different attacks can be found in the config/analysis_config/ directory. When not using the CP-$k$ mechanism, the code automatically uses the log probabilities and threshold stored during the attack run.

When modifying the configuration files to match your generator runs, be sure to point them at the outermost run directory (e.g., logs/generator_logs/cp_run_1/ rather than logs/generator_logs/cp_run_1/membership_inference/results/). The codebase automatically finds the relevant files containing the data from each run.
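For illustration, a hypothetical analysis config entry pointing at an outermost run directory might look like the fragment below. The YAML format and key name are assumptions; match them to the examples in config/analysis_config/.

```yaml
# Point at the outermost run directory, not a nested results folder.
run_dir: logs/generator_logs/cp_run_1/
```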

python analysis.py --config=<path_to_config_file> --seed=<seed>

You will need to update both analysis and generator configs to reflect your own model paths.
