Training Ensembles to Detect Adversarial Examples

See the paper here.

Requisites

Python 3, NumPy, and TensorFlow

Easy setup

A Makefile is included for training and evaluating the ensembles described in the paper.

First train an ensemble:

make train DATASET=x where x is either mnist or cifar

Then generate some adversarial examples targeting it:

make gen DATASET=x ATTACK=y where y is FGS, BI, DF, CW, or RAND

Then evaluate it against the generated examples:

make eval DATASET=x

Read below if you wish to experiment with different parameters.
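The three make steps above can also be driven from a small script. The following is only a sketch, not part of the repository: the helper names pipeline_commands and run_pipeline are hypothetical, and it assumes that make is on the PATH and that the script is run from the repository root. It uses only the targets and variables listed above.

```python
import subprocess

# Values accepted by the Makefile according to the instructions above.
DATASETS = ("mnist", "cifar")
ATTACKS = ("FGS", "BI", "DF", "CW", "RAND")


def pipeline_commands(dataset, attack):
    """Return the train, gen, and eval make invocations, in order.

    Hypothetical helper: it only builds argument lists and runs nothing.
    """
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if attack not in ATTACKS:
        raise ValueError(f"unknown attack: {attack!r}")
    ds = f"DATASET={dataset}"
    return [
        ["make", "train", ds],
        ["make", "gen", ds, f"ATTACK={attack}"],
        ["make", "eval", ds],
    ]


def run_pipeline(dataset, attack):
    """Run the three steps in sequence, stopping at the first failure."""
    for cmd in pipeline_commands(dataset, attack):
        subprocess.run(cmd, check=True)
```

Calling run_pipeline("mnist", "FGS") would then train, attack, and evaluate in sequence; check=True makes a failing step raise instead of silently continuing to the next one.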

Training an ensemble

The file train.py can be used to train an ensemble from scratch. Some important parameters:

-n, --ensemble_size to set the number of ensemble members
--learning_rate to set the initial learning rate
--eta to set the eta parameter, which controls the random perturbation
-d, --dataset to choose between MNIST and CIFAR10

See the file for other parameters.

Example

python3 train.py -n 5 --dataset MNIST --learning_rate 0.1 --max_steps 100000 --eta 0.1 --model_dir models/myensemble

Generating adversarial examples

The file gen_adv.py can be used to generate adversarial examples using the following methods:

0: Fast gradient sign
1: Basic iterative
2: DeepFool
3: C&W l2
4: Random noise

Use -t or --type to choose the attack method by its numeric index shown above.
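For scripting, it can help to translate the attack names used by the Makefile (FGS, BI, DF, CW, RAND) into the numeric indices listed above. The mapping below is an assumption: it pairs the two lists purely by their order in this README and has not been checked against gen_adv.py. The names ATTACK_INDEX and attack_index are hypothetical.

```python
# Assumed correspondence between Makefile attack names and the numeric
# indices listed above, inferred from the ordering in this README only.
ATTACK_INDEX = {
    "FGS": 0,   # Fast gradient sign
    "BI": 1,    # Basic iterative
    "DF": 2,    # DeepFool
    "CW": 3,    # C&W l2
    "RAND": 4,  # Random noise
}


def attack_index(name):
    """Look up the numeric index for an attack name, ignoring case."""
    key = name.strip().upper()
    if key not in ATTACK_INDEX:
        raise ValueError(f"unknown attack: {name!r}")
    return ATTACK_INDEX[key]
```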

Use --direct to save the adversarial examples directly in adv_examples/

See the file for other parameters.

Example

python3 gen_adv.py -n 5 --dataset MNIST --model_dir models/myensemble --attack 0 --epsilon 0.1
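To illustrate what an epsilon value does in a fast gradient sign attack, here is a minimal NumPy sketch on a toy logistic-regression model. It is not the code in gen_adv.py: the model, the weights, and the function fgs_perturb are made up for illustration, and inputs are assumed to lie in [0, 1]. Each input coordinate is moved by exactly epsilon in the direction that increases the loss.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgs_perturb(x, y, w, b, epsilon):
    """One fast-gradient-sign step against a logistic-regression model.

    x: input vector, y: true label (0.0 or 1.0), w, b: model parameters.
    """
    p = sigmoid(x @ w + b)
    # Gradient of the cross-entropy loss with respect to the input x.
    grad_x = (p - y) * w
    # Step by epsilon along the sign of the gradient to increase the loss.
    x_adv = x + epsilon * np.sign(grad_x)
    # Keep the result in the assumed valid input range [0, 1].
    return np.clip(x_adv, 0.0, 1.0)


# Toy model and input (illustrative values only).
w = np.array([2.0, -1.0, 0.5])
b = 0.0
x = np.array([0.5, 0.5, 0.5])
y = 1.0

x_adv = fgs_perturb(x, y, w, b, epsilon=0.1)
```

In this toy case the perturbed input lowers the model's confidence in the true class while changing no coordinate by more than epsilon.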

Evaluating an ensemble

The file eval.py can be used to evaluate an ensemble's performance against both clean and adversarial examples.

Some parameters:

-rt, --rank_threshold to set the detection parameter tau
-s, --set to choose between the test and validation sets

See the file for other parameters.

Example

python3 eval.py -n 5 --dataset MNIST --model_dir models/myensemble --rank_threshold 2


bagnalla/ensemble_detect_adv
