Support Dirty-Label Backdoor Attack #137

@deprit

Description

Add support in Armory Library for an undefended Dirty-Label Backdoor (DLBD) attack applied to image classification.

In a DLBD attack, training images are chosen from the source class, a trigger is applied to them, and their labels are flipped to the target class. The model is then trained on this poisoned data. The adversary's goal is for test images from the source class to be classified as the target class when the trigger is applied at test time.
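For concreteness, here is a minimal sketch of the poisoning step, assuming images stored as an (N, H, W, C) uint8 NumPy array and a small patch trigger stamped into the bottom-right corner. The `poison_dataset` name and its parameters are illustrative, not an existing Armory Library API.

```python
import numpy as np

def poison_dataset(images, labels, source_class, target_class,
                   trigger, poison_fraction=0.1, rng=None):
    """Dirty-label poisoning: stamp a trigger patch onto a fraction of
    source-class images and flip their labels to the target class.

    images: (N, H, W, C) uint8 array; labels: (N,) int array;
    trigger: (h, w, C) uint8 patch placed in the bottom-right corner.
    """
    if rng is None:
        rng = np.random.default_rng()
    images, labels = images.copy(), labels.copy()

    # Pick a random subset of the source class to poison.
    source_idx = np.flatnonzero(labels == source_class)
    n_poison = int(len(source_idx) * poison_fraction)
    poison_idx = rng.choice(source_idx, size=n_poison, replace=False)

    th, tw, _ = trigger.shape
    images[poison_idx, -th:, -tw:, :] = trigger  # apply trigger
    labels[poison_idx] = target_class            # flip label ("dirty label")
    return images, labels, poison_idx
```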

Four primary metrics are computed after the model is trained on the poisoned data (a sketch of their computation follows the list).

  • Accuracy on benign test data, all classes
  • Accuracy on benign test data, source class
  • Accuracy on poisoned test data, all classes
  • Attack success rate (fraction of triggered source-class test images classified as the target class)
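
A sketch of how these metrics could be computed, assuming a `model_predict` callable that maps a batch of images to predicted labels; poisoned test images keep their original (pre-flip) labels for scoring. All names here are hypothetical:

```python
import numpy as np

def poisoning_metrics(model_predict, x_benign, y_benign,
                      x_poisoned, y_poisoned_orig,
                      source_class, target_class):
    """Compute the four primary DLBD metrics.

    x_poisoned: triggered test images; y_poisoned_orig: their original
    (unflipped) labels, used to score poisoned accuracy.
    """
    benign_pred = model_predict(x_benign)
    poison_pred = model_predict(x_poisoned)

    src = y_benign == source_class
    poisoned_src = y_poisoned_orig == source_class
    return {
        "benign_accuracy_all": np.mean(benign_pred == y_benign),
        "benign_accuracy_source": np.mean(benign_pred[src] == y_benign[src]),
        "poisoned_accuracy_all": np.mean(poison_pred == y_poisoned_orig),
        # Fraction of triggered source-class images classified as target.
        "attack_success_rate": np.mean(
            poison_pred[poisoned_src] == target_class),
    }
```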

To evaluate a DLBD attack, Armory Library must

  • Create poison datasets by inserting triggers into selected classes and modifying labels;
  • Generate primary poisoning metrics to evaluate a poisoned model;
  • Run an example script evaluating a DLBD attack using the CIFAR10 dataset and a ResNet-18 classifier (a minimal end-to-end sketch follows this list).
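
A minimal end-to-end sketch of such a script, written against PyTorch and torchvision rather than Armory's actual API, and reusing the hypothetical `poison_dataset` and `poisoning_metrics` helpers sketched above:

```python
import numpy as np
import torch
import torch.nn.functional as F
import torchvision
from torchvision.models import resnet18

device = "cuda" if torch.cuda.is_available() else "cpu"
SOURCE, TARGET = 0, 1                               # airplane -> automobile
trigger = np.full((3, 3, 3), 255, dtype=np.uint8)   # white 3x3 corner patch

def to_tensor(x):  # (N, H, W, C) uint8 -> (N, C, H, W) float in [0, 1]
    return torch.from_numpy(x).permute(0, 3, 1, 2).float().div(255.0)

train = torchvision.datasets.CIFAR10("data", train=True, download=True)
test = torchvision.datasets.CIFAR10("data", train=False, download=True)

# Poison 10% of the source class in the training set, then train.
x_tr, y_tr, _ = poison_dataset(train.data, np.array(train.targets),
                               SOURCE, TARGET, trigger)
model = resnet18(num_classes=10).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
xs, ys = to_tensor(x_tr), torch.from_numpy(y_tr)
for epoch in range(10):
    perm = torch.randperm(len(xs))
    for i in range(0, len(xs), 128):
        idx = perm[i:i + 128]
        loss = F.cross_entropy(model(xs[idx].to(device)), ys[idx].to(device))
        opt.zero_grad(); loss.backward(); opt.step()

# Trigger every test image but keep the original labels for scoring.
x_te, y_te = test.data, np.array(test.targets)
x_te_poison = x_te.copy()
x_te_poison[:, -3:, -3:, :] = trigger

@torch.no_grad()
def predict(x):
    model.eval()
    preds = [model(to_tensor(x[i:i + 256]).to(device)).argmax(1).cpu()
             for i in range(0, len(x), 256)]
    return torch.cat(preds).numpy()

print(poisoning_metrics(predict, x_te, y_te, x_te_poison, y_te,
                        SOURCE, TARGET))
```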


Labels: enhancement (New feature or request)
