We have already uploaded the all-to-one pretrained backdoored student model (i.e., GridTrigger, WRN-16-1, target label 0) and the clean teacher model (i.e., WRN-16-1) to `./weight/s_net` and `./weight/t_net`, respectively.
To evaluate the performance of ARGD, simply run:

```shell
$ python main-ARGD.py
```

The default parameters are shown in `config.py`.
The trained model will be saved at `weight/erasing_net/<s_name>.tar`.
Please read `main.py` and `configs.py` carefully, then adjust the parameters for your experiment.
| Dataset | Baseline ACC | Baseline ASR | ARGD ACC | ARGD ASR |
| --- | --- | --- | --- | --- |
| CIFAR-10 | 80.08 | 100.0 | 79.81 | 2.10 |
We provide a `DatasetBD` class in `data_loader.py` for generating training sets for different backdoor attacks.
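As a minimal sketch (the actual `DatasetBD` API may differ; the function name, pattern, and image representation below are illustrative assumptions), the core of an all-to-one poisoning step is to stamp a trigger pattern onto an image and relabel the sample to the target class:

```python
def add_grid_trigger(img, target_label=0, trigger_value=255, size=3):
    """Stamp a small checkerboard ("grid") trigger onto the bottom-right
    corner of an image given as a nested list [H][W], and return the
    poisoned image together with the all-to-one target label.

    Illustrative sketch only: the real DatasetBD class in data_loader.py
    works on dataset tensors and supports multiple trigger types.
    """
    h, w = len(img), len(img[0])
    poisoned = [row[:] for row in img]  # copy so the clean image is untouched
    for i in range(size):
        for j in range(size):
            if (i + j) % 2 == 0:  # checkerboard pattern
                poisoned[h - size + i][w - size + j] = trigger_value
    return poisoned, target_label

# Usage: poison an all-zero 8x8 "image"; every poisoned sample maps to label 0.
clean = [[0] * 8 for _ in range(8)]
bad, label = add_grid_trigger(clean)
```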
To implement a backdoor attack (e.g., the GridTrigger attack), run:

```shell
$ python train_badnet.py
```
This command trains the backdoored model and prints the clean accuracy and attack success rate. You can also select the other backdoor triggers reported in the paper.
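Conceptually, the two printed metrics can be computed as follows (a minimal sketch with toy prediction lists; the repository's own evaluation code operates on data loaders and model outputs):

```python
def clean_accuracy(preds, labels):
    """Fraction of clean test samples classified correctly (ACC)."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

def attack_success_rate(preds_on_triggered, target_label=0):
    """Fraction of triggered samples classified as the attacker's
    target label (ASR); lower is better after a defense."""
    return sum(p == target_label for p in preds_on_triggered) / len(preds_on_triggered)

# Usage with toy predictions:
acc = clean_accuracy([0, 1, 2, 2], [0, 1, 2, 3])   # 3 of 4 correct -> 0.75
asr = attack_success_rate([0, 0, 1, 0])            # 3 of 4 hit target 0 -> 0.75
```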
Please read `train_badnet.py` and `configs.py` carefully, then adjust the parameters for your experiment.
We obtained the teacher model by fine-tuning all layers of the backdoored model on 5% clean data with data augmentation. In our paper, we fine-tune the backdoored model for only 5~10 epochs; please see Section 4.1 for more details of our experimental settings. The fine-tuning code is easy to obtain: simply train with the classification loss (`cls_loss`) alone, i.e., set the distillation loss to zero during training.
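The description above amounts to keeping only the classification term in a combined objective. Assuming an objective of the form `cls_loss + beta * distill_loss` (the weight name `beta` is our illustrative choice, not necessarily the one used in `config.py`), fine-tuning the teacher corresponds to `beta = 0`:

```python
def total_loss(cls_loss, distill_loss, beta):
    """Combined objective: classification term plus weighted distillation term.
    Setting beta = 0 disables distillation, reducing training to plain
    fine-tuning of the backdoored model on clean data."""
    return cls_loss + beta * distill_loss

# Fine-tuning step for the teacher: distillation disabled.
ft_loss = total_loss(cls_loss=0.8, distill_loss=2.5, beta=0.0)   # -> 0.8
# ARGD erasing step: distillation active.
argd_loss = total_loss(cls_loss=0.8, distill_loss=2.5, beta=0.5)  # -> 2.05
```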
- **CL**: Clean-Label Backdoor Attacks
- **SIG**: A New Backdoor Attack in CNNs by Training Set Corruption Without Label Poisoning
- **Refool**: Reflection Backdoor: A Natural Backdoor Attack on Deep Neural Networks
- **MCR**: Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness
- **Fine-tuning**: Defending Against Backdooring Attacks on Deep Neural Networks
- **Neural Attention Distillation**: Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks
- **Neural Cleanse**: Identifying and Mitigating Backdoor Attacks in Neural Networks
- **STRIP**: A Defence Against Trojan Attacks on Deep Neural Networks
**Note**: TrojanZoo provides a universal PyTorch platform for conducting security research (especially on backdoor attacks/defenses) in image classification with deep learning.

**Backdoors 101** is a PyTorch framework for state-of-the-art backdoor defenses and attacks on deep learning models.
If you find this code useful for your research, please cite our paper:

```bibtex
@inproceedings{ijcai2022p206,
  title     = {Eliminating Backdoor Triggers for Deep Neural Networks Using Attention Relation Graph Distillation},
  author    = {Xia, Jun and Wang, Ting and Ding, Jiepin and Wei, Xian and Chen, Mingsong},
  booktitle = {Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, {IJCAI-22}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},
  editor    = {Luc De Raedt},
  pages     = {1481--1487},
  year      = {2022},
  month     = {7},
  note      = {Main Track},
  doi       = {10.24963/ijcai.2022/206},
  url       = {https://doi.org/10.24963/ijcai.2022/206},
}
```
If you have any questions, please leave a message on GitHub.