This repository contains the project for the "Natural Language Processing" course at the University of Padova. The project consists of implementing three black-box text adversarial attack algorithms with the TextAttack library and evaluating them along several axes: attack effectiveness, semantic coherence of the adversarial examples, transferability, and adversarial training. The attack algorithms considered are listed below (a sketch of how they are instantiated with TextAttack follows the list):
- BAE: BERT-based Adversarial Examples for Text Classification
- BESA: BERT-based Simulated Annealing for Adversarial Text Attacks
- DeepWordBug: Black-box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers
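BAE and DeepWordBug ship as ready-made attack recipes in TextAttack, while BESA has no packaged recipe and has to be assembled from the library's components. A minimal sketch, assuming the standard TextAttack recipe API; the checkpoint name is illustrative, not necessarily the one used in the notebook:

```python
# A minimal sketch of building two of the attacks as TextAttack recipes;
# the checkpoint name is illustrative and may differ from the notebook.
import transformers
from textattack.attack_recipes import BAEGarg2019, DeepWordBugGao2018
from textattack.models.wrappers import HuggingFaceModelWrapper

model_name = "textattack/bert-base-uncased-imdb"  # illustrative checkpoint
model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

# BAE and DeepWordBug are packaged recipes; BESA is implemented on top
# of TextAttack's components instead.
bae_attack = BAEGarg2019.build(model_wrapper)
dwb_attack = DeepWordBugGao2018.build(model_wrapper)
```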
We performed the experiments in the text classification setting, against BERT-based classifiers with high test accuracy downloaded from the HuggingFace hub. The datasets on which the models were fine-tuned are listed below (a sketch of running an attack against one of them follows the list):
- IMDB
- Yelp polarity
- AG News
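Once a model is wrapped, an attack can be run over a slice of the matching test set with TextAttack's `Attacker`. A sketch reusing `bae_attack` from the snippet above; all parameters are examples, not the exact experimental configuration:

```python
# A sketch of attacking the first 32 IMDB test examples, reusing
# `bae_attack` from the previous snippet.
from textattack import Attacker, AttackArgs
from textattack.datasets import HuggingFaceDataset

dataset = HuggingFaceDataset("imdb", split="test")
attack_args = AttackArgs(num_examples=32, disable_stdout=True)
attacker = Attacker(bae_attack, dataset, attack_args)
results = attacker.attack_dataset()
```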
To replicate our attack results, it is sufficient to execute the notebook adversarial_attacks.ipynb from the first to the last cell.
The BESA experiments can take a long time to run. To work around this, you can split the computation across two machines: run the attack on the first 32 examples on one machine and on the next 32 on the other, by modifying the code in the Dataset section as follows:
- On the first machine, use the following line (adapting the dataset name to the dataset involved):
    imdb = FixedHuggingFaceDataset("imdb", split="test", subset_size=32, shuffle=True)
- On the second machine:
    imdb = FixedHuggingFaceDataset("imdb", split="test", subset_size=32, shuffle=True, offset=32)
Run the experiments in parallel on the two machines and average the results at the end.
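`FixedHuggingFaceDataset` is defined in the notebook. A rough, hypothetical sketch of what such a wrapper could look like, assuming TextAttack's `HuggingFaceDataset` accepts an already-loaded `datasets.Dataset` (it does in recent versions):

```python
# A hypothetical sketch of FixedHuggingFaceDataset; the real class is
# defined in the notebook and may differ. It keeps `subset_size`
# examples of the split, starting at `offset`.
import datasets
from textattack.datasets import HuggingFaceDataset

class FixedHuggingFaceDataset(HuggingFaceDataset):
    def __init__(self, name, split="test", subset_size=None,
                 shuffle=False, offset=0):
        ds = datasets.load_dataset(name, split=split)
        if shuffle:
            # Fixed seed so both machines see the same shuffled order,
            # which keeps the two offset windows disjoint.
            ds = ds.shuffle(seed=42)
        if subset_size is not None:
            ds = ds.select(range(offset, offset + subset_size))
        super().__init__(ds)
```

Note that the shuffle must be seeded identically on both machines, otherwise the two 32-example windows may overlap. Recent TextAttack versions also expose a `num_examples_offset` parameter in `AttackArgs` that can achieve a similar split.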
DISCLAIMER: if you split the computation, the plots produced on each machine will take into account only that machine's half of the examples.
To replicate our adversarial training results, it is sufficient to execute the notebooks adversarial_training_BAEGarg.ipynb and adversarial_training_DeepWordBug.ipynb from the first to the last cell.
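For reference, TextAttack's `Trainer` supports adversarial training directly. A rough sketch of the general pattern, assuming the textattack >= 0.3 training API; all hyperparameters are illustrative, and the notebooks contain the exact configurations used:

```python
# A rough sketch of adversarial training with TextAttack's Trainer;
# the checkpoint name and hyperparameters are illustrative only.
import transformers
import textattack
from textattack.attack_recipes import DeepWordBugGao2018
from textattack.datasets import HuggingFaceDataset
from textattack.models.wrappers import HuggingFaceModelWrapper

model_name = "textattack/bert-base-uncased-imdb"  # illustrative checkpoint
model = transformers.AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
model_wrapper = HuggingFaceModelWrapper(model, tokenizer)

attack = DeepWordBugGao2018.build(model_wrapper)
train_dataset = HuggingFaceDataset("imdb", split="train")
eval_dataset = HuggingFaceDataset("imdb", split="test")

training_args = textattack.TrainingArgs(
    num_epochs=3,
    num_clean_epochs=1,           # epochs on clean data before mixing in adversarial examples
    num_train_adv_examples=1000,  # adversarial examples generated per adversarial epoch
)
trainer = textattack.Trainer(
    model_wrapper,
    "classification",
    attack,
    train_dataset,
    eval_dataset,
    training_args,
)
trainer.train()
```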