
Commit 54df103

Update README.md
1 parent 1866723 commit 54df103

File tree

1 file changed

+8
-1
lines changed

README.md

Lines changed: 8 additions & 1 deletion
@@ -24,7 +24,14 @@ pip install -r requirements_qwen.txt
2424

2525
### Perturbation-based Attacks
2626

27-
We provide adversarial images for perturbation-based attack setups in `./datasets/adv_img_*`
27+
#### Preparing adversarial images
28+
29+
**If you want to generate adversarial images on your own**, please refer to [this excellent repo](https://github.com/Unispac/Visual-Adversarial-Examples-Jailbreak-Large-Language-Models), which provides the code to attack LLaVA and MiniGPT-4. If you need code to attack Qwen2-VL, please send me an email ([email protected]).
30+
31+
In constructing the steering vectors, we generate universal adversarial images using PGD without explicitly providing malicious instructions. During evaluation, our defense approach can effectively defend against perturbation-based attacks regardless of whether attackers include malicious instructions when generating the adversarial images.
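
For readers unfamiliar with PGD, the following is a minimal sketch of how a single universal perturbation can be optimized over several images under an L-infinity budget. It is not the code used in this repo or in the linked one: the toy linear scorer (`score`), the function name `pgd_universal`, and the values of `eps`, `alpha`, and `steps` are all assumptions made for illustration. The scorer stands in for the vision-language model loss that a real attack would differentiate through.

```python
import math


def clamp(value, low, high):
    return max(low, min(high, value))


def sign(value):
    return (value > 0) - (value < 0)


def score(weights, bias, image, delta):
    """Toy stand-in for the attacked model: sigmoid of a linear score.

    The perturbation is added to the image and pixels are kept in [0, 1].
    """
    perturbed = [clamp(p + d, 0.0, 1.0) for p, d in zip(image, delta)]
    z = sum(w * x for w, x in zip(weights, perturbed)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def pgd_universal(images, weights, bias, eps=0.1, alpha=0.02, steps=20):
    """Hypothetical helper: one perturbation shared by all images.

    Each step ascends the summed log-score with a signed gradient step,
    then projects the perturbation back onto the L-infinity ball of radius eps.
    """
    delta = [0.0] * len(weights)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for image in images:
            s = score(weights, bias, image, delta)
            # d/d(delta_i) of log(sigmoid(z)) is (1 - s) * w_i.
            # The pixel clamp is ignored in this gradient for simplicity.
            for i, w in enumerate(weights):
                grad[i] += (1.0 - s) * w
        delta = [
            clamp(d + alpha * sign(g), -eps, eps)
            for d, g in zip(delta, grad)
        ]
    return delta
```

In a real setup the gradient would come from backpropagation through the model, and the objective would be the attack loss of your choice rather than this toy score.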
32+
33+
34+
**If you want to follow the setup in our paper**, we have already provided adversarial images for perturbation-based attack setups in `./datasets/adv_img_*`.
2835

2936
#### Toxicity
3037
