Westlake-AGI-Lab/SafetyBPO

SafetyBPO: Bidirectional Preference Optimization for Safe Text-to-Image Generation

Methodology

SafetyBPO is a diffusion unlearning framework that uses Bidirectional Preference Optimization (BPO) to reformulate safety alignment. It enables fine-grained control of generated outputs through dual-view supervision and positive–negative guidance.

Usage

Installation

Create and activate a conda environment:

conda create -n safetybpo python=3.10
conda activate safetybpo

Install the required packages:

pip install -r requirements.txt

Training

bash train.sh

Inference

python inference.py \
    --pos_model_path 'real-outputs/pos' \
    --neg_model_path 'real-outputs/neg' \
    --save_path /save_path \
    --prompts_path /prompts_path
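
The inference command loads two checkpoints, one positive and one negative. How `inference.py` combines them is not shown in this README. The sketch below illustrates one plausible form of positive–negative guidance, assuming a classifier-free-guidance-style combination in which the negative model supplies the unconditional branch (the approach used by Diffusion-NPO). The function name `bidirectional_guidance` and the default scale are hypothetical and do not come from the repository.

```python
from typing import List, Sequence


def bidirectional_guidance(
    eps_pos_cond: Sequence[float],
    eps_neg_uncond: Sequence[float],
    guidance_scale: float = 7.5,  # assumed default, not taken from the repo
) -> List[float]:
    """Combine two noise predictions in classifier-free-guidance form.

    eps_pos_cond:   prompt-conditioned prediction from the positive model
    eps_neg_uncond: unconditional prediction from the negative model
    Assumption: this mirrors Diffusion-NPO-style guidance; the actual
    SafetyBPO inference code may differ.
    """
    if len(eps_pos_cond) != len(eps_neg_uncond):
        raise ValueError("noise predictions must have the same length")
    # Start from the negative branch and push toward the positive one.
    return [
        n + guidance_scale * (p - n)
        for p, n in zip(eps_pos_cond, eps_neg_uncond)
    ]


if __name__ == "__main__":
    # Toy 2-element "noise predictions" in place of real latent tensors.
    print(bidirectional_guidance([1.0, 2.0], [0.0, 1.0], guidance_scale=2.0))
```

With a guidance scale of 1 the result equals the positive model's prediction; larger scales move the output further away from what the negative model predicts.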

Evaluation

InPro

Evaluate harmful content suppression:

Step 1. Follow the instructions in the Q16 repository to generate the Q16 results for your images.

Step 2. Run the following command, replacing IMAGE_PATH and Q16_PATH with your own paths.

python test.py \
    --metrics 'inpro' \
    --target_folder IMAGE_PATH \
    --reference /Q16_PATH/sim_prompt_tuneddata/inappropriate_images.csv 

FID

Evaluate image fidelity:

python test.py \
    --metrics 'fid' \
    --target_folder IMAGE_PATH \
    --reference REFERENCE_IMAGE_PATH
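
For reference, the standard definition of the Fréchet Inception Distance is given below. This is the textbook formula, not an excerpt from `test.py`, whose implementation is not shown here. The symbols are introduced for illustration: $\mu_g, \Sigma_g$ denote the mean and covariance of Inception features for the generated images (IMAGE_PATH), and $\mu_r, \Sigma_r$ those for the reference images (REFERENCE_IMAGE_PATH).

```latex
\mathrm{FID}
  = \lVert \mu_g - \mu_r \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_g + \Sigma_r
      - 2 \left( \Sigma_g \Sigma_r \right)^{1/2} \right)
```

Lower values indicate that the generated images are statistically closer to the reference set.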

CLIP

Evaluate text alignment:

python test.py \
    --metrics 'clip' \
    --target_folder IMAGE_PATH \
    --reference PROMPT_PATH
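
As a rough illustration of what a CLIP-based text-alignment score measures, the sketch below computes a score from a pair of precomputed embeddings. It assumes the common convention of 100 times the cosine similarity, clipped at zero; the exact scaling used by `test.py` is not stated in this README, and the function names `cosine_similarity` and `clip_score` are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def clip_score(image_emb: Sequence[float], text_emb: Sequence[float]) -> float:
    """Assumed convention: 100 * max(cosine similarity, 0)."""
    return 100.0 * max(cosine_similarity(image_emb, text_emb), 0.0)


if __name__ == "__main__":
    # Toy 2-D vectors stand in for real CLIP image/text embeddings.
    print(clip_score([1.0, 0.0], [1.0, 0.0]))  # identical directions
    print(clip_score([1.0, 0.0], [0.0, 1.0]))  # orthogonal directions
```

In practice the embeddings would come from a CLIP image encoder and text encoder, and the per-image scores would be averaged over all image-prompt pairs.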

Acknowledgement

This project is built upon Diffusion-DPO and Diffusion-NPO.

Citation

If our work is useful for your research, please consider citing:

@inproceedings{wu2026safetybpo,
  title={SafetyBPO: Bidirectional Preference Optimization for Safe Text-to-Image Generation},
  author={Wu, You and Zhu, Beier and Zhang, Chi},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings},
  pages={4759--4768},
  year={2026}
}

About

[CVPR 2026 Findings] SafetyBPO: Bidirectional Preference Optimization for Safe Text-to-Image Generation
