This is the official model implementation and benchmark evaluation repository of **AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea**.
- Clone this repo

```shell
git clone https://github.com/weichow23/AnySD
```

- Environment setup

```shell
conda create -n anyedit python=3.9.2
conda activate anyedit
pip install -r requirements.txt
pip install --upgrade torch diffusers xformers triton pydantic deepspeed
pip install git+https://github.com/openai/CLIP torchmetrics==0.5
```
- For AnyBench, additionally run

```shell
bash anybench/setup.sh  # open the script and check it carefully to ensure the correct dependencies are installed
```
This is the guide for the AnyBench evaluation tool. The relevant files are located in the `anybench` directory.

We have integrated the evaluations for AnyBench, Emu-Edit, and MagicBrush into the same codebase, which supports the following models: Null-Text, Uni-ControlNet, InstructPix2Pix, MagicBrush, HIVE, and UltraEdit (SD3).

The evaluation metrics are CLIPim↑, CLIPout↑, L1↓, L2↓, and DINO↑.
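The pixel metrics are straightforward to reproduce. Below is a minimal sketch (not the repository's exact evaluation code) of the L1/L2 distances and the embedding cosine similarity that underlies CLIPim and DINO; in practice the embeddings would come from a CLIP or DINO image encoder:

```python
import torch

def l1_distance(pred: torch.Tensor, gt: torch.Tensor) -> float:
    """Mean absolute pixel difference between edited and reference images; lower is better."""
    return (pred - gt).abs().mean().item()

def l2_distance(pred: torch.Tensor, gt: torch.Tensor) -> float:
    """Mean squared pixel difference; lower is better."""
    return ((pred - gt) ** 2).mean().item()

def embedding_similarity(a: torch.Tensor, b: torch.Tensor) -> float:
    """Cosine similarity between two image embeddings
    (e.g. from a CLIP or DINO encoder); higher is better."""
    a = a / a.norm(dim=-1, keepdim=True)
    b = b / b.norm(dim=-1, keepdim=True)
    return (a * b).sum(dim=-1).mean().item()

if __name__ == "__main__":
    img = torch.rand(3, 256, 256)       # toy stand-in for a decoded image
    print(l1_distance(img, img))        # 0.0 for identical images
```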
Emu-Edit

```shell
CUDA_VISIBLE_DEVICES=7 PYTHONPATH='./' python3 anybench/eval/emu_gen_eval.py
```

It is worth noting that the Emu-Edit test actually uses the validation set from the Hugging Face repository facebook/emu_edit_test_set_generations; this point has been discussed in previous work.
MagicBrush

Download the test set from MagicBrush, unzip it into `anybench/dataset/magicbrush`, and then run:

```shell
CUDA_VISIBLE_DEVICES=7 PYTHONPATH='./' python3 anybench/eval/magicbrush_gen_eval.py
```
AnyBench

- Download the AnyBench test set

```shell
cd anybench/dataset/
gdown 1V-Z4agWoTMzAYkRJQ1BNz0-i79eAVWt4
sudo apt install unzip
unzip AnyEdit-Test.zip
```

- Run the evaluation

```shell
CUDA_VISIBLE_DEVICES=0 PYTHONPATH='./' python3 anybench/eval/anybench_gen_eval.py
```
⚠ Notice: AnySD may output completely black images for certain sensitive instructions; this is expected behavior.
⚠ Notice: During evaluation, the final scores may vary with the inference hyperparameters, random seeds, and batch size.

We cleaned and re-organized the AnyEdit data for the public release and retrained the model on the reorganized data, so the results will differ slightly from those in the paper, though the overall trends are the same. Hyperparameters also have a considerable impact on the results.
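Because the final scores depend on the random seed, pinning the seed of the initial noise makes runs comparable across machines. A minimal sketch with PyTorch generators (the actual sampling setup in `anysd/infer.py` may differ):

```python
import torch

def seeded_noise(seed: int, shape=(1, 4, 64, 64)) -> torch.Tensor:
    """Draw an initial diffusion latent from a fixed-seed generator,
    so repeated runs start from identical noise."""
    gen = torch.Generator(device="cpu").manual_seed(seed)
    return torch.randn(shape, generator=gen)

# Two runs with the same seed start from the same latent ...
a = seeded_noise(42)
b = seeded_noise(42)
assert torch.equal(a, b)

# ... while a different seed changes the starting point, which is one
# reason final scores can drift between evaluation runs.
c = seeded_noise(43)
assert not torch.equal(a, c)
```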
```shell
CUDA_VISIBLE_DEVICES=0 PYTHONPATH='./' python3 anysd/infer.py
```
Prepare Data

```shell
huggingface-cli download Bin1117/anyedit-split --repo-type dataset
```
- Stage I

```shell
bash train_stage1.sh
```

- Stage II

```shell
# before training, download AnyBench-test, which serves as the validation set
cd anybench/dataset/
gdown 1w_QsjDvNp-c9R1gaT5lex0esQAPRE1AQ
sudo apt install unzip
unzip AnyEdit-Test.zip
```
The experts included in AnySD are as follows:

```shell
# TYPE = ['visual_ref', 'visual_ske', 'visual_scr', 'visual_bbox', 'visual_mat', 'visual_seg', 'visual_dep', 'viewpoint', 'global']
bash train_stage2.sh
```
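Conceptually, each editing task is handled by one of the experts listed above, with a general expert covering the rest. The sketch below is purely illustrative: the task labels and dispatch logic are hypothetical, and AnySD's real routing is learned rather than a lookup table.

```python
# Expert names mirror the TYPE list above.
EXPERT_TYPES = [
    'visual_ref', 'visual_ske', 'visual_scr', 'visual_bbox',
    'visual_mat', 'visual_seg', 'visual_dep', 'viewpoint', 'global',
]

# Hypothetical mapping from an edit-task label to its expert.
TASK_TO_EXPERT = {
    'visual_reference': 'visual_ref',   # edit guided by a reference image
    'visual_sketch': 'visual_ske',      # edit guided by a sketch
    'visual_segment': 'visual_seg',     # edit guided by a segmentation map
    'visual_depth': 'visual_dep',       # edit guided by a depth map
    'viewpoint': 'viewpoint',           # camera/viewpoint change
}

def route(task: str) -> str:
    """Pick the expert for a task, falling back to the general expert."""
    expert = TASK_TO_EXPERT.get(task, 'global')
    assert expert in EXPERT_TYPES
    return expert

print(route('visual_sketch'))   # visual_ske
print(route('remove_object'))   # global (no dedicated expert)
```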
Since AnyEdit contains a wide range of editing instructions across various domains, it holds promising potential for developing a powerful editing model for high-quality editing tasks. However, training such a model poses three additional challenges: (a) aligning the semantics of various multi-modal inputs; (b) identifying the semantic edits within each domain to control the granularity and scope of the edits; (c) coordinating the complexity of various editing tasks to prevent catastrophic forgetting. To this end, we propose AnyEdit Stable Diffusion (🎨AnySD), a novel approach to cope with various editing tasks in the real world.
Architecture of 🎨AnySD. 🎨AnySD is a novel architecture that supports three conditions (original image, editing instruction, visual prompt) for various editing tasks.
💖 Our model is based on the awesome SD 1.5
```bibtex
@article{yu2024anyedit,
  title={AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea},
  author={Yu, Qifan and Chow, Wei and Yue, Zhongqi and Pan, Kaihang and Wu, Yang and Wan, Xiaoyang and Li, Juncheng and Tang, Siliang and Zhang, Hanwang and Zhuang, Yueting},
  journal={arXiv preprint arXiv:2411.15738},
  year={2024}
}
```