StyleMerge Diffusion: A training-free approach to prompted and artistically accurate image generation

About

StyleMerge Diffusion achieves artistically accurate image generation by transferring the visual style of a reference image to text-prompted content. It is built on Stable Diffusion 2.1 with the diffusers library (huggingface/diffusers) and requires no fine-tuning or extra training: style is transferred via attention key injection, and semantic alignment with the prompt is obtained via initial latent noise optimization.

Pipeline of our method:

Samples generated with StyleMerge Diffusion:

Usage

To run our code, please follow these steps:

Setup
Run StyleMerge
Evaluation

System Requirements

GPU with 16 GB VRAM (FP16 precision)
The diffusers library (huggingface/diffusers), which our implementation uses

Setup

Our codebase is built on Jiwoogit/StyleID and Xiefan-guo/initno.

Install the required packages in a native or virtual environment:

pip install -r requirements.txt

Run StyleMerge

To run StyleMerge, use:

cd diffusers_implementation/
python3 run_styleid_diffusers.py \
  --style_prompt None \
  --gamma 0.9 \
  --start 0 \
  --timestep_thr 376 \
  --ddim_steps 40 \
  --save_dir ./output \
  --sty_fn './data_vis/sty/flowersanime.png' \
  --prompt "a rabbit and a turtle" \
  --seed 42 \
  --token_indices '[2,5]' \
  --initno
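
The values passed to --token_indices appear to be the positions of the subject words in the tokenized prompt. The sketch below shows how [2,5] can be derived for the example prompt, assuming a CLIP-style convention in which position 0 is a start-of-text token and every word maps to exactly one token. Both points are assumptions: a real tokenizer may split rare words into several pieces, and the helper name `subject_token_indices` is hypothetical and not part of this repository.

```python
def subject_token_indices(prompt, subjects):
    """Return the assumed token positions of `subjects` inside `prompt`.

    Assumption: position 0 is a start-of-text token, so the first word
    of the prompt gets index 1, the second index 2, and so on.
    """
    words = prompt.lower().split()
    indices = []
    for subject in subjects:
        # list.index raises ValueError if the subject is not in the prompt.
        indices.append(words.index(subject.lower()) + 1)
    return indices


if __name__ == "__main__":
    print(subject_token_indices("a rabbit and a turtle", ["rabbit", "turtle"]))
```

For prompts with unusual or compound words, it is safer to inspect the output of the actual tokenizer than to rely on whitespace splitting.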

To tune the style transfer, you can adjust the following parameters:

The timestep of style injection is controlled by the --timestep_thr parameter.
Attention-based style injection is disabled by the --without_attn_injection flag.
Query preservation is controlled by the --gamma parameter (a higher value improves content fidelity but may reduce style fidelity).
Attention temperature scaling is controlled by the --T parameter.
Initial latent AdaIN is disabled by the --without_init_adain flag.
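
To make the roles of --gamma and --T more concrete, here is a minimal pure-Python sketch of query preservation and temperature-scaled attention over injected style keys and values. It follows the formulation used in StyleID as we understand it (blended query, softmax over scaled dot products); the function names are hypothetical and the repository's actual implementation may differ in detail.

```python
import math


def blend_queries(q_content, q_style, gamma):
    """Query preservation: gamma * content query + (1 - gamma) * style query."""
    return [gamma * qc + (1.0 - gamma) * qs for qc, qs in zip(q_content, q_style)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def style_attention(query, style_keys, style_values, temperature=1.0):
    """Attend from one query vector to the injected style keys and values.

    The dot products are scaled by temperature / sqrt(d), so a larger
    temperature gives a sharper attention distribution.
    """
    scale = temperature / math.sqrt(len(query))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in style_keys]
    weights = softmax(scores)
    dim = len(style_values[0])
    return [sum(w * v[i] for w, v in zip(weights, style_values)) for i in range(dim)]


if __name__ == "__main__":
    q = blend_queries([1.0, 0.0], [0.0, 1.0], gamma=0.9)
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[1.0], [0.0]]
    print(style_attention(q, keys, values, temperature=1.0))
    print(style_attention(q, keys, values, temperature=4.0))
```

With gamma close to 1 the blended query stays near the content query, which is why higher values favor content fidelity over style fidelity.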

Evaluation

For quantitative evaluation, we report FID as well as CMMD (sayakpaul/cmmd-pytorch), a metric that offers a more complete picture than FID.

CMMD

Run:

cd evaluation
python3 ./cmmd-pytorch/main.py \
  ./cmmd-pytorch/reference_images/pixelart/ \
  ../results/flowersanime \
  --batch_size=32 \
  --max_count=30000
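
For intuition about what the command above measures: CMMD, as we understand it, is a maximum mean discrepancy (MMD) with a Gaussian kernel computed between CLIP embeddings of the reference and generated images. The sketch below computes a biased squared-MMD estimate on toy vectors. It is not the cmmd-pytorch implementation: the embedding step is omitted, the function names are hypothetical, and the kernel bandwidth `sigma` is an arbitrary illustrative value.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def squared_mmd(xs, ys, sigma=1.0):
    """Biased estimate of the squared MMD between two sets of vectors."""
    return (mean_kernel(xs, xs, sigma)
            + mean_kernel(ys, ys, sigma)
            - 2.0 * mean_kernel(xs, ys, sigma))


if __name__ == "__main__":
    reference = [[0.0, 0.0], [1.0, 0.0]]
    generated = [[5.0, 5.0], [6.0, 5.0]]
    print(squared_mmd(reference, reference))  # identical sets
    print(squared_mmd(reference, generated))  # well-separated sets
```

A lower value indicates that the two sets of embeddings are distributed more similarly.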

FID

Run:

cd evaluation;
./evaluation/eval.sh

Acknowledgements

The authors gratefully acknowledge the use of resources (GPU Cluster) provided by the National Infrastructures for Research and Technology (GRNET).
