
META AI BLenDeR

BLenDeR: Blended Text Embeddings and Diffusion Residuals for Intra-Class Image Synthesis in Deep Metric Learning

GitHub · HuggingFace · arXiv

Jan Niklas Kolf 2,3,* · Ozan Tezcan 1 · Justin Theiss 1 · Hyung Jun Kim 1 · Wentao Bao 1 · Bhargav Bhushanam 1 · Khushi Gupta 1 · Arun Kejariwal 1 · Naser Damer 2,3 · Fadi Boutros 2

1 Meta Reality Labs   2 Fraunhofer IGD   3 Technical University of Darmstadt

*Work done while interning at Meta Reality Labs


🔖 Summary

This repository provides code for BLenDeR, an inference-time sampling method for personalized diffusion models that combines text embedding interpolation with novel set-theoretic operations (union, intersection, and difference) over denoising residuals to synthesize class-specific concepts with novel attributes.

In the paper, BLenDeR is used to increase intra-class diversity and thereby improve the downstream performance of deep metric learning models.
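
To make the two ingredients concrete, here is a minimal, purely illustrative sketch in PyTorch: the embedding interpolation is standard, but the mask-based union and difference below are hypothetical stand-ins for the residual set operations defined in the paper (see generate_with_blendr.py for the actual implementation).

import torch

def blend_text_embeddings(emb_a, emb_b, alpha=0.5):
    # Linearly interpolate between two prompt embeddings of shape [tokens, dim].
    return (1.0 - alpha) * emb_a + alpha * emb_b

# NOTE: hypothetical placeholders for the paper's residual set operations,
# shown only to convey the idea of combining two denoising residuals element-wise.
def residual_union(res_a, res_b):
    # Keep, per element, whichever residual has the larger magnitude.
    return torch.where(res_a.abs() >= res_b.abs(), res_a, res_b)

def residual_difference(res_a, res_b):
    # Keep res_a only where it dominates res_b; zero it out elsewhere.
    return torch.where(res_a.abs() > res_b.abs(), res_a, torch.zeros_like(res_a))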

🔎 Overview

BLenDeR overview.


🚀 Getting Started

Installation

We use conda for Python package management. Run the provided env_setup.sh to set up the environment:

bash env_setup.sh
conda activate blender

Pretrained weights

We provide pretrained LoRA weights and Textual Inversion embeddings for Stable Diffusion 1.5 to synthesize bird images. Our example script generate_with_blendr.py shows how to load the pretrained weights and Textual Inversion embeddings.

The required text token to use in a prompt can be obtained from learned_embeds-steps-20000.bin:

import torch

# Inspect the learned Textual Inversion tokens stored in the checkpoint.
ti_embeddings = torch.load("learned_embeds-steps-20000.bin")
ti_token_strs = list(ti_embeddings["learned_embeds_dict"].keys())
print(ti_token_strs)  # e.g. ['<bird_001>', '<bird_002>', ...]

The base prompt used to synthesize an image is "a photo of a <TI Token Str> bird.".
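
As a rough end-to-end sketch with the diffusers library (the base model ID, the LoRA path, and the exact loading logic are assumptions; generate_with_blendr.py is the authoritative reference):

import torch
from diffusers import StableDiffusionPipeline

# Assumed base model ID and LoRA path.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/blender_lora")  # hypothetical path

# Register the learned Textual Inversion tokens with the tokenizer and text encoder.
ti_embeddings = torch.load("learned_embeds-steps-20000.bin")
for token, emb in ti_embeddings["learned_embeds_dict"].items():
    pipe.tokenizer.add_tokens(token)
    pipe.text_encoder.resize_token_embeddings(len(pipe.tokenizer))
    token_id = pipe.tokenizer.convert_tokens_to_ids(token)
    pipe.text_encoder.get_input_embeddings().weight.data[token_id] = emb.reshape(-1)

# '<bird_001>' is an example token; use one of the tokens printed above.
image = pipe("a photo of a <bird_001> bird.").images[0]
image.save("bird_001.png")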

Data Structure

The expected structure of the dataset is <dataset-folder>/{train, test}/<class-folder>, where each <class-folder> contains a metadata.jsonl file (required keys: file_name, class_name, class_label) and the corresponding image files, following the Huggingface dataset format.
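
For illustration, each line of a metadata.jsonl file is one JSON record per image; the file and class names below are hypothetical:

import json

record = {"file_name": "img_0001.jpg", "class_name": "cardinal", "class_label": 17}
with open("dataset/train/class_017/metadata.jsonl", "a") as f:
    f.write(json.dumps(record) + "\n")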

Preprocessing

First, extract image annotations using utils/preprocessing/get_llava_detailed_description.py.

Then, calculate the pairwise image annotation similarities required for data generation using utils/preprocessing/compute_llava_prompt_similarities.py.
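
As an illustration of what the second step computes (the actual script may use a different text encoder and similarity measure), pairwise similarities between per-image descriptions can be obtained by embedding each description and comparing the embeddings:

from sentence_transformers import SentenceTransformer, util

# Hypothetical LLaVA descriptions extracted in the previous step.
descriptions = [
    "a small red bird with a black face perched on a branch",
    "a red bird with a crest standing in the snow",
]

# sentence-transformers is used here purely as an example encoder; the
# repository's compute_llava_prompt_similarities.py may differ.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(descriptions, convert_to_tensor=True)
similarity_matrix = util.cos_sim(embeddings, embeddings)  # shape [N, N]
print(similarity_matrix)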

Run data generation

After setup, run generate_with_blendr.py to generate data with either the union or the difference residual set operation.


📄 License

This project is licensed under CC-BY-NC 4.0.


📚 Citing BLenDeR

If you find this repository useful, please consider giving a ⭐ and citing:

@article{kolf2026blender,
  title={BLenDeR: Blended Text Embeddings and Diffusion Residuals for Intra-Class Image Synthesis in Deep Metric Learning},
  author={Kolf, Jan Niklas and Tezcan, Ozan and Theiss, Justin and Kim, Hyung Jun and Bao, Wentao and Bhushanam, Bhargav and Gupta, Khushi and Kejariwal, Arun and Damer, Naser and Boutros, Fadi},
  journal={arXiv preprint arXiv:2601.20246},
  year={2026}
}
