A deep learning project inspired by the research paper "MMFL: Multimodal Fusion Learning for Text-Guided Image Inpainting". This repository provides an interactive interface for text-guided image inpainting: users mask a region of an image and describe, in natural language, how it should be filled in.
The Brush of Spells fuses image and text features to restore or edit images based on user-provided text prompts, generating contextually accurate and semantically meaningful inpainted results. The system is designed for artists, researchers, and developers interested in advanced image editing using natural language.
The project implements a multimodal approach for image inpainting, integrating both visual information (the masked image) and textual guidance (user prompt). The core idea, as proposed in MMFL, is to imitate a painter’s conjecture process: the model uses the text description to provide abundant guidance for image restoration, fusing multimodal features to generate plausible and context-aware inpainting results.
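To make the fusion idea concrete, the sketch below shows one way a text-image fusion block could look in PyTorch. This is illustrative only and is not MMFL's exact architecture; the module name `TextImageFusion`, the feature dimensions, and the residual cross-attention design are assumptions chosen to show how textual guidance can steer image features.

```python
import torch
import torch.nn as nn

class TextImageFusion(nn.Module):
    """Illustrative cross-attention block (not MMFL's exact module):
    image features attend to word-level text features so the prompt
    can guide reconstruction of the masked region."""

    def __init__(self, img_dim=256, txt_dim=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(
            embed_dim=img_dim, kdim=txt_dim, vdim=txt_dim,
            num_heads=n_heads, batch_first=True,
        )
        self.norm = nn.LayerNorm(img_dim)

    def forward(self, img_feats, txt_feats):
        # img_feats: (B, H*W, img_dim) flattened spatial features
        # txt_feats: (B, T, txt_dim) per-word text embeddings
        fused, _ = self.attn(query=img_feats, key=txt_feats, value=txt_feats)
        return self.norm(img_feats + fused)  # residual fusion
```

In a full inpainting network, a block like this would sit inside the generator so that each spatial location can attend to the most relevant words in the prompt.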
Workflow:
- User uploads an image and specifies the width and height of the mask region to be edited.
- User provides a text prompt describing the desired content for the masked area.
- The system fuses the image and text features using a multimodal deep learning model.
- The masked region is filled in according to the prompt, producing a visually coherent and semantically relevant output (see the illustrative sketch after this list).
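As a concrete illustration of this workflow, here is a minimal sketch built on an off-the-shelf diffusion inpainting backend from Hugging Face `diffusers`. This backend, the checkpoint name, the mask dimensions, and the prompt are assumptions for the example; the actual model used in this repository may differ.

```python
import torch
from PIL import Image, ImageDraw
from diffusers import StableDiffusionInpaintPipeline

# Load a pretrained text-guided inpainting model (assumed backend for this sketch).
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("input.jpg").convert("RGB").resize((512, 512))

# Build a binary mask from user-supplied width/height: white = region to fill.
mask_w, mask_h = 200, 150  # hypothetical user inputs
mask = Image.new("L", image.size, 0)
left = (image.width - mask_w) // 2
top = (image.height - mask_h) // 2
ImageDraw.Draw(mask).rectangle([left, top, left + mask_w, top + mask_h], fill=255)

# Fuse image + text: the prompt guides what the masked region becomes.
result = pipe(prompt="a vase of sunflowers on the table",
              image=image, mask_image=mask).images[0]
result.save("inpainted.png")
```

Centering the mask rectangle is an arbitrary choice for this sketch; the app derives the mask region from the user-supplied width and height.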
Key Features:
- Text-guided inpainting: Restore or edit images by describing changes in natural language.
- Interactive interface: Upload images, draw masks, and enter prompts in a user-friendly web app.
- Multimodal fusion: Combines visual and textual cues for context-aware restoration.
- Research-grounded approach: Builds on MMFL and recent advances in diffusion-based inpainting models.
Below are screenshots and sample results from the interactive interface:
Interface Screenshot:
Installation:
- Clone this repository:

```bash
git clone https://github.com/IEEE-NITK/text-guided-image-inpainting.git
cd text-guided-image-inpainting
```

- Install dependencies:

```bash
pip install -r demo/requirements.txt
```
To launch the interactive inpainting interface locally:
```bash
python demo/app.py
```
How to use:
- Upload an image.
- Select the required height and width of the mask.
- Enter a text prompt describing the desired content.
- Click "Generate Inpainted Image" to generate the result.
The interface is built with Streamlit for ease of use and rapid prototyping.
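For reference, a Streamlit front end for these steps might be sketched as follows. The widget labels and the `run_inpainting` stub are hypothetical and not the exact code in `demo/app.py`:

```python
import streamlit as st
from PIL import Image

def run_inpainting(image, mask_w, mask_h, prompt):
    # Hypothetical placeholder: the real app would call the
    # multimodal inpainting model here.
    return image

st.title("The Brush of Spells: Text-Guided Image Inpainting")

# 1. Upload an image.
uploaded = st.file_uploader("Upload an image", type=["png", "jpg", "jpeg"])

# 2. Select mask dimensions.
mask_w = st.slider("Mask width (px)", 16, 512, 128)
mask_h = st.slider("Mask height (px)", 16, 512, 128)

# 3. Describe the desired content for the masked region.
prompt = st.text_input("Describe what should appear in the masked region")

# 4. Run the model and show the result.
if uploaded and prompt and st.button("Generate Inpainted Image"):
    image = Image.open(uploaded).convert("RGB")
    result = run_inpainting(image, mask_w, mask_h, prompt)
    st.image(result, caption="Inpainted result")
```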
Reference:
- Lin, Qing, et al. "MMFL: Multimodal Fusion Learning for Text-Guided Image Inpainting." Proceedings of the 28th ACM International Conference on Multimedia, 2020.
Contributions and feedback are welcome!
“This paper imitates the process of painters' conjecture, and proposes to introduce the text description into the image inpainting task for the first time, which provides abundant guidance information for image restoration through the fusion of multimodal features.”