Conversation

@Kmoneal Kmoneal commented Apr 22, 2025

This is very similar to Diffusion, but instead of a seed it takes an image of the types specified by the model. For Stable Diffusion, the accepted types can be found here.

I'm happy to use this to kick off conversations on this topic as well.
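
For reference, a minimal sketch of preparing such an input image with PIL (the file name is a placeholder; Diffusers image-to-image pipelines generally accept PIL.Image.Image, numpy.ndarray, or torch.Tensor inputs):

from PIL import Image

# Load a placeholder file and convert it to RGB for use as the init image.
init_image = Image.open("init.png").convert("RGB")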

@ericluo04

Hi @Kmoneal - thank you very much for creating this PR!

Unfortunately, I've been having some trouble getting this implementation to work. Would you kindly be able to share a minimal reproducible example of how to do the following (e.g., with stabilityai/stable-diffusion-3.5-large):

  1. Store the activations of the residual stream (e.g., output of the transformer block at index 24), for any choice/range of timestep.
  2. Intervene on the activations of the above (e.g., by tripling the activation of a particular dimension), for any choice/range of timestep.

Any help would be much appreciated! Thanks again. :)


ericluo04 commented May 7, 2025

Figured it out! It turns out you can't specify the prompt with the keyword argument prompt = "..."; you have to pass it directly as the first positional argument. See below for extracting the residual stream of the 25th transformer block (index 24) in stabilityai/stable-diffusion-3.5-large for the first step. Note that init_image is of type PIL.Image.Image.

import nnsight

# pipe is the nnsight-wrapped stabilityai/stable-diffusion-3.5-large pipeline
# from this PR; init_image is a PIL.Image.Image.

# transformer block layers
layers = pipe.transformer.transformer_blocks

with pipe.generate("", negative_prompt="", guidance_scale=7.5,
                   image=init_image, width=832, height=1248,
                   strength=.5, num_inference_steps=4,
                   seed=None):
    # initialize lists to store activations
    res_stream = nnsight.list().save()
    res_stream_text = nnsight.list().save()
    res_stream_image = nnsight.list().save()

    # loop over the first step only; use layers.all() to run over all steps
    with layers.iter[0:1]:
        # output of block index 24: the residual stream (text and image streams)
        res_stream.append(layers[24].output)
        # input to block index 25's norm1_context (text stream), to check it matches the above
        res_stream_text.append(layers[25].norm1_context.input)
        # input to block index 25's norm1 (image stream), to check it matches the above
        res_stream_image.append(layers[25].norm1.input)
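
For item 2 of the earlier question (intervening on the activations), here is a minimal sketch under the same setup. It assumes that assigning into a block's .output proxy edits the forward pass, as it does for nnsight language models, and that the block output is a (text stream, image stream) tuple, as in Diffusers' SD3 joint transformer blocks; the dimension index and scaling factor are arbitrary illustrations, not values from this thread.

# Sketch: triple one dimension of the image-stream residual at block index 24
# on the first step, then save the edited slice for inspection.
with pipe.generate("", negative_prompt="", guidance_scale=7.5,
                   image=init_image, width=832, height=1248,
                   strength=.5, num_inference_steps=4,
                   seed=None):
    edited_stream = nnsight.list().save()

    with layers.iter[0:1]:
        # output[1] is assumed to be the image stream; dimension 100 is an arbitrary example
        scaled = layers[24].output[1][:, :, 100] * 3
        layers[24].output[1][:, :, 100] = scaled
        edited_stream.append(scaled)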

@nguyentr17

Hi, is it not possible to use the current DiffusionModel class for an image-to-image pipeline like Flux Kontext? I tried it, and it works for instructpix2pix, but for Flux Kontext it gives me the following error:
ValueError: Cannot return output of Envoy that is not interleaving nor has a fake output set.
