Tracking issue for SD ecosystem feature parity #69

Open
@Keavon

The intention for this issue is to provide a comprehensive outline of all the core features and capabilities that other distributions of Stable Diffusion (primarily A1111) provide. It's a big list, and the items vary widely in priority. Some items in this outline will be turned into GitHub issues for discussing and tracking implementation progress. Please comment on this issue to suggest additions, clarifications, and sub-features, and I'll aim to keep the outline up to date.

Generation methods

  • Txt2img
  • Img2img
    • In/outpainting
      • Choice of starting with existing image, smeared surrounding colors, latent noise, and latent nothing
    • Denoising strength (is this already implemented?)
  • Depth2Img (via txt2img and img2img)
  • Regional prompts/latent couple/two shot diffusion (a unique prompt per grid area, like the left half and right half of the image)
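
To make the regional prompts item more concrete, here is a minimal sketch of how a "left half / right half" layout could be expressed as one rectangle per prompt in latent coordinates. It is only an illustration: the `Region` type and `split_into_regions` function are hypothetical names, not part of this library or of A1111, and the factor of 8 assumes the usual Stable Diffusion VAE downsampling.

```python
from dataclasses import dataclass

# Assumption: the VAE downsamples by 8x per dimension, as in common SD models.
LATENT_SCALE = 8


@dataclass(frozen=True)
class Region:
    """Hypothetical container: one prompt and its latent-space rectangle."""
    prompt: str
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive


def split_into_regions(width, height, prompts, columns):
    """Assign each prompt to one cell of a grid, in row-major order."""
    if columns <= 0 or len(prompts) % columns != 0:
        raise ValueError("prompt count must fill the grid exactly")
    rows = len(prompts) // columns
    latent_w, latent_h = width // LATENT_SCALE, height // LATENT_SCALE
    regions = []
    for index, prompt in enumerate(prompts):
        row, col = divmod(index, columns)
        regions.append(Region(
            prompt,
            col * latent_w // columns,
            row * latent_h // rows,
            (col + 1) * latent_w // columns,
            (row + 1) * latent_h // rows,
        ))
    return regions


# Left half gets one prompt, right half gets another.
halves = split_into_regions(512, 512, ["a castle", "a forest"], columns=2)
```

A real implementation would then blend the per-prompt noise predictions using these rectangles as masks; that part is left out here because it depends on the pipeline internals.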

Generation parameters

  • Viewing the image generation progress as it runs (this is very high priority for Graphite)
  • Negative prompts
  • CFG scale (is this already implemented?)
  • Non-square multiple-of-64 resolutions
    • Widths and heights as multiples of 8 instead of 64
  • Infinite prompt token length
  • Multiple prompts (like space ship (sci-fi) vs. space AND ship (sailing ship in space))
  • Prompt token weighting (like (beautiful:1.5) tree (with autumn leaves:0.8))
  • Seed resize (pin a seed and its resolution, then generate at a different resolution or aspect ratio and keep mostly the same image)
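
As a rough illustration of the prompt token weighting item, the sketch below splits a prompt such as `(beautiful:1.5) tree (with autumn leaves:0.8)` into text segments with weights. It is deliberately simplified and is not A1111's actual parser: `parse_weighted_prompt` is a hypothetical helper, it only understands the explicit `(text:weight)` form, and it ignores nesting, bare parentheses, and square brackets.

```python
import re

# Matches "(some text:1.5)". Nested parentheses are not handled in this sketch.
_WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weighted_prompt(prompt):
    """Return a list of (text, weight) pairs; unweighted text gets weight 1.0."""
    segments = []
    position = 0
    for match in _WEIGHTED.finditer(prompt):
        plain = prompt[position:match.start()].strip()
        if plain:
            segments.append((plain, 1.0))
        segments.append((match.group(1).strip(), float(match.group(2))))
        position = match.end()
    tail = prompt[position:].strip()
    if tail:
        segments.append((tail, 1.0))
    return segments


segments = parse_weighted_prompt("(beautiful:1.5) tree (with autumn leaves:0.8)")
# segments == [("beautiful", 1.5), ("tree", 1.0), ("with autumn leaves", 0.8)]
```

How the weights are then applied (for example, by scaling the corresponding text-encoder embeddings) is a separate design decision that this sketch does not cover.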

Model support

  • Stable Diffusion model formats
    • SD 1.4 (is this already implemented?)
    • SD 1.5
    • SD 2.0 (is this already implemented?)
    • SD 2.1
    • SDXL
  • Inpainting-specific models
    • "Inpainting conditioning mask strength" parameter
  • Instruct-Pix2Pix
  • Custom checkpoints/models

Stylization

  • LoRA
  • Hypernetworks
  • Textual Inversion
  • Dreambooth
  • Swappable VAEs (is this already implemented?)

ControlNet

Some features are described at https://github.com/Mikubill/sd-webui-controlnet but I don't currently have time to make a list of them. Help with such a list would be appreciated.

Optimizations

VRAM reduction strategies, such as xformers and reduced floating point precision? I don't understand this area well enough to list them properly. Other methods unload parts of the pipeline from VRAM once their stage has completed, which trades time for lower VRAM requirements. I'll need help creating a list for this.
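
To give a feel for the two ideas mentioned above, here is a back-of-the-envelope sketch of how precision and stage unloading affect the memory needed for model weights. The parameter counts are hypothetical round numbers chosen for illustration, the function names are made up, and activations and attention memory (which is what xformers targets) are not modeled at all.

```python
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weights_mib(param_count, precision):
    """Memory taken by the weights alone, in MiB."""
    return param_count * BYTES_PER_PARAM[precision] / 2**20


def peak_weights_mib(stage_params, precision, offload):
    """Peak weight memory across the pipeline.

    Without offloading every stage stays resident, so the sizes add up.
    With offloading only one stage is resident at a time, so the peak is
    the largest stage (at the cost of reloading stages, which takes time).
    """
    sizes = [weights_mib(count, precision) for count in stage_params.values()]
    return max(sizes) if offload else sum(sizes)


# Hypothetical parameter counts, for illustration only.
stages = {"text_encoder": 100_000_000, "unet": 800_000_000, "vae": 80_000_000}

for precision in ("fp32", "fp16"):
    for offload in (False, True):
        peak = peak_weights_mib(stages, precision, offload)
        print(f"{precision} offload={offload}: {peak:.0f} MiB")
```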

Upscaling

Some upscalers are entirely separate models and are thus likely out of scope. Others, I think, are part of the SD pipeline: some are scripts, but I think others are actual models that need to be implemented in the pipeline itself. Those should probably be included here, but I need help creating a list.

Sampling methods

  • Euler a
  • Euler
  • LMS
  • Heun
  • DPM2
  • DPM2 a
  • DPM++ 2S a
  • DPM++ 2M
  • DPM++ SDE
  • DPM fast
  • DPM adaptive
  • LMS Karras
  • DPM2 Karras
  • DPM2 a Karras
  • DPM++ 2S a Karras
  • DPM++ 2M Karras
  • DPM++ SDE Karras
  • DDIM
  • PLMS
  • UniPC
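
For context on the "Karras" variants in the list above: they use the same sampler with a different noise level schedule. The sketch below shows that schedule and a single plain Euler step on lists of floats. The function names are hypothetical, and the default `sigma_min` and `sigma_max` are placeholders rather than values taken from any particular model; `rho = 7` is the value commonly used with this schedule.

```python
def karras_sigmas(n, sigma_min=0.1, sigma_max=10.0, rho=7.0):
    """Noise levels from sigma_max down to sigma_min, spaced per the Karras schedule.

    Interpolates linearly in sigma**(1/rho) space, which concentrates
    steps at low noise levels.
    """
    if n < 2:
        raise ValueError("need at least two noise levels")
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    return [
        (max_inv_rho + i / (n - 1) * (min_inv_rho - max_inv_rho)) ** rho
        for i in range(n)
    ]


def euler_step(x, denoised, sigma, sigma_next):
    """One deterministic Euler step from noise level sigma to sigma_next.

    `denoised` stands in for the model's prediction of the clean sample.
    """
    derivative = [(xi - di) / sigma for xi, di in zip(x, denoised)]
    return [xi + (sigma_next - sigma) * gi for xi, gi in zip(x, derivative)]


sigmas = karras_sigmas(5)
```

The ancestral ("a") and SDE samplers add fresh noise at each step on top of this, which is not shown here.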

Other models

  • Upscaling (ESRGAN, etc.)
  • CLIP interrogator
  • Restore faces (GFPGAN, CodeFormer)
  • (probably more?)

Did I miss something? Probably! Hopefully the community can help me keep this list updated so it's as comprehensive as possible. Thanks ❤️.

Ideally these capabilities would be modular, allowing for composability and opting in and out of specific features at will for any desired image generation pipeline. In our use case with Graphite, we want to put different options into nodes within a node graph so they are user-configurable. (I should also mention that keeping the MIT/Apache 2.0 license is important for Graphite, since our project is also Apache 2.0, so I'd humbly request that some care be taken to not copy from copyleft code which would force this library to change its license, thanks 😃).
