LLaDA2.0-Uni ComfyUI Nodes

Custom ComfyUI nodes for LLaDA2.0-Uni, a unified multimodal diffusion language model that supports text-to-image generation, image understanding (VQA), and image editing.

Installation

⚠️ These nodes depend on the encoder/ and decoder/ modules in the project root. Do not copy apps/comfyui in isolation — the full repository must be present and the relative path apps/comfyui must be preserved.

Option 1: Clone + symlink (recommended)

# 1. Clone the full project
git clone https://github.com/inclusionAI/LLaDA2.0-Uni.git

# 2. Symlink into ComfyUI's custom_nodes
cd /path/to/ComfyUI/custom_nodes
ln -s /path/to/LLaDA2.0-Uni/apps/comfyui ./LLaDA2Uni
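
To confirm the symlink satisfies the requirement above (full repository present, apps/comfyui path preserved), a small check like the following may help. This is a minimal sketch and not part of the repository: the function name check_install is made up for illustration, and it only looks for the encoder/ and decoder/ directories mentioned in the warning.

```python
from pathlib import Path


def check_install(link: Path) -> list[str]:
    """Return a list of problems with a custom_nodes entry (empty if none).

    Hypothetical helper for illustration; not shipped with the nodes.
    """
    target = link.resolve()  # follow the symlink to the real directory
    if not target.is_dir():
        return [f"{link} does not resolve to a directory"]
    # The relative path apps/comfyui must be preserved.
    if target.parts[-2:] != ("apps", "comfyui"):
        return [f"{target} is not an apps/comfyui directory"]
    problems = []
    root = target.parents[1]  # project root sits two levels above apps/comfyui
    for module in ("encoder", "decoder"):
        if not (root / module).is_dir():
            problems.append(f"missing {module}/ in project root {root}")
    return problems
```

Calling check_install(Path("/path/to/ComfyUI/custom_nodes/LLaDA2Uni")) should return an empty list when the link points into a complete clone.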

Option 2: One-line installer

bash /path/to/LLaDA2.0-Uni/apps/comfyui/install.sh /path/to/ComfyUI

Dependencies

# Run from the LLaDA2.0-Uni repository root
pip install -r apps/comfyui/requirements.txt
pip install flash-attn --no-build-isolation  # optional, recommended

Model Weights

In the Loader node, set the model path to either a HuggingFace repo ID or a local directory:

HuggingFace (auto-download):

inclusionAI/LLaDA2.0-Uni

Local path:

/path/to/LLaDA2.0-Uni

Expected directory layout:

LLaDA2.0-Uni/
├── config.json                       # LLM config
├── model-*.safetensors               # LLM weights
├── tokenizer.json
├── decoder/
│   ├── config.json
│   └── model.safetensors             # diffusion decoder
├── decoder-turbo/
│   ├── config.json
│   └── model.safetensors             # turbo decoder (8-step)
├── vae/
│   └── diffusion_pytorch_model.safetensors
└── image_tokenizer/
    ├── config.json
    ├── preprocessor_config.json
    ├── model.safetensors             # SigLIP-VQ weights
    └── sigvq_embedding.pt
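
Before pointing the Loader at a local directory, it can be useful to check that the files listed above are present. The sketch below is an illustration only: EXPECTED simply transcribes the layout shown here, and missing_files is a hypothetical helper, not something the nodes provide.

```python
from pathlib import Path

# Transcribed from the directory layout above; the LLM weight shards
# are matched with a glob because their exact names vary.
EXPECTED = [
    "config.json",
    "model-*.safetensors",
    "tokenizer.json",
    "decoder/config.json",
    "decoder/model.safetensors",
    "decoder-turbo/config.json",
    "decoder-turbo/model.safetensors",
    "vae/diffusion_pytorch_model.safetensors",
    "image_tokenizer/config.json",
    "image_tokenizer/preprocessor_config.json",
    "image_tokenizer/model.safetensors",
    "image_tokenizer/sigvq_embedding.pt",
]


def missing_files(model_dir: str) -> list[str]:
    """Return the expected entries that match nothing under model_dir."""
    root = Path(model_dir)
    return [pattern for pattern in EXPECTED if not any(root.glob(pattern))]
```

An empty result from missing_files("/path/to/LLaDA2.0-Uni") means every entry in the layout was found; it does not verify that the files themselves are valid.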

Nodes

| Node | Description |
| --- | --- |
| LLaDA2.0_Uni Loader | Load the model (Flash Attention / SDPA, optional CPU offload) |
| LLaDA2.0_Uni Text-to-Image | Generate VQ image tokens from a text prompt (supports thinking mode) |
| LLaDA2.0_Uni Image Understanding | Visual question answering |
| LLaDA2.0_Uni Image Editing | Edit an image with a text instruction |
| LLaDA2.0_Uni Token Decoder | Decode VQ tokens to pixels (turbo or normal mode) |
| LLaDA2.0_Uni Unload Model | Manually free VRAM |

Example Workflows

Text-to-Image

Loader → Text-to-Image → Token Decoder → Preview Image

Image Understanding

Load Image + Loader → Image Understanding → Show Text

Image Editing

Load Image + Loader → Image Editing → Token Decoder → Preview Image

Parameters

Loader

model_path — HuggingFace repo ID or local directory
attention — flash_attn (recommended) or sdpa
dtype — bf16 (recommended) or fp8
offload — enable CPU offload for limited VRAM
device — cuda or cpu
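
Because flash-attn is an optional dependency, one way to choose a value for attention is to check whether the flash_attn package can be imported and fall back to sdpa otherwise. The sketch below assumes this fallback policy for illustration; pick_attention is a hypothetical helper, and the Loader node may handle a missing flash-attn differently.

```python
import importlib.util


def pick_attention(preferred: str = "flash_attn") -> str:
    """Return an attention setting that should be usable in this environment.

    Hypothetical helper: falls back to "sdpa" when flash-attn is not installed.
    """
    if preferred not in ("flash_attn", "sdpa"):
        raise ValueError(f"unknown attention backend: {preferred!r}")
    # find_spec returns None when the top-level package is not installed.
    if preferred == "flash_attn" and importlib.util.find_spec("flash_attn") is None:
        return "sdpa"
    return preferred
```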

Text-to-Image

prompt — text description
width / height — output resolution
steps — LLM denoising steps (8–32)
cfg_scale — classifier-free guidance scale
mode — standard or thinking
seed — random seed (-1 = random)
block_length — block size for block-wise denoising
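
The seed and steps conventions above can be sketched in a few lines. This is an assumption-laden illustration, not the node's code: the 32-bit range for drawn seeds is a guess, and treating 8–32 as a hard limit (rather than a recommended range) is also an assumption.

```python
import random


def resolve_seed(seed: int) -> int:
    """Map the -1 sentinel to a freshly drawn seed; pass other values through."""
    if seed == -1:
        # Assumed range; the actual node may draw from a different one.
        return random.randrange(2**32)
    if seed < 0:
        raise ValueError("seed must be -1 (random) or non-negative")
    return seed


def check_steps(steps: int, low: int = 8, high: int = 32) -> int:
    """Validate steps against the documented 8-32 range (assumed to be a hard limit)."""
    if not low <= steps <= high:
        raise ValueError(f"steps must be between {low} and {high}, got {steps}")
    return steps
```

Logging the value returned by resolve_seed makes a "random" run reproducible later by passing that value back in as seed.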

Token Decoder

decode_mode — decoder-turbo (fast, 8 steps) or normal (50 steps)
decoder_steps — number of steps when using normal mode
resolution_multiplier — upscale factor (typically 2)
unload_after — release decoder VRAM after decoding (set False to keep cached for faster repeated decodes)
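
The relationship between decode_mode and decoder_steps described above can be summarized as follows. This is only a sketch of the documented behavior (8 steps for decoder-turbo, decoder_steps for normal with 50 as the stated figure); effective_decoder_steps is a hypothetical name, not a function in the nodes.

```python
def effective_decoder_steps(decode_mode: str, decoder_steps: int = 50) -> int:
    """Return the number of decoder steps implied by the two parameters."""
    if decode_mode == "decoder-turbo":
        return 8  # the turbo decoder runs a fixed 8 steps
    if decode_mode == "normal":
        return decoder_steps  # only consulted in normal mode
    raise ValueError(f"unknown decode_mode: {decode_mode!r}")
```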

License

Same as the parent project. See the repository root for details.
