Custom ComfyUI nodes for LLaDA 2.0-Uni — a unified multimodal diffusion language model supporting text-to-image generation, image understanding (VQA), and image editing.
⚠️ These nodes depend on the `encoder/` and `decoder/` modules in the project root. Do not copy `apps/comfyui` in isolation: the full repository must be present and the relative path `apps/comfyui` must be preserved.
# 1. Clone the full project
git clone https://github.com/inclusionAI/LLaDA2.0-Uni.git
# 2. Symlink into ComfyUI's custom_nodes
cd /path/to/ComfyUI/custom_nodes
ln -s /path/to/LLaDA2.0-Uni/apps/comfyui ./LLaDA2Uni
bash /path/to/LLaDA2.0-Uni/apps/comfyui/install.sh /path/to/ComfyUI
pip install -r apps/comfyui/requirements.txt
pip install flash-attn --no-build-isolation # optional, recommended

In the Loader node, set the model path to either a HuggingFace repo ID or a local directory:
HuggingFace (auto-download):
inclusionAI/LLaDA2.0-Uni
Local path:
/path/to/LLaDA2.0-Uni
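
The choice between the two forms can be sketched as follows. This is a minimal illustration, not the Loader node's actual logic: `classify_model_path` is a hypothetical helper, and the rule it applies (an existing directory is local, a string shaped like `owner/name` is a repo ID) is an assumption.

```python
from pathlib import Path


def classify_model_path(model_path: str) -> str:
    """Return "local" or "hub" for a model path string (hypothetical helper)."""
    # An existing directory on disk is treated as a local checkpoint.
    if Path(model_path).expanduser().is_dir():
        return "local"
    # Otherwise accept only strings shaped like "owner/name" as repo IDs.
    parts = model_path.split("/")
    looks_like_repo_id = (
        len(parts) == 2
        and all(parts)
        and not model_path.startswith((".", "~", "/"))
    )
    if looks_like_repo_id:
        return "hub"
    raise ValueError(f"not a directory and not a repo ID: {model_path!r}")
```

Under these assumptions, `inclusionAI/LLaDA2.0-Uni` is classified as a repo ID unless a directory with that relative path happens to exist, and a mistyped local path raises an error instead of triggering a download attempt.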
Expected directory layout:
LLaDA2.0-Uni/
├── config.json # LLM config
├── model-*.safetensors # LLM weights
├── tokenizer.json
├── decoder/
│ ├── config.json
│ └── model.safetensors # diffusion decoder
├── decoder-turbo/
│ ├── config.json
│ └── model.safetensors # turbo decoder (8-step)
├── vae/
│ └── diffusion_pytorch_model.safetensors
└── image_tokenizer/
    ├── config.json
    ├── preprocessor_config.json
    ├── model.safetensors # SigLIP-VQ weights
    └── sigvq_embedding.pt
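
Before pointing the Loader at a local directory, it can help to check the layout. The sketch below is not part of the project: `missing_entries` is a hypothetical helper, the file list is copied from the tree above, and it assumes the last four files live inside `image_tokenizer/`.

```python
from pathlib import Path

# Relative paths copied from the expected layout; "model-*.safetensors" is a
# glob pattern because the LLM weights may be split into several shards.
EXPECTED = [
    "config.json",
    "model-*.safetensors",
    "tokenizer.json",
    "decoder/config.json",
    "decoder/model.safetensors",
    "decoder-turbo/config.json",
    "decoder-turbo/model.safetensors",
    "vae/diffusion_pytorch_model.safetensors",
    "image_tokenizer/config.json",
    "image_tokenizer/preprocessor_config.json",
    "image_tokenizer/model.safetensors",
    "image_tokenizer/sigvq_embedding.pt",
]


def missing_entries(model_dir):
    """Return the expected patterns that match nothing under model_dir."""
    root = Path(model_dir)
    return [pattern for pattern in EXPECTED if not any(root.glob(pattern))]
```

An empty list means every expected file was found; otherwise the returned patterns name what is missing.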
| Node | Description |
|---|---|
| LLaDA2.0_Uni Loader | Load the model (Flash Attention / SDPA, optional CPU offload) |
| LLaDA2.0_Uni Text-to-Image | Generate VQ image tokens from a text prompt (supports thinking mode) |
| LLaDA2.0_Uni Image Understanding | Visual question answering |
| LLaDA2.0_Uni Image Editing | Edit an image with a text instruction |
| LLaDA2.0_Uni Token Decoder | Decode VQ tokens to pixels (turbo or normal mode) |
| LLaDA2.0_Uni Unload Model | Manually free VRAM |
Loader → Text-to-Image → Token Decoder → Preview Image
Load Image + Loader → Image Understanding → Show Text
Load Image + Loader → Image Editing → Token Decoder → Preview Image

Loader:

- `model_path`: HuggingFace repo ID or local directory
- `attention`: `flash_attn` (recommended) or `sdpa`
- `dtype`: `bf16` (recommended) or `fp8`
- `offload`: enable CPU offload for limited VRAM
- `device`: `cuda` or `cpu`


Text-to-Image:

- `prompt`: text description
- `width` / `height`: output resolution
- `steps`: LLM denoising steps (8–32)
- `cfg_scale`: classifier-free guidance scale
- `mode`: `standard` or `thinking`
- `seed`: random seed (`-1` = random)
- `block_length`: block size for block-wise denoising

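
The `seed` and `steps` conventions for Text-to-Image can be sketched in a few lines. This is an illustration only: `resolve_seed`, `steps_in_documented_range`, and the `MAX_SEED` bound are hypothetical, and the node itself may handle these values differently.

```python
import random

MAX_SEED = 2**32 - 1  # assumed upper bound; the node's real range may differ


def resolve_seed(seed: int) -> int:
    """Map -1 to a fresh random seed; pass non-negative seeds through."""
    if seed == -1:
        return random.randint(0, MAX_SEED)
    if seed < 0:
        raise ValueError("seed must be -1 or a non-negative integer")
    return seed


def steps_in_documented_range(steps: int, low: int = 8, high: int = 32) -> bool:
    """Report whether steps falls inside the documented 8 to 32 range."""
    return low <= steps <= high
```

Logging the resolved seed alongside the output makes a run reproducible even when `-1` was requested.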

Token Decoder:

- `decode_mode`: `decoder-turbo` (fast, 8 steps) or `normal` (50 steps)
- `decoder_steps`: number of steps when using `normal` mode
- `resolution_multiplier`: upscale factor (typically `2`)
- `unload_after`: release decoder VRAM after decoding (set `False` to keep cached for faster repeated decodes)

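
How `decode_mode` and `decoder_steps` interact can be sketched as below. The helper `effective_decoder_steps` is hypothetical, and it assumes that turbo mode always runs 8 steps and ignores `decoder_steps`, which the parameter descriptions suggest but do not state outright.

```python
TURBO_STEPS = 8  # from the "decoder-turbo (fast, 8 steps)" description


def effective_decoder_steps(decode_mode: str, decoder_steps: int = 50) -> int:
    """Return the number of decoder steps a given configuration would run."""
    if decode_mode == "decoder-turbo":
        # Assumption: the turbo decoder uses a fixed schedule.
        return TURBO_STEPS
    if decode_mode == "normal":
        if decoder_steps < 1:
            raise ValueError("decoder_steps must be at least 1")
        return decoder_steps
    raise ValueError(f"unknown decode_mode: {decode_mode!r}")
```

Under this reading, changing `decoder_steps` only affects speed and quality when `decode_mode` is `normal`.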
License: same as the parent project. See the repository root for details.