
[Roadmap] Diffusion LLMs (2026 S1) #14199

@ClawSeven

Description

Checklist

Motivation

Earlier this year, LLaDA released the first diffusion LLM (dLLM), immediately capturing significant attention from both the academic and industrial communities. But there was no production-ready dLLM serving engine.

We plan to implement the most performant, production-ready dLLM framework in SGLang and make dLLMs robust!

Features

For RL

VL-dLLM

  • Initial multi-modal LLM implementation @btw616

More supported models

More Hardware

More Parallelism

  • Tensor parallelism
  • Expert parallelism
  • Data parallelism (with DPA)
  • Context parallelism
  • Pipeline parallelism

Kernel Optimization for dLLM

More disaggregation

Prefill/decode (PD) disaggregation is not suitable for dLLMs, since iterative denoising refines the whole sequence rather than appending one token at a time, but attention-FFN disaggregation (AFD) might be a viable option.

More Tests

  • Small unit tests for specific functions
  • Nightly unit tests for E2E accuracy and throughput testing

Better streaming output

  • Support diffusion-style streaming output (like Mercury)
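To illustrate the idea (this is a toy sketch, not SGLang's or Mercury's actual API): a dLLM unmasks tokens over denoising iterations, so diffusion-style streaming emits a whole block once all of its tokens have been committed, instead of one token per step as in autoregressive streaming. The `denoise_step` and `stream_blocks` names below are hypothetical.

```python
# Toy sketch of block-wise diffusion streaming (illustrative only).
from typing import Iterator, List

MASK = "<mask>"

def denoise_step(block: List[str], final: List[str], step: int) -> List[str]:
    # Stand-in for a real denoiser: commit one more token per iteration.
    return [final[i] if i <= step else MASK for i in range(len(block))]

def stream_blocks(final_tokens: List[str], block_size: int) -> Iterator[List[str]]:
    # Decode block by block (semi-autoregressive style); yield each block
    # only once every token in it has been unmasked.
    for start in range(0, len(final_tokens), block_size):
        target = final_tokens[start:start + block_size]
        block = [MASK] * len(target)
        step = 0
        while MASK in block:
            block = denoise_step(block, target, step)
            step += 1
        yield block

if __name__ == "__main__":
    tokens = ["Diffusion", "LLMs", "decode", "in", "parallel", "blocks"]
    for block in stream_blocks(tokens, block_size=2):
        print(" ".join(block))
```

The key difference from autoregressive streaming is the unit of emission: the client receives chunks of `block_size` tokens as each block converges, rather than a steady one-token-per-step stream.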

RFC

#12766

Related resources
