Merged
2 changes: 1 addition & 1 deletion docs/source/commands/trtllm-serve/trtllm-serve.rst
@@ -218,7 +218,7 @@ Visual Generation Serving
``trtllm-serve`` supports diffusion-based visual generation models (FLUX.1, FLUX.2, Wan2.1, Wan2.2) for image and video generation. When a diffusion model directory is provided (detected by the presence of ``model_index.json``), the server automatically launches in visual generation mode with dedicated endpoints.

.. note::
-   VisualGen is in **prototype** stage. APIs, supported models, and optimization options are actively evolving and may change in future releases.
+   VisualGen is in **beta** stage. APIs, supported models, and optimization options are actively evolving and may change in future releases.

.. code-block:: bash

29 changes: 28 additions & 1 deletion docs/source/models/supported-models.md
@@ -87,4 +87,31 @@ Note:

# Visual Generation Models

-For diffusion-based image and video generation models, see the [Visual Generation](./visual-generation.md) documentation.
+TensorRT-LLM provides beta support for diffusion-based image and video generation.
+For full documentation, see the [Visual Generation](./visual-generation.md) page.

## Supported Models

| HuggingFace Model ID | Tasks |
|---|---|
| `black-forest-labs/FLUX.1-dev` | Text-to-Image |
| `black-forest-labs/FLUX.2-dev` | Text-to-Image |
| `Wan-AI/Wan2.1-T2V-1.3B-Diffusers` | Text-to-Video |
| `Wan-AI/Wan2.1-T2V-14B-Diffusers` | Text-to-Video |
| `Wan-AI/Wan2.1-I2V-14B-480P-Diffusers` | Image-to-Video |
| `Wan-AI/Wan2.1-I2V-14B-720P-Diffusers` | Image-to-Video |
| `Wan-AI/Wan2.2-T2V-A14B-Diffusers` | Text-to-Video |
| `Wan-AI/Wan2.2-I2V-A14B-Diffusers` | Image-to-Video |
| `Lightricks/LTX-2` | Text-to-Video (with Audio), Image-to-Video (with Audio) |

## Feature Matrix

| Model | TeaCache | CFG Parallelism | Ulysses Parallelism | Parallel VAE | CUDA Graph | torch.compile | trtllm-serve |
|---|---|---|---|---|---|---|---|
| **FLUX.1** | Yes | No [^vg1] | Yes | No | Yes | Yes | Yes |
| **FLUX.2** | Yes | No [^vg1] | Yes | No | Yes | Yes | Yes |
| **Wan 2.1** | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| **Wan 2.2** | No | Yes | Yes | Yes | Yes | Yes | Yes |
| **LTX-2** | No | Yes | Yes | No | No | Yes | Yes |

[^vg1]: FLUX models use embedded guidance and do not have a separate negative prompt path, so CFG parallelism is not applicable.
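
To see why CFG creates a parallelizable second pass in the first place: classifier-free guidance runs the denoiser twice per step, once conditioned on the prompt and once unconditioned (or on a negative prompt), then blends the two noise predictions. The two passes are independent, which is what CFG parallelism exploits by placing them on separate devices; models with embedded guidance collapse this into a single conditioned pass, leaving nothing to parallelize. A minimal numpy sketch of the blend step (illustrative only, not the VisualGen API):

```python
import numpy as np

def cfg_blend(cond_pred, uncond_pred, guidance_scale):
    """Classifier-free guidance: extrapolate from the unconditional
    prediction toward (and past) the conditional one."""
    return uncond_pred + guidance_scale * (cond_pred - uncond_pred)

# Toy noise predictions for one denoising step.
cond = np.array([1.0, 2.0, 3.0])
uncond = np.array([0.5, 1.0, 1.5])

# scale = 1.0 recovers the conditional prediction exactly.
assert np.allclose(cfg_blend(cond, uncond, 1.0), cond)

# cond and uncond come from two independent denoiser calls, so CFG
# parallelism can compute them concurrently on different GPUs and
# only synchronize for this cheap blend.
guided = cfg_blend(cond, uncond, 2.0)
```
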
11 changes: 5 additions & 6 deletions docs/source/models/visual-generation.md
@@ -1,7 +1,7 @@
-# Visual Generation (Prototype)
+# Visual Generation (Beta)

```{note}
-This feature is in **prototype** stage. APIs, supported models, and optimization options are
+This feature is in **beta** stage. APIs, supported models, and optimization options are
actively evolving and may change in future releases.
```

@@ -30,7 +30,7 @@ TensorRT-LLM **VisualGen** provides a unified inference stack for diffusion mode
| `Wan-AI/Wan2.1-I2V-14B-720P-Diffusers` | Image-to-Video |
| `Wan-AI/Wan2.2-T2V-A14B-Diffusers` | Text-to-Video |
| `Wan-AI/Wan2.2-I2V-A14B-Diffusers` | Image-to-Video |
-| `Lightricks/LTX-Video` | Text-to-Video (with Audio), Image-to-Video (with Audio) |
+| `Lightricks/LTX-2` | Text-to-Video (with Audio), Image-to-Video (with Audio) |

Models are auto-detected from the checkpoint directory. Diffusers-format models are detected via `model_index.json`; LTX-2 monolithic safetensors checkpoints are detected via embedded metadata. The `AutoPipeline` registry selects the appropriate pipeline class automatically.

@@ -50,9 +50,8 @@ Models are auto-detected from the checkpoint directory. Diffusers-format models

Here is a simple example to generate a video with Wan 2.1:

-```{literalinclude} ../../../examples/visual_gen/quickstart_example.py
-:language: python
-:linenos:
+```bash
+python examples/visual_gen/quickstart_example.py
```

To learn more about VisualGen, see [`examples/visual_gen/`](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/visual_gen) for more examples including text-to-image, image-to-video, and batch generation.
22 changes: 11 additions & 11 deletions security_scanning/docs/poetry.lock
