MLIR-fusion seems to corrupt CUDA graph replay in Gemma4 e2b #14272

@bmarimuthu-nv

MLIR elementwise kernels currently corrupt piecewise CUDA graph replay for Gemma4 E2B.

Steps to reproduce: in examples/auto_deploy/model_registry/configs/gemma4_e2b.yaml, enable mlir_elementwise_fusion as shown below, then run the model end to end.

  mlir_elementwise_fusion:
    enabled: true
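To narrow down where replay goes wrong, it may help to diff the outputs of a run with the fusion disabled against a run with it enabled. The sketch below is only an illustration: `first_mismatch` is a hypothetical helper, not part of the repository, and it assumes the logits (or any other per-element outputs) from both runs have already been dumped to flat lists of floats. The tolerances are arbitrary placeholders.

```python
import math
from typing import Sequence


def first_mismatch(
    reference: Sequence[float],
    candidate: Sequence[float],
    atol: float = 1e-3,  # assumed tolerance, tune for the dtype in use
    rtol: float = 1e-3,  # assumed tolerance, tune for the dtype in use
) -> int:
    """Return the index of the first diverging element, or -1 if none.

    `reference` is the output with mlir_elementwise_fusion disabled,
    `candidate` the output with it enabled (hypothetical naming).
    """
    for i, (r, c) in enumerate(zip(reference, candidate)):
        # A NaN on either side is treated as a mismatch.
        if math.isnan(r) or math.isnan(c):
            return i
        if abs(r - c) > atol + rtol * abs(r):
            return i
    # All shared positions agree; a length difference is still a mismatch.
    if len(reference) != len(candidate):
        return min(len(reference), len(candidate))
    return -1


# Made-up values, purely to show the call.
baseline = [0.12, -1.5, 3.0]
fused = [0.12, -1.5, 7.25]
print(first_mismatch(baseline, fused))  # prints 2
```

A result of -1 means the two runs agree within tolerance. Any other index points at the first element worth inspecting.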


Status: Backlog
