MLIR elementwise kernels currently corrupt piecewise CUDA graph replay for Gemma4 E2B.
Steps to reproduce:
In examples/auto_deploy/model_registry/configs/gemma4_e2b.yaml, enable mlir_elementwise_fusion with the setting below, then run the model end to end (e2e):
mlir_elementwise_fusion:
  enabled: true