🐛 Bug
We are seeing mixed-precision accuracy regressions in several large models, including Mixtral and Llama 4.
We have narrowed the regression down to change #9663.
To Reproduce
Steps to reproduce the behavior: TBD
Expected behavior
No accuracy regression compared to torch_xla 2.8.
Environment
- Reproducible on XLA backend [CPU/TPU]: Neuron
- torch_xla version: 2.9
Additional context