When the MoE size is extremely large, the Grouped GEMM crashes with a core dump. I suspect that some size or offset value exceeds the range representable by int32.
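To illustrate the suspicion, here is a rough sketch of how a flattened element count could wrap around if it were stored in a 32-bit signed integer. The function names, buffer names, and the example shapes are all hypothetical and are not taken from the actual Grouped GEMM code; this only shows the arithmetic, not where the real overflow (if any) happens.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Simulate two's-complement wraparound of a 32-bit signed integer."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


def check_sizes(num_experts: int, tokens_per_expert: int,
                hidden: int, intermediate: int) -> dict:
    """Return, per buffer, (true element count, overflows int32?, wrapped value).

    The buffer layout here is an assumption made for illustration only.
    """
    counts = {
        "weights": num_experts * hidden * intermediate,
        "activations": num_experts * tokens_per_expert * hidden,
    }
    return {
        name: (n, n > INT32_MAX, wrap_int32(n))
        for name, n in counts.items()
    }


# Hypothetical large MoE configuration.
report = check_sizes(num_experts=256, tokens_per_expert=4096,
                     hidden=7168, intermediate=2048)
for name, (true_count, overflows, wrapped) in report.items():
    print(f"{name}: true={true_count} overflows_int32={overflows} as_int32={wrapped}")
```

With these made-up shapes the weight buffer has 3,758,096,384 elements, which an int32 would represent as -536,870,912. A negative offset like that would lead to an out-of-bounds access, which would be consistent with a core dump. If this is the cause, computing the offsets in int64 should avoid it.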