It seems that the tuner needs to catch up with IREE's logic change. The relevant PRs (as far as I know):

- [[Codegen][GPU] Adding new heuristics to take all dimensions into account when distributing tiles](https://github.com/iree-org/iree/pull/21803)
- [[VectorDistribute] Allow distributing subgroups on multiple m dimensions](https://github.com/iree-org/iree/pull/22000)
- [[VectorDistribute] Use subgroup_basis instead of subgroup_m/n_count](https://github.com/iree-org/iree/pull/21912)