Thanks to the authors for the great work!
As far as I can tell, commit a84d634 only adds a flag, reshard_for_sequence_parallel, to the Row Major Linear. The forward reduce-scatter (RS) and backward all-gather (AG) of output_ = reduce_scatter_to_sequence_parallel_region(output_parallel) have been repackaged into backward().
I'm not sure why this change makes the layer support SP (sequence parallelism), or why SP would not be supported without it.
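
To make clear which RS/AG pairing I mean, here is a minimal single-process sketch. It assumes plain Python lists stand in for tensors and for ranks. The helpers reduce_scatter and all_gather are hypothetical stand-ins written for illustration, not the library's functions, and the sketch does not reproduce what the commit does internally.

```python
# Illustrative simulation only: each inner list is one rank's data.
# These helpers are hypothetical and are NOT the library's implementation.

def reduce_scatter(per_rank):
    """Sum the ranks' full-length partial outputs elementwise, then give
    rank r the r-th contiguous chunk along the sequence dimension."""
    world = len(per_rank)
    seq = len(per_rank[0])
    assert seq % world == 0, "sequence length must divide evenly across ranks"
    chunk = seq // world
    total = [sum(x[i] for x in per_rank) for i in range(seq)]
    return [total[r * chunk:(r + 1) * chunk] for r in range(world)]


def all_gather(per_rank_chunks):
    """Concatenate every rank's chunk; every rank receives the full result."""
    full = [v for chunk in per_rank_chunks for v in chunk]
    return [list(full) for _ in per_rank_chunks]


# Forward: two ranks each hold a partial output over the full sequence.
partials = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
shards = reduce_scatter(partials)

# Backward: each rank holds the gradient for its own sequence shard only.
# Because shard r is a slice of the sum over ranks, the gradient with respect
# to every rank's partial output is the concatenation of all shard gradients,
# which is exactly an all-gather.
grad_shards = [[0.1, 0.2], [0.3, 0.4]]
grad_partials = all_gather(grad_shards)
```

In this toy setup, the forward pass leaves each rank with its slice of the summed output, and the backward pass hands every rank the same full-length gradient. That is the forward-RS/backward-AG behavior I am asking about.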