
Conversation

@jikunshang
Collaborator


Purpose

Add contiguous() handling inside the rms_norm kernel wrapper. vLLM added the same change in its kernel in vllm-project/vllm#28103 and removed the contiguous() call at the Python level; without the matching in-kernel handling here, that removal may introduce accuracy issues.
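
For reference, a minimal sketch of the in-kernel guard being added, matching the diff excerpts quoted in the review comments below (the function signature is a hypothetical stand-in for the actual op entry point in this repo):

    #include <torch/torch.h>

    // Hypothetical op entry point; the real signature may differ.
    void rms_norm(torch::Tensor& out, torch::Tensor& input,
                  const torch::Tensor& weight, double epsilon) {
      // New: enforce last-dimension contiguity here instead of in Python.
      if (input.stride(-1) != 1) {
        input = input.contiguous();  // copies only when the layout requires it
      }
      TORCH_CHECK(input.stride(-1) == 1);
      TORCH_CHECK(weight.is_contiguous());
      // ... existing dispatch to the RMSNorm kernel ...
    }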

Test Plan

Test Result

(Optional) Documentation Update


Copilot AI review requested due to automatic review settings December 12, 2025 05:12

Copilot AI left a comment


Pull request overview

This PR adds input tensor contiguity checks and enforcement directly within the RMSNorm kernel implementation, aligning with a similar change made in vLLM. The change moves contiguity handling from the Python level to the C++ level to prevent potential accuracy issues.

Key Changes:

  • Adds contiguity validation for output and weight tensors
  • Implements automatic conversion to a contiguous layout for the input tensor when its last-dimension stride is not 1
  • Adds runtime checks to ensure proper tensor layouts before kernel execution


if (input.stride(-1) != 1) {
  input = input.contiguous();
}
TORCH_CHECK(input.stride(-1) == 1);

Copilot AI Dec 12, 2025


The check at line 211 is redundant. After the contiguous() call on line 209, the stride(-1) is guaranteed to be 1. This assertion will never fail and can be removed.

Suggested change
-TORCH_CHECK(input.stride(-1) == 1);

Comment on lines +208 to 215
if (input.stride(-1) != 1) {
  input = input.contiguous();
}
TORCH_CHECK(input.stride(-1) == 1);
TORCH_CHECK(weight.is_contiguous());
VLLM_DISPATCH_FLOATING_TYPES(
    input.scalar_type(), "call_rms_norm_kernel", [&] {
      vllm::call_rms_norm_kernel<scalar_t>(out, input, weight, epsilon);

Copilot AI Dec 12, 2025


Silently modifying the input parameter may lead to unexpected behavior for callers. Consider making input non-const to signal that this modification is possible, documenting the behavior clearly, or making a copy to avoid mutating the caller's reference.

Suggested change
-if (input.stride(-1) != 1) {
-  input = input.contiguous();
-}
-TORCH_CHECK(input.stride(-1) == 1);
-TORCH_CHECK(weight.is_contiguous());
-VLLM_DISPATCH_FLOATING_TYPES(
-    input.scalar_type(), "call_rms_norm_kernel", [&] {
-      vllm::call_rms_norm_kernel<scalar_t>(out, input, weight, epsilon);
+auto input_ = input;
+if (input_.stride(-1) != 1) {
+  input_ = input_.contiguous();
+}
+TORCH_CHECK(input_.stride(-1) == 1);
+TORCH_CHECK(weight.is_contiguous());
+VLLM_DISPATCH_FLOATING_TYPES(
+    input_.scalar_type(), "call_rms_norm_kernel", [&] {
+      vllm::call_rms_norm_kernel<scalar_t>(out, input_, weight, epsilon);

@mayuyuace
Collaborator

CUDA may hit the same problem.
If possible, the rms_norm kernel could be extended in the future to handle input whose last dimension is not contiguous.
Anyway, this PR LGTM.
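
A minimal sketch of what such stride-aware handling could look like, written as a CUDA-style kernel (this is not part of the PR; the kernel name, signature, and fixed 256-thread block are assumptions for illustration):

    #include <cstdint>

    // Hypothetical stride-aware RMSNorm kernel: one block per row, reading
    // the input through an explicit last-dimension stride instead of
    // requiring a contiguous() copy first.
    template <typename scalar_t>
    __global__ void rms_norm_strided_kernel(
        scalar_t* __restrict__ out,           // [num_tokens, hidden_size], contiguous
        const scalar_t* __restrict__ input,   // [num_tokens, hidden_size], strided
        const scalar_t* __restrict__ weight,  // [hidden_size], contiguous
        int64_t row_stride, int64_t col_stride,  // input.stride(0), input.stride(-1)
        int hidden_size, float epsilon) {
      __shared__ float red[256];  // assumes blockDim.x == 256 (a power of two)
      const scalar_t* row = input + blockIdx.x * row_stride;

      // Per-thread partial sum of squares, stepping by col_stride.
      float sumsq = 0.0f;
      for (int i = threadIdx.x; i < hidden_size; i += blockDim.x) {
        float x = static_cast<float>(row[i * col_stride]);
        sumsq += x * x;
      }

      // Block-wide tree reduction of the sum of squares.
      red[threadIdx.x] = sumsq;
      __syncthreads();
      for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (threadIdx.x < s) red[threadIdx.x] += red[threadIdx.x + s];
        __syncthreads();
      }
      const float inv_rms = rsqrtf(red[0] / hidden_size + epsilon);

      // Normalize and scale; the output is written contiguously.
      for (int i = threadIdx.x; i < hidden_size; i += blockDim.x) {
        float x = static_cast<float>(row[i * col_stride]);
        out[blockIdx.x * (int64_t)hidden_size + i] =
            static_cast<scalar_t>(x * inv_rms * static_cast<float>(weight[i]));
      }
    }

The host wrapper would then pass input.stride(0) and input.stride(-1) instead of calling contiguous(), trading the extra copy for strided (possibly uncoalesced) global loads.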

Collaborator

@baodii baodii left a comment


LGTM

@jikunshang jikunshang merged commit dd579a1 into vllm-project:main Dec 16, 2025
4 checks passed
