Skip to content

Fix window shift in pangu, fengwu#1492

Merged
pzharrington merged 2 commits intoNVIDIA:mainfrom
pzharrington:lon-window-shift
Mar 12, 2026
Merged

Fix window shift in pangu, fengwu#1492
pzharrington merged 2 commits intoNVIDIA:mainfrom
pzharrington:lon-window-shift

Conversation

@pzharrington
Copy link
Collaborator

@pzharrington pzharrington commented Mar 11, 2026

PhysicsNeMo Pull Request

Description

Fixes a bug in pangu and fengwu attention window shift operations that was rolling along longitudes according to shift_lat instead of shift_lon. Addresses #1399

Checklist

Dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

@pzharrington pzharrington self-assigned this Mar 11, 2026
@greptile-apps
Copy link
Contributor

greptile-apps bot commented Mar 11, 2026

Greptile Summary

This PR fixes a copy-paste bug in the Swin-Transformer-based attention window shift for both Transformer3DBlock (Pangu) and Transformer2DBlock (FengWu). In the forward pass, torch.roll was rolling the longitude dimension (dims=3 / dims=2) using -shift_lat instead of -shift_lon, which only caused incorrect behavior when shift_lat != shift_lon (i.e., asymmetric grid configurations). The reverse roll (un-shift after attention) was already correct in both classes. The fix is minimal and correct.

  • Fixed Transformer3DBlock.forward: shifts=(-shift_pl, -shift_lat, -shift_lat)shifts=(-shift_pl, -shift_lat, -shift_lon)
  • Fixed Transformer2DBlock.forward: shifts=(-shift_lat, -shift_lat)shifts=(-shift_lat, -shift_lon)
  • Updated binary reference outputs (pangu_output.pth, fengwu_output.pth) to match corrected model behavior
  • Added CHANGELOG entry

Important Files Changed

Filename Overview
physicsnemo/nn/module/transformer_layers.py Fixed two bugs where the longitude roll dimension incorrectly used -shift_lat instead of -shift_lon in both Transformer3DBlock and Transformer2DBlock forward passes.
CHANGELOG.md Added changelog entry for the Pangu/FengWu attention window shift bug fix.
test/models/fengwu/data/fengwu_output.pth Binary test reference data updated to reflect corrected window shift behavior.
test/models/pangu/data/pangu_output.pth Binary test reference data updated to reflect corrected window shift behavior.

Last reviewed commit: a01d9c4

@pzharrington
Copy link
Collaborator Author

/blossom-ci

@pzharrington pzharrington enabled auto-merge March 11, 2026 23:31
@pzharrington pzharrington added this pull request to the merge queue Mar 12, 2026
Merged via the queue into NVIDIA:main with commit 219aca3 Mar 12, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants