[release/2.11] Fix Windows access violation in MIOpen CTC loss dispatch #3161
Merged
jeffdaily merged 1 commit into release/2.11 on Apr 27, 2026
Jenkins build for commit 2529af76adf7e0fe36ddfe12cca198d44b3e1aba finished as FAILURE. Detected error during PyTorch building:
…ytorch#178284)

### Summary

Add the missing `#include <ATen/ops/_use_miopen_ctc_loss_native.h>` to `LossCTC_miopen.cpp`. Without this include, the `_use_miopen_ctc_loss` and `_use_miopen_ctc_loss_tensor` functions are defined without DLL linkage attributes on Windows, causing an unresolved Import Address Table (IAT) entry that crashes with an access violation (`0xC0000005`) at the `torch_hip.dll` base address when CTC loss is called with CUDA tensors.

### Problem

On Windows ROCm builds, calling `torch.nn.functional.ctc_loss` with CUDA tensors crashes with a fatal access violation:

    Windows fatal exception: access violation
    Exception Code: 0xC0000005
    torch_hip.dll + 0x0 byte(s)

The crash occurs in `test_CTCLoss_critical_target_len` and any other test that invokes `ctc_loss` with `cudnn.flags(enabled=True)` on CUDA tensors.

### Root Cause

The issue is a Windows DLL linkage mismatch between the caller and the callee of `at::native::_use_miopen_ctc_loss`.

The caller (`RegisterCUDA_0.cpp`, auto-generated, compiled into `torch_hip.dll`):

- The generated CUDA dispatch wrapper includes `<ATen/ops/_use_miopen_ctc_loss_native.h>`, which declares the function with `TORCH_API`.
- When building `torch_hip.dll`, `TORCH_API` expands to `__declspec(dllimport)`.
- MSVC generates an indirect call through the Import Address Table: `call [__imp_?_use_miopen_ctc_loss@native@at@@...]`.

The callee (`LossCTC_miopen.cpp`, compiled into `torch_hip.dll`):

- The implementation file does NOT include `<ATen/ops/_use_miopen_ctc_loss_native.h>`.
- The functions are defined without any DLL linkage attribute, just plain `bool _use_miopen_ctc_loss(...)`.
- The compiler does not generate an `__imp_` thunk for these definitions.

### The linker mismatch

When linking `torch_hip.dll`, the linker needs to resolve the `__imp_?_use_miopen_ctc_loss@native@at@@...` symbol (referenced by `RegisterCUDA_0.cpp.obj`). This is a different symbol from `?_use_miopen_ctc_loss@native@at@@...` (provided by `LossCTC_miopen.cpp.obj`). Since no import library (`.lib`) exports this function, the `__imp_` IAT entry remains unresolved at RVA=0. At runtime, the indirect call jumps to `DLL_base + 0x0` (the PE header), which is not executable code, causing the access violation.

### Test plan

- `test_CTCLoss_critical_target_len` passes on Windows
- `test_CTCLoss_cudnn_cuda` no longer crashes on Windows
- Linux ROCm builds are unaffected

Pull Request resolved: pytorch#178284
Approved by: https://github.com/Skylion007
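The symbol mismatch described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `SymbolTable`, `resolve_rva`, `call_target`, and the symbol name `demo_use_ctc` are hypothetical and exist purely for illustration. It does not reproduce how the MSVC linker or the Windows loader work internally; it only mirrors the described outcome, where a lookup for the `__imp_`-prefixed name finds nothing because only the plain name is defined, so the call target collapses to the image base.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for "symbols provided by the object files linked into a DLL":
// symbol name -> relative virtual address (RVA). Hypothetical, for illustration.
using SymbolTable = std::map<std::string, std::uint32_t>;

// Look up a symbol by its exact name. Returns 0 when nothing defines it,
// modelling the "IAT entry remains unresolved at RVA=0" case from the text.
std::uint32_t resolve_rva(const SymbolTable& defined, const std::string& name) {
    auto it = defined.find(name);
    return it == defined.end() ? 0u : it->second;
}

// Address an indirect call would jump to: image base plus the resolved RVA.
// With an RVA of 0 this is the image base itself, which is not code.
std::uint64_t call_target(std::uint64_t image_base, std::uint32_t rva) {
    return image_base + rva;
}
```

In this model, a table that defines only `demo_use_ctc` resolves that name to its RVA, while a lookup for `__imp_demo_use_ctc` yields 0, so `call_target(base, 0)` equals `base`. That corresponds to the `torch_hip.dll + 0x0` frame in the crash report.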
Author
This appears to be the same Jenkins error as on #3160, and it is unrelated to the changes in this PR.
jeffdaily approved these changes on Apr 27, 2026
Cherry pick of pytorch#178284
Fixes ROCm/TheRock#3987