Illegal Instruction Caused by grid_sample Under Windows #152385

Open
@ericspod

Description

🐛 Describe the bug

On Windows 10 with Python 3.12.9, PyTorch 2.7.0+cu118, and CUDA 12.2, the following code triggers an "illegal instruction" error and crashes the process immediately:

import torch

# float64 inputs trigger the crash; the same call with float32 inputs succeeds
src = torch.rand((1, 1, 128, 64), dtype=torch.float64)
grid = torch.rand((1, 256, 256, 2), dtype=torch.float64)
dst = torch.nn.functional.grid_sample(
    input=src.contiguous(),
    grid=grid,
    mode="bilinear",
    padding_mode="border",
    align_corners=False
)

This is specific to float64 tensors: when both src and grid are float32, the function executes correctly.
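
Since float32 inputs do not trigger the crash, one possible stopgap is to run the sampling in float32 and cast the result back. The sketch below assumes the reduced precision is acceptable for the caller; `grid_sample_via_float32` is a hypothetical helper name, not part of PyTorch, and this is a workaround rather than a fix for the underlying bug.

```python
import torch
import torch.nn.functional as F


def grid_sample_via_float32(src, grid, **kwargs):
    """Hypothetical helper: call grid_sample on float32 copies of the inputs,
    then cast the result back to the dtype of `src`."""
    out = F.grid_sample(src.to(torch.float32), grid.to(torch.float32), **kwargs)
    return out.to(src.dtype)


src = torch.rand((1, 1, 128, 64), dtype=torch.float64)
grid = torch.rand((1, 256, 256, 2), dtype=torch.float64)
dst = grid_sample_via_float32(
    src, grid, mode="bilinear", padding_mode="border", align_corners=False
)
print(dst.dtype, tuple(dst.shape))
```

The output keeps the float64 dtype of the input, but its values only carry float32 precision, so results may differ slightly from a true float64 computation.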

This issue in the current version of PyTorch is the source of CI/CD failures on GitHub Windows runners, as seen in this PR. The tests fail with PyTorch 2.7 specifically; previous versions do not exhibit the issue.
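
Because the crash kills the interpreter outright, a test runner may die without reporting which test failed. A minimal sketch of one way to isolate it, assuming it is acceptable to spawn a child interpreter: run the reproduction in a subprocess and inspect its return code. The names `REPRO` and `run_repro` are illustrative only.

```python
import subprocess
import sys

# The reproduction from this report, run in a child interpreter so that a
# hard crash does not take down the calling process.
REPRO = """
import torch
import torch.nn.functional as F
src = torch.rand((1, 1, 128, 64), dtype=torch.float64)
grid = torch.rand((1, 256, 256, 2), dtype=torch.float64)
F.grid_sample(src, grid, mode="bilinear", padding_mode="border", align_corners=False)
"""


def run_repro():
    """Run the snippet in a separate Python process and return the result."""
    return subprocess.run(
        [sys.executable, "-c", REPRO], capture_output=True, text=True
    )


result = run_repro()
print("return code:", result.returncode)
```

A return code of zero means the call completed; a nonzero code means the child process crashed or failed for some other reason, such as a missing torch installation, so stderr should be checked before attributing it to this bug.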

Output from /proc/cpuinfo in case any more detail is relevant:

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 25
model           : 33
model name      : AMD Ryzen 9 5900X 12-Core Processor
stepping        : 2
microcode       : 0xA20120A
cpu MHz         : 3700.000
cache size      : 65536 KB
physical id     : 0
siblings        : 24
core id         : 0
cpu cores       : 24
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 17
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid aperfmperf pni pclmuldq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw wdt topoext cpb hw_pstate ibrs ibpb stibp fsgsbase bmi1 avx2 smep bmi2 erms cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves clzero xsaveerptr overflow_recov succor smca
bogomips        : 7400.00
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp hwpstate cpb eff_freq_ro

Versions

Collecting environment information...
PyTorch version: 2.7.0+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 10 Pro (10.0.19045 64-bit)
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.12.9 | packaged by Anaconda, Inc. | (main, Feb 6 2025, 18:49:16) [MSC v.1929 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.19045-SP0
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3090
Nvidia driver version: 536.23
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Name: AMD Ryzen 9 5900X 12-Core Processor
Manufacturer: AuthenticAMD
Family: 107
Architecture: 9
ProcessorType: 3
DeviceID: CPU0
CurrentClockSpeed: 3701
MaxClockSpeed: 3701
L2CacheSize: 6144
L2CacheSpeed: None
Revision: 8450

Versions of relevant libraries:
[pip3] flake8==7.2.0
[pip3] flake8-bugbear==24.2.6
[pip3] flake8-comprehensions==3.16.0
[pip3] mypy==1.11.2
[pip3] mypy_extensions==1.1.0
[pip3] numpy==2.2.5
[pip3] onnx==1.17.0
[pip3] onnx_graphsurgeon==0.5.8
[pip3] pytorch-ignite==0.4.11
[pip3] torch==2.7.0+cu118
[pip3] torchio==0.20.7
[pip3] torchvision==0.22.0
[conda] numpy 2.2.5 pypi_0 pypi
[conda] pytorch-ignite 0.4.11 pypi_0 pypi
[conda] torch 2.7.0+cu118 pypi_0 pypi
[conda] torchio 0.20.7 pypi_0 pypi
[conda] torchvision 0.22.0 pypi_0 pypi

cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @peterjc123 @mszhanyi @skyline75489 @nbcsm @iremyux @Blackhex @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @jerryzh168
