Commit aa3096a

libtorch win-arm64: inject BSD uint typedefs for ATen aarch64 vec headers
Run 6: MAX_JOBS=2 fixed the OOM. win-arm64 compiled cleanly to [563/1448] of torch_cpu, then hit a real compile error:

    vec128_uint_aarch64.h: error: unknown type name 'uint'

PyTorch's aarch64 NEON vec headers use the BSD integer typedefs (uint/ushort/ulong/uchar) that <sys/types.h> supplies on Linux/macOS but MSVC/Windows does not; win-arm64 is a new, under-tested PyTorch target. Inject the typedefs into the ATen vec headers that use them (idempotent marker, survives the cached source tree).
1 parent 8f75d57 commit aa3096a

2 files changed

Lines changed: 16 additions & 5 deletions

TODO.md

Lines changed: 6 additions & 5 deletions
@@ -80,11 +80,12 @@ Green (incl. find_package(Torch) smoke): macOS arm64, macOS x86_64, **macOS univ
 smoke ✅), Linux aarch64, Linux x86_64, Windows x86_64. **7/8 archives green.** Only Windows arm64
 left. The `if: !cancelled()` decoupling worked — universal now validated.
 
-Windows arm64 (run 5): clang-cl arm64 targeting fixed — it compiled cleanly to [433/1448] of
-torch_cpu, then the **windows-11-arm runner was OOM-killed** ("hosted runner lost communication …
-starves it for CPU/Memory"; logs truncated, step stuck in_progress). Emulated x64 clang-cl on big
-ATen TUs at MAX_JOBS=nproc exhausted the 16 GB runner. Fix: `MAX_JOBS=2` on Windows (trade time for
-memory; 6h budget). If it still OOMs, drop to 1 or use native arm64 LLVM (no emulation).
+Windows arm64 progress (each round clears a layer): clone (longpaths) → pip (requirements-build) →
+clang-cl x64→arm64 target → OOM (MAX_JOBS=2) → run 6 reached [563/1448] then a real compile error:
+`vec128_uint_aarch64.h: unknown type name 'uint'`. PyTorch's aarch64 NEON vec headers use the BSD
+`uint`/`ushort`/`ulong`/`uchar` typedefs (from `<sys/types.h>` on Linux/macOS; absent on MSVC) —
+win-arm64 is under-tested upstream. Fix: inject those typedefs into the ATen vec headers that use
+them (idempotent, survives the source cache). If more non-vec TUs hit it, widen the patch scope.
 
 ### Still open (LibTorch)
 - **From-source recipes need CI iteration** (first-pass `build-libtorch.sh`):

engines/libtorch/build-libtorch.sh

Lines changed: 10 additions & 0 deletions
@@ -107,6 +107,16 @@ case "$PLATFORM" in
     # ~30% into torch_cpu — emulated x64 clang-cl on PyTorch's large ATen TUs is memory
     # -hungry. Fewer concurrent compiles trades build time (we have 6h) for not OOM-ing.
     export MAX_JOBS=2
+    # PyTorch's aarch64 NEON vec headers use the BSD integer typedefs (uint/ushort/ulong/
+    # uchar) that <sys/types.h> provides on Linux/macOS but MSVC/Windows does NOT ->
+    # "unknown type name 'uint'" in vec128_uint_aarch64.h. win-arm64 is a new, under-tested
+    # PyTorch target. Inject the typedefs into the ATen vec headers that use them
+    # (idempotent via the marker, so it survives the cached source tree).
+    while IFS= read -r h; do
+      grep -q '__win_uint_fix__' "$h" 2>/dev/null && continue
+      printf '// __win_uint_fix__\ntypedef unsigned int uint;\ntypedef unsigned short ushort;\ntypedef unsigned long ulong;\ntypedef unsigned char uchar;\n' \
+        | cat - "$h" > "$h.__t" && mv "$h.__t" "$h"
+    done < <(grep -rlwE 'uint|ushort|ulong|uchar' "$SRC/aten/src/ATen/cpu/vec" 2>/dev/null || true)
     ;;
   *) echo "ERROR: unknown platform '$PLATFORM'"; exit 1 ;;
 esac
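
The patch picks its target headers with `grep -rlwE`. The `-w` flag matters here: it should match `uint` only as a whole word, so headers that mention only fixed-width names such as `uint32_t` are left alone. The following is a minimal sketch of that selection step, run against a throwaway directory rather than the real `$SRC/aten/src/ATen/cpu/vec` tree. The file names and their contents are hypothetical stand-ins, not taken from PyTorch.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway tree standing in for "$SRC/aten/src/ATen/cpu/vec".
# Both file names below are hypothetical, chosen only for this demo.
vec_dir=$(mktemp -d)
printf 'static inline uint lane(uint x) { return x; }\n' > "$vec_dir/uses_bsd_typedef.h"
printf 'static inline uint32_t lane32(uint32_t x) { return x; }\n' > "$vec_dir/uses_stdint_only.h"

# -r recurse, -l print matching file names only, -w whole-word match,
# -E extended regex (needed for the alternation).
# "|| true" keeps "set -e" from aborting the script when nothing matches.
matches=$(grep -rlwE 'uint|ushort|ulong|uchar' "$vec_dir" || true)

# Count non-empty lines in the result.
match_count=$(printf '%s\n' "$matches" | grep -c . || true)
echo "$matches"
echo "match_count=$match_count"
```

Only `uses_bsd_typedef.h` should be listed, because the characters following `uint` in `uint32_t` are word constituents. Note that a whole-word match also fires on comments, so the real loop may patch a few headers that never use the typedefs in code; that should be harmless, since C++ permits redeclaring a typedef to the same type.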

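The idempotent-marker idea in the hunk above can be sketched in isolation as follows. The function name `inject_uint_typedefs` and the demo header are assumptions made for illustration; the commit itself inlines the logic in the win-arm64 `case` branch and operates on the cached PyTorch source tree.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Same marker string as the commit; the helper name below is hypothetical.
MARKER='__win_uint_fix__'

inject_uint_typedefs() {
  local header=$1
  # Already patched (for example, restored from the source cache): do nothing.
  if grep -q "$MARKER" "$header" 2>/dev/null; then
    return 0
  fi
  # Write marker + typedefs + original content to a temp file, then swap it in.
  {
    printf '// %s\n' "$MARKER"
    printf 'typedef unsigned int uint;\n'
    printf 'typedef unsigned short ushort;\n'
    printf 'typedef unsigned long ulong;\n'
    printf 'typedef unsigned char uchar;\n'
    cat "$header"
  } > "$header.__t"
  mv "$header.__t" "$header"
}

# Demo on a throwaway two-line file standing in for an ATen vec header.
demo_dir=$(mktemp -d)
demo_header="$demo_dir/vec_demo.h"
printf '#pragma once\nstatic inline uint twice(uint x) { return 2 * x; }\n' > "$demo_header"

inject_uint_typedefs "$demo_header"
inject_uint_typedefs "$demo_header"   # second call finds the marker and is a no-op

marker_count=$(grep -c "$MARKER" "$demo_header")
line_count=$(wc -l < "$demo_header" | tr -d ' ')
echo "marker_count=$marker_count line_count=$line_count"
```

Running the helper twice leaves a single marker and seven lines (five injected, two original), which is the property that lets the build step run again over an already-patched cache without stacking duplicate typedef blocks.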