[SHM] Add huge page and NUMA placement support #23697

Merged
benvanik merged 5 commits into main from users/benvanik/base-shm-hugepages on Mar 9, 2026
benvanik (Collaborator) commented Mar 9, 2026

Unify iree_shm_options_t and iree_numa_alloc_options_t into a single placement type used by both SHM and NUMA allocation paths. SHM create functions now accept iree_numa_alloc_options_t* (NULL for defaults); open functions drop options entirely since openers map existing pages whose backing store was determined at creation time.

Platform implementations:

  • Linux: MFD_HUGETLB memfd with probe-mmap validation (MAP_POPULATE),
    MADV_HUGEPAGE for THP, mbind for NUMA. Graceful fallback cascade:
    explicit huge pages -> THP -> normal pages. Retry memfd_create
    without MFD_ALLOW_SEALING for kernel 4.14-4.15 compatibility.
  • Windows: SEC_LARGE_PAGES with silent fallback, CreateFileMappingNumaW
    for NUMA placement. Open paths try FILE_MAP_LARGE_PAGES first to
    support cross-process sharing of large-page sections.
  • macOS: No-op (no huge page or NUMA support on Apple Silicon).
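The Linux fallback cascade above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `sketch_*` names are hypothetical, the size is assumed to be a multiple of the huge page size, and the sealing retry fires on any failure rather than inspecting `errno`. The raw `SYS_memfd_create` syscall is used only so the sketch does not depend on a recent libc wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // for MAP_POPULATE, MADV_HUGEPAGE, and syscall()
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Fallbacks for older libc headers; the values match the Linux UAPI headers.
#ifndef MFD_CLOEXEC
#define MFD_CLOEXEC 0x0001U
#endif
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U
#endif
#ifndef MADV_HUGEPAGE
#define MADV_HUGEPAGE 14
#endif

// Creates a memfd of |size| bytes. When |probe| is set the file is mapped once
// with MAP_POPULATE so that an exhausted huge page pool is likely to surface
// here instead of as a SIGBUS on first touch. Returns -1 on any failure.
static int sketch_memfd(unsigned flags, size_t size, bool probe) {
  int fd = (int)syscall(SYS_memfd_create, "shm_sketch", flags);
  if (fd < 0) return -1;
  if (ftruncate(fd, (off_t)size) != 0) {
    close(fd);
    return -1;
  }
  if (probe) {
    void* p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_POPULATE, fd, 0);
    if (p == MAP_FAILED) {
      close(fd);
      return -1;
    }
    munmap(p, size);
  }
  return fd;
}

// Tier 1: explicit huge pages. Tier 3: normal pages. Tier 2 (THP) is applied
// at map time in sketch_shm_map because madvise needs a mapping.
static int sketch_shm_create(size_t size, bool want_huge,
                             bool* out_explicit_huge) {
  assert(out_explicit_huge != NULL);
  *out_explicit_huge = false;
  if (want_huge) {
    unsigned huge = MFD_CLOEXEC | MFD_HUGETLB;
    int fd = sketch_memfd(huge | MFD_ALLOW_SEALING, size, true);
    // Kernels 4.14-4.15 reject MFD_HUGETLB combined with MFD_ALLOW_SEALING.
    if (fd < 0) fd = sketch_memfd(huge, size, true);
    if (fd >= 0) {
      *out_explicit_huge = true;
      return fd;
    }
  }
  return sketch_memfd(MFD_CLOEXEC | MFD_ALLOW_SEALING, size, false);
}

static void* sketch_shm_map(int fd, size_t size, bool want_huge,
                            bool explicit_huge) {
  void* p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  if (p == MAP_FAILED) return NULL;
  // THP is advisory: if the kernel declines, the mapping keeps normal pages.
  if (want_huge && !explicit_huge) (void)madvise(p, size, MADV_HUGEPAGE);
  return p;
}
```

A caller would pass the `explicit_huge` result from `sketch_shm_create` into `sketch_shm_map` so that the THP hint is only issued when the hugetlbfs tier was not used.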

Also: convert iree_numa_alloc_options_t bool fields to a flag bitfield, embed placement options in iree_async_slab_options_t, reduce IREE_SHM_MAX_NAME_LENGTH to 30 for macOS PSHMNAMLEN portability, and improve iree_shm_seal documentation with thread-safety requirements and Windows process-local semantics.
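To illustrate the bool-to-bitfield conversion and the "NULL for defaults" convention, here is a small sketch. All type, flag, and function names are hypothetical stand-ins; the actual fields and flag values of iree_numa_alloc_options_t are not shown on this page and may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Before (hypothetical): one bool per knob.
//   typedef struct { bool use_huge_pages; bool prefault; } old_options_t;
// After: a single flags word, which zero-initializes to "no special requests".
typedef uint32_t sketch_placement_flags_t;
enum {
  SKETCH_PLACEMENT_FLAG_NONE = 0u,
  SKETCH_PLACEMENT_FLAG_HUGE_PAGES = 1u << 0,
  SKETCH_PLACEMENT_FLAG_PREFAULT = 1u << 1,
};

typedef struct {
  sketch_placement_flags_t flags;
} sketch_placement_options_t;

// Treats a NULL options pointer as "use defaults", mirroring the convention
// described for the SHM create functions.
static sketch_placement_flags_t sketch_effective_flags(
    const sketch_placement_options_t* options) {
  return options ? options->flags : SKETCH_PLACEMENT_FLAG_NONE;
}

// Returns true if every bit in |flag| is requested.
static bool sketch_wants(const sketch_placement_options_t* options,
                         sketch_placement_flags_t flag) {
  assert(flag != 0);
  return (sketch_effective_flags(options) & flag) == flag;
}
```

One benefit of the flags form is that new knobs can be added without changing the struct layout, and a single mask test can check several requests at once.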

benvanik added the runtime label (Relating to the IREE runtime library) Mar 9, 2026
benvanik requested a review from stellaraccident March 9, 2026 05:21
benvanik added the post-merge-review label (Ben's special place. People can pick these up and review them for forward fixes if interested.) Mar 9, 2026
benvanik marked this pull request as ready for review March 9, 2026 05:21
benvanik force-pushed the users/benvanik/base-shm-hugepages branch from c3a7a68 to 28271dc March 9, 2026 05:54
benvanik and others added 5 commits March 8, 2026 23:42
Unify iree_shm_options_t and iree_numa_alloc_options_t into a single
placement type used by both SHM and NUMA allocation paths. SHM create
functions now accept iree_numa_alloc_options_t* (NULL for defaults);
open functions drop options entirely since openers map existing pages
whose backing store was determined at creation time.

Platform implementations:
- Linux: MFD_HUGETLB memfd with probe-mmap validation (MAP_POPULATE),
  MADV_HUGEPAGE for THP, mbind for NUMA. Graceful fallback cascade:
  explicit huge pages -> THP -> normal pages. Retry memfd_create
  without MFD_ALLOW_SEALING for kernel 4.14-4.15 compatibility.
- Windows: SEC_LARGE_PAGES with silent fallback, CreateFileMappingNumaW
  for NUMA placement. Open paths try FILE_MAP_LARGE_PAGES first to
  support cross-process sharing of large-page sections.
- macOS: No-op (no huge page or NUMA support on Apple Silicon).

Also: convert iree_numa_alloc_options_t bool fields to a flag bitfield,
embed placement options in iree_async_slab_options_t, reduce
IREE_SHM_MAX_NAME_LENGTH to 30 for macOS PSHMNAMLEN portability, and
improve iree_shm_seal documentation with thread-safety requirements and
Windows process-local semantics.

Co-Authored-By: Claude <noreply@anthropic.com>

All shm_open() calls now include O_CLOEXEC, and all dup() calls are
replaced with fcntl(F_DUPFD_CLOEXEC, 0) which is atomic (no race
window where a concurrent fork+exec could leak the fd).

Previously, the memfd_create path correctly used MFD_CLOEXEC but the
shm_open path (macOS anonymous, named create, named open) and both
fd duplication sites (open_handle, handle_dup) left fds inheritable
across execve.

Co-Authored-By: Claude <noreply@anthropic.com>

Three fixes for Windows SHM paths found during cross-validated review:

OpenFileMappingW for named regions created with SEC_LARGE_PAGES now
tries FILE_MAP_ALL_ACCESS | FILE_MAP_LARGE_PAGES first, falling back
to plain FILE_MAP_ALL_ACCESS. Without this, named large-page regions
could not be opened by other processes.

NUMA placement is now best-effort: when CreateFileMappingNumaW fails
(invalid node, container restrictions, NUMA disabled), we fall back to
CreateFileMappingW without NUMA preference. This matches the Linux
behavior where mbind failures are silently ignored.

VirtualProtect(PAGE_READONLY) on SEC_LARGE_PAGES sections now returns
IREE_STATUS_UNAVAILABLE instead of a generic error. This uses the same
contract as macOS sealing (which is entirely unsupported), allowing
callers implementing defense-in-depth to check and proceed.
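A caller relying on this contract could be structured roughly as below. The status enum, function pointer type, and stub seal functions are hypothetical stand-ins for iree_status_t and iree_shm_seal, whose real signatures are not shown on this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical status codes standing in for the IREE status values.
typedef enum {
  SKETCH_STATUS_OK = 0,
  SKETCH_STATUS_UNAVAILABLE,
  SKETCH_STATUS_INTERNAL,
} sketch_status_t;

typedef sketch_status_t (*sketch_seal_fn_t)(void* region);

// Stub: a platform where sealing works.
static sketch_status_t sketch_seal_supported(void* region) {
  (void)region;
  return SKETCH_STATUS_OK;
}

// Stub: models a large-page section (or macOS), where sealing is unsupported.
static sketch_status_t sketch_seal_large_pages(void* region) {
  (void)region;
  return SKETCH_STATUS_UNAVAILABLE;
}

// Seals when possible. UNAVAILABLE is reported through |out_sealed| instead of
// failing the call, so defense-in-depth callers can proceed unsealed. Any
// other failure is returned unchanged.
static sketch_status_t sketch_seal_if_possible(sketch_seal_fn_t seal,
                                               void* region,
                                               bool* out_sealed) {
  assert(seal != NULL && out_sealed != NULL);
  sketch_status_t status = seal(region);
  *out_sealed = (status == SKETCH_STATUS_OK);
  return status == SKETCH_STATUS_UNAVAILABLE ? SKETCH_STATUS_OK : status;
}
```

The point of a distinct UNAVAILABLE code is that callers can tell "this platform cannot seal" apart from a genuine failure without parsing error messages.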

Co-Authored-By: Claude <noreply@anthropic.com>

- Windows: fall back from MapViewOfFileExNuma to MapViewOfFile when NUMA
  placement fails, completing the best-effort NUMA pattern at the view
  mapping level (creation-level fallback was already in place).
- Header: document that Windows large-page sections (SEC_LARGE_PAGES)
  do not support VirtualProtect protection changes, so sealing returns
  IREE_STATUS_UNAVAILABLE.
- Tests: add boundary tests for IREE_SHM_MAX_NAME_LENGTH (too-long
  name rejected, max-length name accepted).
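The name-length boundary being tested can be illustrated with a small validator. This is a sketch only: the macro and function names are hypothetical, and whether the real IREE_SHM_MAX_NAME_LENGTH counts a leading slash or the NUL terminator is an assumption not confirmed by this page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Mirrors the value 30 from this PR; assumed to exclude the NUL terminator.
#define SKETCH_SHM_MAX_NAME_LENGTH 30

// Returns true if |name| is non-empty and at most the maximum length.
static bool sketch_shm_name_is_valid(const char* name) {
  if (name == NULL) return false;
  // strnlen stops scanning one past the limit, so oversized or unterminated
  // input is never walked in full.
  size_t length = strnlen(name, SKETCH_SHM_MAX_NAME_LENGTH + 1);
  return length > 0 && length <= SKETCH_SHM_MAX_NAME_LENGTH;
}
```

A boundary test then checks a name of exactly 30 characters (accepted) and one of 31 characters (rejected), which is the off-by-one pair most likely to regress.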

Co-Authored-By: Claude <noreply@anthropic.com>

Fix zero-initialized slab_options pinning to NUMA node 0 instead of
"no preference." The refactoring to use options.placement directly
caused node_id=0 (the zero-init value) to be interpreted as "pin to
node 0" rather than the previous behavior of IREE_NUMA_NODE_ANY. On
ARM64 CI containers with restricted RLIMIT_MEMLOCK, this changed the
kernel's page-pinning behavior and caused io_uring fixed buffer
registration to fail with ENOMEM.

Also fix MultipleSendSlabRegistrations to handle the ENOMEM graceful
fallback: when RLIMIT_MEMLOCK is too low, both registrations succeed
(the proactor falls back to copy-based I/O) but neither gets real
buffer indices. The test now GTEST_SKIPs instead of asserting that
two -1 indices are distinct.
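The zero-initialization pitfall can be reproduced in miniature. The sketch below uses hypothetical types and a hypothetical sentinel value, and shows one possible way to avoid the problem (a default initializer); it is not necessarily how this commit fixes it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical sentinel meaning "no NUMA preference". Because it is nonzero,
// a zeroed struct does NOT carry it.
#define SKETCH_NUMA_NODE_ANY UINT32_MAX

typedef struct {
  uint32_t node_id;
} sketch_placement_t;

typedef struct {
  size_t slab_size;
  sketch_placement_t placement;
} sketch_slab_options_t;

// Returns options with every field zeroed except the node preference, which
// is explicitly set to the "any node" sentinel.
static sketch_slab_options_t sketch_slab_options_default(void) {
  sketch_slab_options_t options;
  memset(&options, 0, sizeof(options));
  options.placement.node_id = SKETCH_NUMA_NODE_ANY;
  return options;
}

// Returns true if the options request a specific node.
static bool sketch_pins_to_node(const sketch_slab_options_t* options) {
  assert(options != NULL);
  return options->placement.node_id != SKETCH_NUMA_NODE_ANY;
}
```

With these definitions, a `memset`-zeroed `sketch_slab_options_t` reports that it pins to a node (node 0), while one produced by `sketch_slab_options_default` does not, which is the behavioral difference the commit message describes.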

Co-Authored-By: Claude <noreply@anthropic.com>

benvanik force-pushed the users/benvanik/base-shm-hugepages branch from 28271dc to 7236950 March 9, 2026 06:42
benvanik merged commit 7404ce9 into main Mar 9, 2026
58 of 60 checks passed
benvanik deleted the users/benvanik/base-shm-hugepages branch March 9, 2026 07:44