Commit cb57760

emeryberger and claude committed
feat: add Windows platform support
Add Windows port of the Mesh allocator with the following features:
- Platform abstraction layer (src/platform/) for cross-platform memory operations
- VirtualAlloc2/MapViewOfFile3 for page-granular meshing on Windows 10 1803+
- MSVC compiler compatibility (intrinsics, attributes, types)
- Windows threading (TLS via TlsAlloc, background mesh thread)
- Vectored Exception Handler for mesh write barriers
- CMake build support for Windows (mesh.dll, mesh_static.lib)
- Windows-specific memory stats (GetProcessMemoryInfo)
- Malloc-free debug printf using vendored printf library

New files:
- src/platform/vmem_windows.cc - Modern Windows memory APIs
- src/platform/exception_handler_windows.cc - VEH for meshing
- src/runtime_windows.cc - Windows-specific runtime
- src/memory_stats_windows.cc - RSS measurement
- src/testing/fragmenter_windows.cc - Windows test
- src/debug_printf.h, printf.c, putchar.c - Malloc-free debug output

Build: cmake -B build-win && cmake --build build-win --config Release

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 757d0f9 commit cb57760

35 files changed, +3911 -101 lines changed

CLAUDE.md

Lines changed: 92 additions & 0 deletions
@@ -330,8 +330,100 @@ Randomized allocation only. Useful for:
 ### Platform Support
 - **Linux**: Full support (primary platform)
 - **macOS**: Full support
+- **Windows**: Experimental support (allocation only, no meshing yet)
 - **64-bit only**: Current implementation requires 64-bit address space
 
+### Windows Port Notes
+
+The Windows port required significant platform abstraction work. Key learnings:
+
+#### MSVC Compiler Compatibility (src/common.h)
+- `__PRETTY_FUNCTION__` → `__FUNCSIG__`
+- `ssize_t` not available; use `SSIZE_T` from `<BaseTsd.h>`
+- GCC builtins need reimplementation using MSVC intrinsics:
+  - `__builtin_popcountl` → `__popcnt64` (from `<intrin.h>`)
+  - `__builtin_ctzl` → `_BitScanForward64`
+  - `__builtin_clzl` → `_BitScanReverse64`
+  - `__builtin_ffsl` → Custom implementation using `_BitScanForward64`
+  - `__builtin_prefetch` → No-op (performance hint only)
+  - `__builtin_unreachable` → `__assume(0)`
+- **Attribute placement**: MSVC's `__declspec(align(x))` and `__declspec(restrict)` must come BEFORE the type, but GCC/Clang's `__attribute__` can come after. Made these empty macros since code uses GCC-style placement.
+- `__restrict__` → `__restrict`
+- `pid_t` not available; typedef to `int`
+
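The builtin-to-intrinsic mapping listed above could be wrapped roughly as follows. This is a minimal sketch, not the code from `src/common.h`: the `bitops` namespace and the function names are hypothetical, and the 64-bit `ll` builtins are used on the GCC/Clang side because `long` is only 32 bits wide on Windows. Note that `_BitScanReverse64` reports the index of the highest set bit rather than a leading-zero count, so the result has to be subtracted from 63.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && !defined(__clang__)
#include <intrin.h>
#define BITOPS_USE_MSVC_INTRINSICS 1
#endif

namespace bitops {  // hypothetical namespace, for illustration only

// Number of set bits in a 64-bit value.
inline uint32_t popcount64(uint64_t x) {
#ifdef BITOPS_USE_MSVC_INTRINSICS
  return static_cast<uint32_t>(__popcnt64(x));
#else
  return static_cast<uint32_t>(__builtin_popcountll(x));
#endif
}

// Index of the lowest set bit. The result is undefined for x == 0.
inline uint32_t ctz64(uint64_t x) {
  assert(x != 0);
#ifdef BITOPS_USE_MSVC_INTRINSICS
  unsigned long index = 0;
  _BitScanForward64(&index, x);
  return static_cast<uint32_t>(index);
#else
  return static_cast<uint32_t>(__builtin_ctzll(x));
#endif
}

// Number of leading zero bits. The result is undefined for x == 0.
inline uint32_t clz64(uint64_t x) {
  assert(x != 0);
#ifdef BITOPS_USE_MSVC_INTRINSICS
  unsigned long index = 0;
  _BitScanReverse64(&index, x);  // index of the highest set bit
  return 63u - static_cast<uint32_t>(index);
#else
  return static_cast<uint32_t>(__builtin_clzll(x));
#endif
}

// One plus the index of the lowest set bit, or 0 when no bit is set.
inline uint32_t ffs64(uint64_t x) {
  return x == 0 ? 0u : ctz64(x) + 1u;
}

}  // namespace bitops
```

Callers would use `bitops::ctz64(mask)` in place of the builtin, which keeps the `#ifdef` in one place instead of at every call site.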
+#### Threading (src/thread_local_heap.h)
+- pthread API not available on Windows; implemented compatibility layer:
+  - `pthread_t` → `DWORD` (thread ID)
+  - `pthread_key_t` → `DWORD` (TLS index)
+  - `pthread_key_create` → `TlsAlloc`
+  - `pthread_getspecific` → `TlsGetValue`
+  - `pthread_setspecific` → `TlsSetValue`
+  - `pthread_self` → `GetCurrentThreadId`
+
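A thin wrapper over the TLS calls in the table above might look like the sketch below. The names `tls_key_t`, `tls_key_create`, `tls_get`, and `tls_set` are hypothetical and are not taken from `src/thread_local_heap.h`. One difference worth keeping in mind: `pthread_key_create` accepts a destructor callback, while `TlsAlloc` does not, so per-thread cleanup has to be handled some other way on Windows.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
using tls_key_t = DWORD;          // TLS index returned by TlsAlloc
#else
#include <pthread.h>
using tls_key_t = pthread_key_t;
#endif

// Allocate a TLS slot. Returns true on success.
inline bool tls_key_create(tls_key_t* key) {
  assert(key != nullptr);
#ifdef _WIN32
  *key = TlsAlloc();
  return *key != TLS_OUT_OF_INDEXES;
#else
  // No destructor is registered, to match what TlsAlloc offers.
  return pthread_key_create(key, nullptr) == 0;
#endif
}

// Read the calling thread's value for this slot (nullptr if never set).
inline void* tls_get(tls_key_t key) {
#ifdef _WIN32
  return TlsGetValue(key);
#else
  return pthread_getspecific(key);
#endif
}

// Store a value for the calling thread. Returns true on success.
inline bool tls_set(tls_key_t key, void* value) {
#ifdef _WIN32
  return TlsSetValue(key, value) != 0;
#else
  return pthread_setspecific(key, value) == 0;
#endif
}
```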
+#### Memory Management
+- `mmap`/`munmap` → `VirtualAlloc`/`VirtualFree` (src/platform/vmem_windows.cc)
+- `madvise(MADV_DONTNEED)` → `DiscardVirtualMemory` or `VirtualAlloc(MEM_RESET)`
+- `mprotect` → `VirtualProtect`
+- No `fork()` on Windows; fork-related code wrapped in `#if !defined(_WIN32)`
+
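The mapping above suggests a small platform layer along these lines. This is a sketch under assumptions: the `vmem` namespace and function names are hypothetical and do not reproduce `src/platform/vmem.h`. The two sides are not exact equivalents. For example, `madvise(MADV_DONTNEED)` on private anonymous memory on Linux yields zero-filled pages on the next access, whereas `MEM_RESET` only marks the contents as no longer needed.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#else
#include <sys/mman.h>
#endif

namespace vmem {  // hypothetical namespace, for illustration only

// Reserve and commit `size` bytes of read/write memory. Returns nullptr on failure.
inline void* map(size_t size) {
#ifdef _WIN32
  return VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE);
#else
  void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE,
                 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  return p == MAP_FAILED ? nullptr : p;
#endif
}

// Release a region previously returned by map().
inline bool unmap(void* addr, size_t size) {
#ifdef _WIN32
  (void)size;  // MEM_RELEASE requires a size of 0 and the original base address
  return VirtualFree(addr, 0, MEM_RELEASE) != 0;
#else
  return munmap(addr, size) == 0;
#endif
}

// Make a region read-only.
inline bool protect_readonly(void* addr, size_t size) {
#ifdef _WIN32
  DWORD old_protection = 0;
  return VirtualProtect(addr, size, PAGE_READONLY, &old_protection) != 0;
#else
  return mprotect(addr, size, PROT_READ) == 0;
#endif
}

// Tell the OS that the contents of a region are no longer needed.
inline bool discard(void* addr, size_t size) {
#ifdef _WIN32
  return VirtualAlloc(addr, size, MEM_RESET, PAGE_READWRITE) != nullptr;
#else
  return madvise(addr, size, MADV_DONTNEED) == 0;
#endif
}

}  // namespace vmem
```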
+#### Locking (src/internal.h)
+- `PosixLockType` not available; created `WindowsLockedHeap` wrapper using `std::mutex`
+
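A locked-heap wrapper built on `std::mutex` could take the shape sketched below. This is not the `WindowsLockedHeap` from `src/internal.h`; `MutexLockedHeap` and `SystemHeap` are hypothetical names, and the layering style (a template that inherits from the heap it guards) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Hypothetical stand-in for whatever heap layer sits underneath.
class SystemHeap {
public:
  void* malloc(size_t sz) { return std::malloc(sz); }
  void free(void* ptr) { std::free(ptr); }
};

// Serializes every call into SuperHeap with a std::mutex.
template <typename SuperHeap>
class MutexLockedHeap : public SuperHeap {
public:
  void* malloc(size_t sz) {
    std::lock_guard<std::mutex> guard(_mutex);
    return SuperHeap::malloc(sz);
  }

  void free(void* ptr) {
    std::lock_guard<std::mutex> guard(_mutex);
    SuperHeap::free(ptr);
  }

  // Explicit lock/unlock for callers that need to hold the lock across calls.
  void lock() { _mutex.lock(); }
  void unlock() { _mutex.unlock(); }

private:
  std::mutex _mutex;
};
```

Usage would be `MutexLockedHeap<SystemHeap> heap; void* p = heap.malloc(64); heap.free(p);`.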
+#### Exception Handling (src/platform/exception_handler_windows.cc)
+- Unix signal handlers (`SIGSEGV`, `SIGBUS`) → Windows Vectored Exception Handler (VEH)
+- Installed via `AddVectoredExceptionHandler` in DllMain
+
+#### Build System
+- **Use CMake for Windows builds** (not Bazel)
+- CMake used for cross-platform builds
+- Windows libraries needed: `kernel32`, `psapi`, `advapi32`
+- Disable CMake auto-export for DLL; use explicit `__declspec(dllexport)` via `MESH_EXPORT` macro
+- Build commands:
+  ```bash
+  # Configure (from repo root)
+  cmake -B build-win -DCMAKE_BUILD_TYPE=Release
+
+  # Build
+  cmake --build build-win --config Release
+
+  # Test executable location
+  build-win/bin/Release/fragmenter_test.exe
+  ```
+
+#### Windows Meshing Implementation (Win10 1803+)
+
+Full meshing is now supported on Windows 10 version 1803 and later using modern memory APIs:
+
+**Key APIs Used**:
+- `VirtualAlloc2` with `MEM_RESERVE_PLACEHOLDER` - Reserves address space as placeholder
+- `MapViewOfFile3` with `MEM_REPLACE_PLACEHOLDER` - Maps file section into placeholder with page-granular offsets
+- `UnmapViewOfFile2` with `MEM_PRESERVE_PLACEHOLDER` - Unmaps while preserving placeholder
+
+**How Meshing Works on Windows**:
+1. Arena created with `CreateFileMappingW(INVALID_HANDLE_VALUE, ...)` (pagefile-backed)
+2. Address space reserved as placeholder via `VirtualAlloc2`
+3. Initial mapping via `MapViewOfFile3` into placeholder
+4. During mesh: `mapSharedFixed()` remaps virtual address to different physical offset
+5. Two virtual addresses now share same physical memory
+
+**Fallback for Older Windows**:
+- `hasPageGranularMapping()` returns false on pre-1803 Windows
+- Meshing disabled, allocation-only mode used
+- Legacy `MapViewOfFileEx` requires 64KB-aligned offsets (incompatible with 4KB page meshing)
+
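The effect of steps 4 and 5 can be illustrated with the POSIX side of the mapping (`mmap` with `MAP_FIXED`), which is easier to run in isolation. The sketch below is an analogue only and is not the Windows code path from this commit: on Windows the corresponding calls would be the `CreateFileMappingW`, `VirtualAlloc2`, and `MapViewOfFile3` sequence listed above. `MeshDemo` and the `demo_*` functions are hypothetical names, and a temporary file stands in for the arena's backing store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <sys/mman.h>
#include <unistd.h>

// A two-page arena backed by an anonymous temporary file (hypothetical helper type).
struct MeshDemo {
  FILE* backing = nullptr;
  char* base = nullptr;
  size_t page = 0;
};

// Create the backing file and map both pages at their own file offsets.
inline bool demo_open(MeshDemo* d) {
  d->page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  d->backing = std::tmpfile();
  if (d->backing == nullptr) return false;
  const int fd = fileno(d->backing);
  void* p = MAP_FAILED;
  if (ftruncate(fd, static_cast<off_t>(2 * d->page)) == 0) {
    p = mmap(nullptr, 2 * d->page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  }
  if (p == MAP_FAILED) {
    std::fclose(d->backing);
    d->backing = nullptr;
    return false;
  }
  d->base = static_cast<char*>(p);
  return true;
}

// Mesh: remap the second virtual page onto file offset 0, so that both
// virtual pages are backed by the same physical page.
inline bool demo_mesh(MeshDemo* d) {
  char* second = d->base + d->page;
  void* p = mmap(second, d->page, PROT_READ | PROT_WRITE,
                 MAP_SHARED | MAP_FIXED, fileno(d->backing), 0);
  return p == second;
}

inline void demo_close(MeshDemo* d) {
  munmap(d->base, 2 * d->page);
  std::fclose(d->backing);
  d->base = nullptr;
  d->backing = nullptr;
}
```

Before calling `demo_mesh`, a caller would copy the live bytes of the second page into the unused offsets of the first, for example `d.base[1] = d.base[d.page + 1]`. After the remap, pointers into either page observe the same data.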
+#### Current Limitations on Windows
+- **No fork handling**: Windows doesn't have `fork()`
+- **No background mesh thread**: Mesh triggering via `epoll_wait`/`recv` interception not applicable
+- **Requires Win10 1803+**: Older Windows versions fall back to allocation-only mode
+
+#### Key Windows-Specific Files
+- `src/runtime_windows.cc` - SizeMap data, measurePssKiB implementation
+- `src/memory_stats_windows.cc` - GetProcessMemoryInfo wrapper
+- `src/platform/vmem_windows.cc` - Modern API support (VirtualAlloc2, MapViewOfFile3), memory operations
+- `src/platform/exception_handler_windows.cc` - Vectored Exception Handler for meshing faults
+- `src/platform/platform.h` - Platform detection macros and FileHandle abstraction
+- `src/platform/vmem.h` - Cross-platform memory API declarations
+
 ### Integration
 - **Drop-in replacement**: No code changes required
 - **Standard API**: Full malloc/free/realloc/memalign support

CMakeLists.txt

Lines changed: 3 additions & 3 deletions
@@ -1,11 +1,11 @@
-cmake_minimum_required(VERSION 3.13.0)
-
-project(Mesh CXX C)
+cmake_minimum_required(VERSION 3.16)
 
+project(Mesh VERSION 1.0.0 LANGUAGES CXX C)
 
 SET(CMAKE_BUILD_TYPE "" CACHE STRING "Just making sure the default CMAKE_BUILD_TYPE configurations won't interfere" FORCE)
 set(CMAKE_C_STANDARD 11)
 set(CMAKE_CXX_STANDARD 17)
+set(CMAKE_CXX_STANDARD_REQUIRED ON)
 
 #Set output folders
 set(CMAKE_OUTPUT_DIRECTORY ${CMAKE_SOURCE_DIR}/build)

Makefile

Lines changed: 33 additions & 13 deletions
@@ -18,18 +18,38 @@ else
 BAZEL_CONFIG =
 endif
 
-ifeq ($(UNAME_S),Darwin)
+# Detect Windows (MSYS2/Git Bash/Cygwin report MINGW or MSYS or CYGWIN)
+ifneq (,$(findstring MINGW,$(UNAME_S)))
+IS_WINDOWS = 1
+endif
+ifneq (,$(findstring MSYS,$(UNAME_S)))
+IS_WINDOWS = 1
+endif
+ifneq (,$(findstring CYGWIN,$(UNAME_S)))
+IS_WINDOWS = 1
+endif
+
+ifdef IS_WINDOWS
+LIB_EXT = dll
+STATIC_TARGET = mesh_static_windows
+DYNAMIC_LIB = mesh.dll
+LDCONFIG =
+# On Windows, invoke bazel via Python since shebang doesn't work in MSYS
+BAZEL = python ./bazel
+else ifeq ($(UNAME_S),Darwin)
 LIB_EXT = dylib
 STATIC_TARGET = mesh_static_macos
+DYNAMIC_LIB = libmesh.$(LIB_EXT)
 LDCONFIG =
 PREFIX = /usr/local
+BAZEL = ./bazel
 else
 LIB_EXT = so
 STATIC_TARGET = mesh_static_linux
+DYNAMIC_LIB = libmesh.$(LIB_EXT)
 LDCONFIG = ldconfig
+BAZEL = ./bazel
 endif
-
-DYNAMIC_LIB = libmesh.$(LIB_EXT)
 STATIC_LIB = lib$(STATIC_TARGET).a
 INSTALL_DYNAMIC = libmesh$(LIB_SUFFIX).$(LIB_EXT)
 INSTALL_STATIC = libmesh$(LIB_SUFFIX).a
@@ -60,10 +80,10 @@ endif
 all: test build
 
 build lib:
-	./bazel build $(BAZEL_CONFIG) -c opt //src:$(DYNAMIC_LIB) //src:$(STATIC_TARGET)
+	$(BAZEL) build $(BAZEL_CONFIG) -c opt //src:$(DYNAMIC_LIB) //src:$(STATIC_TARGET)
 
 test check:
-	./bazel test $(BAZEL_CONFIG) //src:unit-tests --test_output=all --action_env="GTEST_COLOR=1"
+	$(BAZEL) test $(BAZEL_CONFIG) //src:unit-tests --test_output=all --action_env="GTEST_COLOR=1"
 
 install:
 	install -c -m 0755 bazel-bin/src/$(DYNAMIC_LIB) $(PREFIX)/lib/$(INSTALL_DYNAMIC)
@@ -81,19 +101,19 @@ clang-coverage: $(UNIT_BIN) $(CONFIG)
 	rm -f default.profraw
 
 benchmark:
-	./bazel build $(BAZEL_CONFIG) -c opt //src:fragmenter //src:$(DYNAMIC_LIB)
+	$(BAZEL) build $(BAZEL_CONFIG) -c opt //src:fragmenter //src:$(DYNAMIC_LIB)
 ifeq ($(UNAME_S),Darwin)
 	DYLD_INSERT_LIBRARIES=./bazel-bin/src/$(DYNAMIC_LIB) ./bazel-bin/src/fragmenter
 else
 	LD_PRELOAD=./bazel-bin/src/$(DYNAMIC_LIB) ./bazel-bin/src/fragmenter
 endif
-	./bazel build $(BAZEL_CONFIG) --config=disable-meshing --config=nolto -c opt //src:local-refill-benchmark
+	$(BAZEL) build $(BAZEL_CONFIG) --config=disable-meshing --config=nolto -c opt //src:local-refill-benchmark
 	./bazel-bin/src/local-refill-benchmark
 
 # Index computation benchmark - compares float reciprocal vs integer magic division
 # Run with: make index-benchmark
 index-benchmark:
-	./bazel build $(BAZEL_CONFIG) -c opt //src:index-compute-benchmark
+	$(BAZEL) build $(BAZEL_CONFIG) -c opt //src:index-compute-benchmark
 	./bazel-bin/src/index-compute-benchmark
 
 # Larson benchmark - multi-threaded allocation stress test
@@ -106,7 +126,7 @@ FLAMEGRAPH_DIR = third_party/FlameGraph
 larson: larson-nomesh
 
 larson-mesh:
-	./bazel build $(BAZEL_CONFIG) --config=nolto -c opt //src:larson-benchmark
+	$(BAZEL) build $(BAZEL_CONFIG) --config=nolto -c opt //src:larson-benchmark
 ifeq ($(UNAME_S),Linux)
 	perf record -F $(PERF_FREQ) -g --call-graph fp -o perf-larson-mesh.data -- ./bazel-bin/src/larson-benchmark $(LARSON_ARGS)
 	perf script -i perf-larson-mesh.data | $(FLAMEGRAPH_DIR)/stackcollapse-perf.pl | $(FLAMEGRAPH_DIR)/flamegraph.pl --title "larson-mesh" > flamegraph-larson-mesh.svg
@@ -116,7 +136,7 @@ else
 endif
 
 larson-nomesh:
-	./bazel build $(BAZEL_CONFIG) --config=disable-meshing --config=nolto -c opt //src:larson-benchmark
+	$(BAZEL) build $(BAZEL_CONFIG) --config=disable-meshing --config=nolto -c opt //src:larson-benchmark
 ifeq ($(UNAME_S),Linux)
 	perf record -F $(PERF_FREQ) -g --call-graph fp -o perf-larson-nomesh.data -- ./bazel-bin/src/larson-benchmark $(LARSON_ARGS)
 	perf script -i perf-larson-nomesh.data | $(FLAMEGRAPH_DIR)/stackcollapse-perf.pl | $(FLAMEGRAPH_DIR)/flamegraph.pl --title "larson-nomesh" > flamegraph-larson-nomesh.svg
@@ -130,12 +150,12 @@ format:
 
 clean:
 	find . -name '*~' -print0 | xargs -0 rm -f
-	./bazel clean
-	./bazel shutdown
+	$(BAZEL) clean
+	$(BAZEL) shutdown
 
 
 distclean: clean
-	./bazel clean --expunge
+	$(BAZEL) clean --expunge
 
 # double $$ in egrep pattern is because we're embedding this shell command in a Makefile
 TAGS: