llama_model_load fails with "failed to read magic" on Windows MSVC for models > 4GB (mmap/file I/O bug) #951

**Describe the bug**
When using `llama-cpp-2` (v0.1.138) on Windows with the MSVC toolchain (`x86_64-pc-windows-msvc`), loading any GGUF model larger than 4GB (e.g., a 4.5GB 7B model) fails immediately with:
`gguf_init_from_file_impl: failed to read magic`
`llama_model_load_from_file_impl: failed to load model`

**Steps to Reproduce**
1. Environment: Windows 11, Rust `x86_64-pc-windows-msvc` (64-bit).
2. Model: Any GGUF file > 4GB (e.g., `qwen2.5-coder-7b-instruct-q4_k_m.gguf` which is ~4.5GB).
3. Try to load it using `LlamaModel::load_from_file(&backend, &model_path, &model_params)`.
4. The call fails immediately with the `failed to read magic` error shown above.
5. Models < 2GB (e.g., 1.5B models) load perfectly fine with the exact same code and environment.
6. **Crucial baseline:** The same 4.5GB file loads in under 1 second with `llama-cpp-python` on the same machine.
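To rule out a bad path or a damaged file on the Rust side, the first four bytes can be checked independently of llama.cpp. This is a minimal sketch, assuming only that a GGUF file begins with the ASCII bytes `GGUF`. The helper names `has_gguf_magic` and `inspect_model_file` are hypothetical and not part of `llama-cpp-2`.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Returns true if the reader starts with the 4-byte GGUF magic.
/// Hypothetical helper for diagnosis, not a llama-cpp-2 API.
fn has_gguf_magic<R: Read>(mut reader: R) -> io::Result<bool> {
    let mut magic = [0u8; 4];
    // read_exact returns an error if fewer than 4 bytes are available.
    reader.read_exact(&mut magic)?;
    Ok(&magic == b"GGUF")
}

/// Opens `path` with plain std I/O and reports (size in bytes, magic ok).
fn inspect_model_file(path: &Path) -> io::Result<(u64, bool)> {
    let file = File::open(path)?;
    // `len()` is a u64, so sizes above 4GB are reported without truncation.
    let size = file.metadata()?.len();
    Ok((size, has_gguf_magic(file)?))
}
```

Calling `inspect_model_file` on the 4.5GB file just before `LlamaModel::load_from_file` would show whether plain Rust I/O sees the full size and the expected magic for the same path.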

**Attempted workaround**
I suspected a Windows `mmap` 4GB boundary issue, so I tried bypassing the Rust API to forcefully disable `mmap` using `unsafe`:
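```rust
unsafe {
    // Reinterpret the safe wrapper as the raw FFI params struct.
    // Caveat: this assumes the raw struct sits at offset 0 of the wrapper,
    // which Rust only guarantees for #[repr(C)] or #[repr(transparent)] types.
    let raw_ptr = &mut model_params as *mut _ as *mut llama_cpp_sys_2::llama_model_params;
    (*raw_ptr).use_mmap = false;
}
```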

However, the exact same `failed to read magic` error persists even with standard I/O forced.

**Root Cause Suspicion**
Since `llama-cpp-python` loads the same file without problems and 1.5B models work in Rust, I suspect a 32-bit integer truncation issue in the `llama-cpp-sys-2` build on Windows MSVC. With MSVC, the C `long` type is 32 bits wide even in 64-bit builds, so `fseek`/`ftell` cannot represent offsets of 2GiB or more. It seems likely that `build.rs` or the CMake configuration is missing large-file support (the MSVC counterpart of `_FILE_OFFSET_BITS=64`, such as using `_fseeki64`/`_ftelli64`), causing file offsets or `mmap` offsets to overflow or truncate on files larger than 4GB.
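The arithmetic behind this suspicion can be sketched in a few lines. The byte counts below are assumed round figures (4,500,000,000 for the ~4.5GB file, 1,500,000,000 for a small model), and `wrap_to_msvc_long` and `fits_in_msvc_long` are hypothetical helpers that only model what happens when an offset is stored in a 32-bit signed `long`. They do not reproduce any actual llama.cpp code path.

```rust
/// Models storing a byte offset in a 32-bit signed C `long` (MSVC's `long`).
/// Hypothetical helper for illustration only.
fn wrap_to_msvc_long(offset: u64) -> i32 {
    // `as` keeps only the low 32 bits, like a wrapping C conversion.
    offset as i32
}

/// True if the offset is representable in a 32-bit signed `long`.
fn fits_in_msvc_long(offset: u64) -> bool {
    offset <= i32::MAX as u64
}
```

Under these assumptions, `wrap_to_msvc_long(4_500_000_000)` yields `205_032_704`, so the file would appear to be about 205MB long, while `1_500_000_000` passes through unchanged. A signed 32-bit limit would put the threshold at 2GiB rather than 4GB, which is consistent with the observation that models under 2GB load fine.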

**Expected behavior**
Models > 4GB should load with the Rust bindings on Windows MSVC just as they do with `llama-cpp-python` or on Linux.


