When I follow the official Windows installation guide and run setup_env.py, compilation finishes successfully (ggml.dll and llama.dll are generated), but the required executable (llama-cli.exe or main.exe) is never produced.
As a result, running run_inference.py throws the following error:
FileNotFoundError: [WinError 2] The system cannot find the file specified.
Root Cause:
Because llama.cpp is built as a Git submodule rather than as the top-level project, its CMake script defaults the LLAMA_BUILD_EXAMPLES option to OFF. The CLI is classified as an "example" target in llama.cpp's CMake configuration, so the executable is never built during the standard cmake --build . phase. In addition, the explicit build command for the CLI target is commented out in setup_env.py (around line 193).
Environment:
OS: Windows
Compiler: Clang (via Visual Studio 2022 Toolset)
Build System: CMake + MSBuild
Steps to Reproduce:
Clone the repository and update submodules.
Run python setup_env.py -md models/BitNet-b1.58-2B-4T -q i2_s
Try to run python run_inference.py -m models/BitNet-b1.58-2B-4T/ggml-model-i2_s.gguf -p "Test" -cnv
The Python subprocess fails with the error above because the .exe was never generated in the build/bin/Release folder.
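As a side note, run_inference.py could surface the real problem instead of an opaque WinError 2 by checking for the binary before spawning it. A minimal sketch (find_cli_binary is a hypothetical helper; the candidate names and the build/bin/Release layout are taken from this report):

```python
from pathlib import Path


def find_cli_binary(build_dir: str) -> Path:
    """Return the first llama.cpp CLI executable found under the build tree.

    Newer llama.cpp builds name it llama-cli(.exe); older ones main(.exe).
    Raises FileNotFoundError with an actionable message if neither exists,
    rather than letting subprocess fail with "[WinError 2]".
    """
    candidates = ["llama-cli.exe", "llama-cli", "main.exe", "main"]
    # MSVC multi-config generators place binaries under bin/<Config>/.
    search_dirs = [Path(build_dir) / "bin" / "Release", Path(build_dir) / "bin"]
    for directory in search_dirs:
        for name in candidates:
            exe = directory / name
            if exe.is_file():
                return exe
    raise FileNotFoundError(
        f"No llama.cpp CLI binary found under {build_dir}. "
        "Reconfigure with -DLLAMA_BUILD_EXAMPLES=ON and rebuild."
    )
```

This keeps the failure mode self-explanatory for anyone else hitting the same missing-executable situation.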
Proposed Solution:
Update the Windows compilation documentation, or the setup_env.py script itself, to pass the -DLLAMA_BUILD_EXAMPLES=ON flag explicitly during the CMake configuration step (alternatively, re-enable the commented-out CLI build command in setup_env.py).
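The fix in setup_env.py could be as small as appending the flag when the configure command is assembled. A sketch under the assumption that setup_env.py builds its cmake invocation as an argument list (build_cmake_configure_args is a hypothetical helper; LLAMA_BUILD_EXAMPLES is a real llama.cpp CMake option, and the generator/toolset match the environment above):

```python
def build_cmake_configure_args(source_dir: str, build_dir: str) -> list[str]:
    """Assemble the CMake configure command for the llama.cpp submodule.

    LLAMA_BUILD_EXAMPLES defaults to OFF when llama.cpp is not the
    top-level CMake project, so it must be forced ON here or the CLI
    target (classified as an "example") is never generated.
    """
    return [
        "cmake",
        "-S", source_dir,
        "-B", build_dir,
        "-G", "Visual Studio 17 2022",  # MSBuild generator from this report
        "-T", "ClangCL",                # Clang via the VS 2022 toolset
        "-DLLAMA_BUILD_EXAMPLES=ON",    # the flag missing from the current setup
    ]
```

Running cmake --build with --config Release after configuring this way should then produce the CLI executable under build/bin/Release.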