* Update llama.cpp submodule to 7b8443ac7
* Update llama.cpp patches and BUILD.mk for 7b8443ac7
Refresh the patches against the new submodule head and add a new
ggml-backend-meta.cpp patch to annotate its callbacks with GGML_CALL,
matching the existing buffer/device/backend interface struct typedefs.
Update BUILD.mk for upstream file additions/renames: common/fit.cpp,
ggml-backend-meta.cpp, server-chat.cpp, four new mtmd models, and the
llama-iswa/t5-dec/t5-enc renames. Add a private CPPFLAGS rule so the
single-prefix build-info.cpp.o (built directly by tests) can find the
new build-info.h header. Add server-chat.cpp.o to llamafile main exe
deps.
* Add ggml-backend-meta.cpp to GPU runtime build scripts
Upstream's ggml-backend.cpp now references ggml_backend_buffer_is_meta
(lines 133 and 2006) and ggml-alloc.c references ggml_backend_buft_is_meta
(line 1240). Both functions are defined in the new ggml-backend-meta.cpp,
which upstream made part of ggml-base.
Without this, the runtime-built GPU DSOs (ggml-cuda.so/.dll, ggml-rocm.dll,
ggml-vulkan.dll) and the on-the-fly Metal dylib build would link with
undefined references.
Updated:
- llamafile/build-functions.sh (Linux CUDA + ROCm via cuda.sh / rocm.sh)
- llamafile/cuda.bat, llamafile/cuda_parallel.bat
- llamafile/rocm.bat, llamafile/rocm_parallel.bat
- llamafile/vulkan.bat
- llamafile/metal.c (yoink + extracted-files map + compile list)
- llamafile/BUILD.mk (add ggml-backend-meta.cpp.zip.o to LLAMAFILE_METAL_SOURCES)
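A quick way to confirm that every script in the list above picks up the new file is to grep each one for it. This is only a sketch: check_meta_listed is a hypothetical helper, not something that exists in the tree, and it assumes each build script names the source file literally.

```shell
#!/bin/sh
# Hypothetical helper (not part of llamafile): report every build script
# passed as an argument that does not mention ggml-backend-meta.cpp.
# Assumes the scripts name the source file literally.
check_meta_listed() {
  missing=0
  for f in "$@"; do
    # -F treats the file name as a fixed string, so '.' is not a wildcard.
    if ! grep -qF 'ggml-backend-meta.cpp' -- "$f"; then
      echo "ggml-backend-meta.cpp not listed in $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be run from the repository root as `check_meta_listed llamafile/build-functions.sh llamafile/cuda.bat llamafile/vulkan.bat`. A non-zero exit status means at least one script still lacks the file.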
* Bundle ggml-cpp.h for Metal runtime compile
ggml-backend-meta.cpp #includes "ggml-cpp.h", which wasn't in the bundle
because no previously bundled source needed it. Without this, on macOS
the on-the-fly Metal dylib compile fails with:
    ~/.llamafile/v/X.Y.Z/ggml-backend-meta.cpp:6:10: fatal error:
    'ggml-cpp.h' file not found
Updated the yoink list, the extracted-files map, and
LLAMAFILE_METAL_SOURCES.
* vulkan.sh: probe and require spirv-headers explicitly
Since llama.cpp PR #21572, ggml-vulkan.cpp #includes a SPIR-V header to
emit OpCapability/OpExtension/OpExecutionMode in compiled shaders. The
script previously only passed -I for ggml include paths, relying on the
default compiler search path. When spirv-headers isn't installed, the
build fails deep in the source with cryptic "'spv' is not a class or
namespace" errors instead of a clear missing-dependency message.
Probe the same cascade ggml-vulkan.cpp uses (plus VULKAN_SDK) and pass
the matching -I. Fail early with install instructions otherwise. Also
add spirv-headers to the glslc-not-found install hints since the two
are typically needed together.
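The probe described above can be sketched roughly as follows. This is not the code in vulkan.sh: find_spirv_include is a hypothetical name, and the two relative header paths are assumptions rather than a copy of the cascade in ggml-vulkan.cpp.

```shell
#!/bin/sh
# Hypothetical sketch of the probe: print the first include root that
# contains a SPIR-V header, or return non-zero if none does.
# The relative header paths are assumptions, not taken from ggml-vulkan.cpp.
find_spirv_include() {
  for root in "$@"; do
    # Skip empty arguments, e.g. when VULKAN_SDK is unset.
    [ -n "$root" ] || continue
    for hdr in spirv/unified1/spirv.hpp spirv-headers/spirv.hpp; do
      if [ -f "$root/$hdr" ]; then
        printf '%s\n' "$root"
        return 0
      fi
    done
  done
  return 1
}

# Possible use in a build script (the directories are examples only):
#   if inc=$(find_spirv_include "$VULKAN_SDK/include" /usr/include); then
#     SPIRV_FLAGS="-I$inc"
#   else
#     echo "error: SPIR-V headers not found, install spirv-headers" >&2
#     exit 1
#   fi
```

Probing before compilation turns the "'spv' is not a class or namespace" failure into a single message that names the missing package.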