Releases: ServeurpersoCom/llama.cpp
b6780
sycl : add ARANGE operator (#16362)
* SYCL: update element-wise ops and presets
* clean arange
* Re-trigger CI
---------
Co-authored-by: Gitty Burstein <gitty@example.com>
b6779
CANN: format code using .clang-format (#15863)
This commit applies .clang-format rules to all source files under the ggml-cann directory to ensure consistent coding style and readability. The .clang-format option `SortIncludes: false` has been set to disable automatic reordering of include directives. No functional changes are introduced.
Co-authored-by: hipudding <huafengchun@gmail.com>
b6778
common : Update the docs on -t --threads (#16236)
* Update the docs on -t --threads
* Revert "Update the docs on -t --threads"
  This reverts commit eba97345e2c88d8ca510abec87d00bf6b9b0e0c2.
* docs: clarify -t/--threads parameter uses CPU threads and defaults to all available cores
* Update arg.cpp
b6776
SYCL: Add GGML_OP_MEAN operator support (#16009)
* SYCL: Add GGML_OP_MEAN operator support
* SYCL: Fix formatting for GGML_OP_MEAN case
* Update ggml/src/ggml-sycl/ggml-sycl.cpp
  Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
---------
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
b6775
gguf-py : add support for endian conversion of BF16 data (#16594)
BF16 requires special handling in this script because it is 2-byte data, whereas the view is 1-byte by default. Switch to the correct view before attempting to byteswap. With this change, correctly byteswapping models such as Meta-Llama-3-8B-Instruct-bf16-GGUF should be possible.
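
The note above turns on element width: swapping bytes over a 1-byte view changes nothing, while swapping over a 2-byte view reverses each BF16 value. A minimal stdlib-only sketch of that effect follows. The helper names `float_to_bf16_bytes` and `byteswap_elements` are hypothetical and are not taken from gguf-py; the BF16 encoding here is plain truncation of a float32, assumed only for illustration.

```python
import struct


def float_to_bf16_bytes(x: float) -> bytes:
    """Hypothetical helper: truncate a float32 to BF16 (its top 16 bits)
    and pack the result as 2 little-endian bytes."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return struct.pack("<H", bits32 >> 16)


def byteswap_elements(raw: bytes, itemsize: int) -> bytes:
    """Hypothetical helper: reverse the byte order of every
    itemsize-wide element in raw."""
    if len(raw) % itemsize != 0:
        raise ValueError("buffer length is not a multiple of itemsize")
    out = bytearray(len(raw))
    for i in range(0, len(raw), itemsize):
        out[i:i + itemsize] = raw[i:i + itemsize][::-1]
    return bytes(out)


# Two BF16 values stored little-endian: 1.0 is 0x3F80, -2.0 is 0xC000.
little = float_to_bf16_bytes(1.0) + float_to_bf16_bytes(-2.0)

# Treating the buffer as 1-byte elements: the "swap" is a no-op.
wrong = byteswap_elements(little, 1)

# Treating the buffer as 2-byte elements: each value is reversed.
right = byteswap_elements(little, 2)

print(little.hex(), wrong.hex(), right.hex())
```

Swapping twice with the 2-byte width restores the original buffer, which is a convenient sanity check when converting a file in either direction.
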
b6773
opencl: add q8_0 mm support (#16469)
* opencl: add mm_q8_0_f32
* opencl: fix data loading for incomplete tile
* opencl: use q8_0 mm for larger matrix
* opencl: add some tests to cover the path
b6771
Add server-driven parameter defaults and syncing (#16515)
b6767
CUDA: Changing the CUDA scheduling strategy to spin (#16585)
* CUDA set scheduling strategy to spinning for cc121
* Using prop.major and prop.minor, include HIP and MUSA
* Exclude HIP and MUSA
* Remove trailing whitespace
  Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
* Remove empty line
  Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
---------
Co-authored-by: Johannes Gäßler <johannesg@5d6.de>
b6765
metal : avoid using Metal's gpuAddress property (#16576)
* metal : avoid using Metal's gpuAddress property
* metal : fix rope kernels buffer check
b6764
vulkan: Add ACC_TYPE_VEC2 implementation (#16203)
Signed-off-by: Stefan Savic <stefan.savic@huawei.com>
Co-authored-by: Stefan Savic <stefan.savic@huawei.com>