Sync master with upstream release b8933 #497

Merged
jan-service-account merged 8 commits into dev from update-dev-from-master-2026-04-26-01-01 on Apr 26, 2026

Conversation

@jan-service-account

Updates dev branch with latest release (b8933) from ggml-org/llama.cpp

reeselevine and others added 8 commits April 25, 2026 09:18
…ggml-org#22327)

* Implement ssm_scan

* Remove blocking in graph_compute and check for set rows

* Fix bindings

* Update op support
* opt arc770 for Q4_0

* add for Q4_0

* update the script

* add help script for windows

* update guide

* fix format issue

* convert from dos to unix for format issue

* fix missed -sm parameter
* gitignore : add .pi + personal SYSTEM.md

* cont : fix requirements heading in PR template

* cont : shorten line
Change the default `ftype` in `llama_model_quantize_params` from
`LLAMA_FTYPE_MOSTLY_Q5_1` to `LLAMA_FTYPE_MOSTLY_Q8_0`.

In case some external program naively uses the default quantization
params, we should probably default to a known-good type like Q8_0 rather
than Q5_1, which is rather old.
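
The effect of this default change on callers can be sketched as follows. This is a minimal illustration, not the llama.cpp code: every `sketch_*` name and `make_params` is a hypothetical stand-in, and only `ftype`, Q5_1, and Q8_0 come from the commit message above. It shows a default-params helper that now yields Q8_0, and a caller that sets the type explicitly instead of depending on whatever the default happens to be.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; the real enum and struct
// are declared in llama.h and have more members than shown here.
enum sketch_ftype : std::int32_t {
    SKETCH_FTYPE_MOSTLY_Q4_0,
    SKETCH_FTYPE_MOSTLY_Q5_1,
    SKETCH_FTYPE_MOSTLY_Q8_0,
};

struct sketch_quantize_params {
    sketch_ftype ftype;  // output quantization type
};

// Mirrors the new default described in the commit message: Q8_0, not Q5_1.
sketch_quantize_params sketch_quantize_default_params() {
    return { SKETCH_FTYPE_MOSTLY_Q8_0 };
}

// A caller that needs a particular type starts from the defaults and then
// overrides ftype, so a future change of the default cannot affect it.
sketch_quantize_params make_params(sketch_ftype wanted) {
    sketch_quantize_params params = sketch_quantize_default_params();
    params.ftype = wanted;
    return params;
}
```

A program that calls only the default helper picks up Q8_0 after this change, while one that goes through something like `make_params` keeps the type it asked for.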
…#20962)

* Optimize Metal Tensor API usage for matmul2d

Separates the Metal Tensor API (matmul2d) path in kernel_mul_mm into its own standalone kernel, gated by GGML_METAL_HAS_TENSOR.

The legacy simdgroup_matrix kernel is preserved under #else.

Previously both paths were interleaved via #ifdef blocks within a single kernel, forcing the tensor path to share the legacy kernel's data layout and threadgroup memory scheme. Splitting the kernel enabled memory and dispatch optimizations that weren't possible when the two paths shared code structure.

* cont : cleanup

* cont : cleanup

* cont : cleanup

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
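
The restructuring described in the Metal commit above can be sketched in plain C++. This is only an illustration of the preprocessor pattern, not the Metal kernel: `SKETCH_HAS_TENSOR` is a hypothetical stand-in for `GGML_METAL_HAS_TENSOR`, and both functions and the strings they return are invented for the example. The first function interleaves the two paths inside one body, so both share the same load and store steps; the second form defines two standalone bodies, so each path is free to choose its own layout.

```cpp
#include <string>

// SKETCH_HAS_TENSOR is a hypothetical stand-in for GGML_METAL_HAS_TENSOR.

// Before: one body with both paths interleaved. The load and store steps
// are shared, so the tensor path is tied to the legacy layout.
std::string mul_mm_interleaved() {
    std::string steps = "load tiles (shared layout); ";
#ifdef SKETCH_HAS_TENSOR
    steps += "tensor matmul; ";
#else
    steps += "simdgroup matmul; ";
#endif
    steps += "store result (shared layout)";
    return steps;
}

// After: two standalone bodies, one of which is compiled. Each path owns
// its load and store steps and can pick a layout that suits it.
#ifdef SKETCH_HAS_TENSOR
std::string mul_mm_split() {
    return "load tiles (tensor layout); tensor matmul; "
           "store result (tensor layout)";
}
#else
std::string mul_mm_split() {
    return "load tiles (legacy layout); simdgroup matmul; "
           "store result (legacy layout)";
}
#endif
```

With the macro undefined, both functions take the legacy path; defining it switches both to the tensor path, but only the split version changes its load and store steps along with it.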
* CUDA: reduce MMQ stream-k overhead

* use 32 bit integers for kbc
* chat: fix handling of space in reasoning markers

* fix tests

* whitespace
jan-service-account merged commit fa7c133 into dev on Apr 26, 2026
9 checks passed
jan-service-account deleted the update-dev-from-master-2026-04-26-01-01 branch on April 26, 2026 01:29
8 participants