Sync master with upstream release b9025 by jan-service-account · Pull Request #506 · janhq/llama.cpp

jan-service-account · 2026-05-05T00:59:44Z

Updates dev branch with latest release (b9025) from ggml-org/llama.cpp

* shader(norm): add layer norm ops
* shader(norm): stablize floating point computation with Kahan summation and handle mixed types
* shader(norm): remove the non-contiguous strides
* shader(norm): use the original implementation rather than the kahan summation

…g#22397) (ggml-org#22539)

* docs : update speculative decoding parameters after refactor (ggml-org#22397)
  Update docs/speculative.md to reflect the new parameter naming scheme introduced in PR ggml-org#22397:
  - Replace --draft-max/--draft-min with --spec-draft-n-max/--spec-draft-n-min
  - Replace --spec-ngram-size-n/m with per-implementation variants
  - Add documentation for all new --spec-ngram-*- parameters
  - Update all example commands
  Assisted-by: llama.cpp:local pi
* pi : add rule to use gh CLI for GitHub resources
  Assisted-by: llama.cpp:local pi
* docs : run llama-gen-docs
* arg : fix typo

…ggml-org#22004)

* git-friendly migration
* add build_graph
* nits
* exclude old code from build
* wip
* add llm_arch_model_i
* prepare downstream functions
* nits
* nits
* wip
* wip
* add back create_tensor_qkv
* fix files missing include
* enforce one llm_build per arch
* cmake: use glob
* missing model params
* nits
* wip
* wip (2)
* wip (3)
* test-llama-archs is happy
* improve switch case
* move more stuff into llm_arch_model_i
* fix downstream code
* nits
* nits (2)
* fix order
* llama_model_base
* LLAMA_LOAD_LOCALS
* small fix
* fix build errors
* auto
* rm migration script and ifdef


aldehir and others added 13 commits May 4, 2026 00:18

common : determine generation prompt using longest common prefix (ggml-org#22657)

e48034d
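The commit title above names only the technique (longest common prefix), not how it is applied. As a minimal sketch, assuming the generation prompt is whatever the chat template appends when rendered with the generation prompt enabled versus disabled, the idea could look like the following. The function names `common_prefix_len` and `generation_prompt_suffix` and the example template strings are hypothetical and are not taken from the llama.cpp source.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of leading characters shared by a and b.
size_t common_prefix_len(const std::string & a, const std::string & b) {
    size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) {
        n++;
    }
    return n;
}

// Hypothetical helper: given the same conversation rendered without and with
// the generation prompt (an assumption about how the inputs are produced),
// return the part of the second rendering that follows the shared prefix.
std::string generation_prompt_suffix(const std::string & without_gen,
                                     const std::string & with_gen) {
    const size_t n = common_prefix_len(without_gen, with_gen);
    return with_gen.substr(n);
}
```

Comparing by longest common prefix, rather than assuming the first rendering is an exact prefix of the second, still gives a usable suffix when the two renderings diverge slightly before the end, for example in trailing whitespace.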

vulkan: delete dead GGML_VK_MAX_NODES def (ggml-org#22621)

6dcd824

webui: restore missing settings (ggml-org#22666)

fa8feae

server: Add a simple get_datetime server tool (ggml-org#22649)

c84e6d6

common/autoparser: fixes for newline handling / forced tool calls (ggml-org#22654)

a4701c9

* chat/autoparser: the fixes
* Move optspace() to chat-peg-parser, comment out server tests invalidated due to content now allowed with forced tool calls.
* Trim whitespace on apply instead

webui : fix circular dependency between chat.service.ts and models.svelte.ts (ggml-org#22625)

36a694c

examples: refactor diffusion generation (ggml-org#22590)

d8794ee

* examples: refactor diffusion generation
* renamed enum values

server: implement /models?reload=1 (ggml-org#21848)

935a340

CUDA: use fastdiv for batch index split in get_rows (ggml-org#22650)

e77056f
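For readers unfamiliar with the term, "fastdiv" generally refers to replacing an integer division by a divisor that is fixed for the whole kernel launch with a precomputed multiply and shift. The sketch below shows that general technique in host-side C++ (not CUDA) so it runs anywhere. The names `fastdiv_params`, `fastdiv_init`, `fastdiv_u32`, and `split_index` are hypothetical, and the index split is an assumption about what "batch index split" means. None of this is copied from the ggml CUDA code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical parameter pack: magic multiplier, shift, and the divisor itself.
struct fastdiv_params {
    uint32_t mp; // magic multiplier
    uint32_t L;  // shift amount, ceil(log2(d))
    uint32_t d;  // original divisor
};

// Precompute the multiplier and shift once per divisor (done on the host).
fastdiv_params fastdiv_init(uint32_t d) {
    assert(d != 0);
    uint32_t L = 0;
    while (L < 32 && (uint64_t{1} << L) < d) {
        L++;
    }
    const uint32_t mp = (uint32_t) ((uint64_t{1} << 32) * ((uint64_t{1} << L) - d) / d + 1);
    return {mp, L, d};
}

// n / d using a multiply-high, an add, and a shift.
// The sum is kept in 64 bits so it cannot overflow for any 32-bit n.
uint32_t fastdiv_u32(uint32_t n, const fastdiv_params & p) {
    const uint64_t hi = ((uint64_t) n * p.mp) >> 32; // what a mulhi intrinsic would return
    return (uint32_t) ((hi + n) >> p.L);
}

// Assumed use case: split a flattened batch index into (outer, inner) parts.
void split_index(uint32_t flat, const fastdiv_params & p, uint32_t & outer, uint32_t & inner) {
    outer = fastdiv_u32(flat, p);
    inner = flat - outer * p.d; // remainder without a second division
}
```

The precomputation costs one real division per divisor, which pays off when the same divisor is then applied to every element index inside a kernel.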

kleidiai : update to v1.24.0 and use release archive (ggml-org#22549)

eff0670

jan-service-account merged commit 2c66a5c into dev May 5, 2026
6 checks passed

jan-service-account deleted the update-dev-from-master-2026-05-05-00-59 branch May 5, 2026 01:26

jan-service-account merged 13 commits into dev from update-dev-from-master-2026-05-05-00-59


13 participants
