Sync master with upstream release b9025 #506

Merged
jan-service-account merged 13 commits into dev from update-dev-from-master-2026-05-05-00-59
May 5, 2026

@jan-service-account

Updates dev branch with latest release (b9025) from ggml-org/llama.cpp

aldehir and others added 13 commits May 4, 2026 00:18
* shader(norm): add layer norm ops

* shader(norm): stabilize floating point computation with Kahan summation and handle mixed types

* shader(norm): remove the non-contiguous strides

* shader(norm): use the original implementation rather than the Kahan summation
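
For readers unfamiliar with the technique named in the commits above: Kahan (compensated) summation carries a running correction term so that low-order bits lost in each addition are fed back into the next one. The sketch below is illustrative only. It is written in Python rather than shader code, the names `kahan_sum` and `layer_norm` are hypothetical, and it is not the implementation from these commits (which, per the last commit, went back to the original non-Kahan version).

```python
import math

def kahan_sum(values):
    """Sum floats while tracking a compensation term for lost low-order bits."""
    total = 0.0
    comp = 0.0  # running estimate of the rounding error accumulated so far
    for v in values:
        y = v - comp            # apply the correction from the previous step
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost (algebraically zero)
        total = t
    return total

def layer_norm(row, eps=1e-5):
    """Normalize one row to zero mean and (approximately) unit variance.

    Illustrative only: no scale/bias, and Python floats are doubles,
    whereas a GPU shader would typically accumulate in f32.
    """
    n = len(row)
    mean = kahan_sum(row) / n
    var = kahan_sum([(x - mean) ** 2 for x in row]) / n
    inv_std = 1.0 / math.sqrt(var + eps)
    return [(x - mean) * inv_std for x in row]
```

The compensation costs a few extra floating point operations per element, which is one plausible reason a shader might prefer the simpler accumulation; the commit messages themselves do not state the motivation.
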
…g#22397) (ggml-org#22539)

* docs : update speculative decoding parameters after refactor (ggml-org#22397)

Update docs/speculative.md to reflect the new parameter naming scheme
introduced in PR ggml-org#22397:

- Replace --draft-max/--draft-min with --spec-draft-n-max/--spec-draft-n-min
- Replace --spec-ngram-size-n/m with per-implementation variants
- Add documentation for all new --spec-ngram-*- parameters
- Update all example commands

Assisted-by: llama.cpp:local pi
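
For downstream scripts that still pass the old draft flags, the rename described above could be handled with a small translation step. This is a minimal sketch under stated assumptions: only the two renames spelled out in the commit message are covered, `RENAMED_FLAGS` and `migrate_args` are hypothetical names, and the per-implementation `--spec-ngram-*` flags are not mapped because the message does not list them.

```python
# Old flag -> new flag, taken from the commit message above.
RENAMED_FLAGS = {
    "--draft-max": "--spec-draft-n-max",
    "--draft-min": "--spec-draft-n-min",
}

def migrate_args(argv):
    """Return a copy of argv with the renamed draft flags substituted.

    Values and unrelated tokens are passed through unchanged.
    """
    return [RENAMED_FLAGS.get(token, token) for token in argv]
```

For example, `migrate_args(["--draft-max", "16", "--draft-min", "4"])` yields the same list with the two flag names replaced and the values left as they were.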

* pi : add rule to use gh CLI for GitHub resources

Assisted-by: llama.cpp:local pi

* docs : run llama-gen-docs

* arg : fix typo
…ggml-org#22004)

* git-friendly migration

* add build_graph

* nits

* exclude old code from build

* wip

* add llm_arch_model_i

* prepare downstream functions

* nits

* nits

* wip

* wip

* add back create_tensor_qkv

* fix files missing include

* enforce one llm_build per arch

* cmake: use glob

* missing model params

* nits

* wip

* wip (2)

* wip (3)

* test-llama-archs is happy

* improve switch case

* move more stuff into llm_arch_model_i

* fix downstream code

* nits

* nits (2)

* fix order

* llama_model_base

* LLAMA_LOAD_LOCALS

* small fix

* fix build errors

* auto

* rm migration script and ifdef
…ml-org#22654)

* chat/autoparser: the fixes

* Move optspace() to chat-peg-parser, comment out server tests invalidated due to content now allowed with forced tool calls.

* Trim whitespace on apply instead

* examples: refactor diffusion generation

* renamed enum values
jan-service-account merged commit 2c66a5c into dev on May 5, 2026
6 checks passed
jan-service-account deleted the update-dev-from-master-2026-05-05-00-59 branch on May 5, 2026 01:26