Releases · ggml-org/llama.cpp
b4419
CUDA: add BF16 support (#11093)
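For context, one minimal way to exercise the new CUDA BF16 kernels is to convert a model to a BF16 GGUF first. A hedged sketch using the repo's convert_hf_to_gguf.py; the model path and output file name are placeholders:

```python
# Sketch: produce a BF16 GGUF so the new CUDA BF16 path can be exercised.
# "path/to/hf-model" and "model-bf16.gguf" are hypothetical placeholders.
import subprocess

subprocess.run(
    [
        "python", "convert_hf_to_gguf.py",
        "path/to/hf-model",
        "--outtype", "bf16",           # emit brain-float-16 tensors
        "--outfile", "model-bf16.gguf",
    ],
    check=True,
)
```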
b4418
Vulkan: Add device-specific blacklist for coopmat for the AMD proprietary driver
b4417
llama : Add support for DeepSeek V3 (#11049)
* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and the FFN_EXP_PROBS_B tensor type
* vocab : add DeepSeek V3 pre-tokenizer regexes
* unicode : handle ACCENT_MARK and SYMBOL categories in regex
* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

Co-authored-by: Stanisław Szymczyk <[email protected]>
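One way to sanity-check a converted DeepSeek V3 GGUF is to read back the new expert-routing metadata with the gguf Python package. The exact key names below ("deepseek2.expert_weights_norm", "deepseek2.expert_gating_func") are an assumption derived from the parameter names in the commit message:

```python
# Hedged sketch: dump DeepSeek V3 expert-routing metadata from a GGUF file.
# Key names are assumptions based on EXPERT_WEIGHTS_NORM / EXPERT_GATING_FUNC;
# "deepseek-v3.gguf" is a hypothetical file name.
from gguf import GGUFReader

reader = GGUFReader("deepseek-v3.gguf")
for name, field in reader.fields.items():
    if name.startswith("deepseek2.expert"):
        # field.data holds indices into field.parts for the decoded value(s)
        print(name, field.parts[field.data[0]])
```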
b4416
[GGML][RPC] Support for models with non-512-aligned tensors over RPC.…
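A quick way to exercise the RPC backend end to end, assuming a build with GGML_RPC enabled and an rpc-server already listening; the host, port, and model file below are placeholders:

```python
# Sketch: offload inference to a remote rpc-server. Before this release,
# models whose tensors were not 512-byte aligned could fail over RPC.
import subprocess

subprocess.run(
    [
        "./llama-cli",
        "-m", "model.gguf",             # placeholder model file
        "--rpc", "192.168.1.10:50052",  # placeholder rpc-server address
        "-ngl", "99",                   # offload all layers to the RPC backend
        "-p", "Hello",
    ],
    check=True,
)
```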
b4415
llama : add support for the cohere2 model architecture (#10900)
b4414
sync : ggml
b4411
fix: Vulkan shader gen binary path (#11037)
b4409
metal : avoid uint (#11019)
b4406
server : allow using LoRA adapters per-request (#10994)
* slot.can_batch_with
* lora per request
* test: force disable cache prompt
* move can_batch_with check
* fix condition
* add slow test with llama 8b
* update docs
* move lora change task to queue
* Apply suggestions from code review
* lora_base
* remove redundant check

Co-authored-by: Georgi Gerganov <[email protected]>
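The per-request control works by passing a "lora" array in the request body. A minimal sketch against a server started with one adapter (e.g. `llama-server -m base.gguf --lora adapter.gguf`); the host, port, and prompt are placeholders:

```python
# Sketch: select a LoRA adapter and scale for a single /completion request.
# Adapter id 0 refers to the first --lora argument; scale 0.0 disables it.
import requests

resp = requests.post(
    "http://localhost:8080/completion",
    json={
        "prompt": "Write a haiku about llamas.",
        "n_predict": 64,
        "lora": [{"id": 0, "scale": 0.5}],
    },
)
print(resp.json()["content"])
```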
b4404
ggml : fixes for AVXVNNI instruction set with MSVC and Clang (#11027)
* Fixes for Clang AVX VNNI
* enable AVX VNNI and Alder Lake build for MSVC
* Apply suggestions from code review

Co-authored-by: slaren <[email protected]>
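A hedged sketch of configuring a build that exercises this path; the GGML_AVX_VNNI option name is an assumption based on ggml's usual GGML_&lt;ISA&gt; cmake naming:

```python
# Sketch: configure and build with AVX-VNNI enabled (option name assumed).
import subprocess

subprocess.run(["cmake", "-B", "build", "-DGGML_AVX_VNNI=ON"], check=True)
subprocess.run(["cmake", "--build", "build", "--config", "Release"], check=True)
```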