Releases: ggml-org/llama.cpp

b4419

06 Jan 02:18
46e3556
CUDA: add BF16 support (#11093)

* CUDA: add BF16 support
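
For readers unfamiliar with the format, BF16 (bfloat16) keeps the sign bit, the 8 exponent bits, and the top 7 mantissa bits of an IEEE 754 float32. The sketch below illustrates the format only and is not taken from the CUDA code in this release; the function names are hypothetical, and NaN and overflow handling are deliberately left out.

```python
import struct


def float_to_bf16_bits(x: float) -> int:
    """Convert a finite float to its 16-bit bfloat16 pattern (round to nearest even).

    Illustrative helper, not a llama.cpp API. NaN and values that overflow
    float32 are not handled here.
    """
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Add a rounding bias so that dropping the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF


def bf16_bits_to_float(b: int) -> float:
    """Expand a 16-bit bfloat16 pattern back to a Python float."""
    # bfloat16 is the upper half of a float32, so the lower 16 bits are zero.
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


one = float_to_bf16_bits(1.0)
approx_pi = bf16_bits_to_float(float_to_bf16_bits(3.14159265))
print(hex(one), approx_pi)
```

Because only 7 mantissa bits survive, BF16 trades precision for the same dynamic range as float32.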

b4418

04 Jan 20:57
b56f079
Vulkan: Add device-specific blacklist for coopmat for the AMD proprie…

b4417

04 Jan 20:50
9394bbd
llama : Add support for DeepSeek V3 (#11049)

* convert : extend DEEPSEEK2 model architecture to support DeepseekV3ForCausalLM by adding EXPERT_WEIGHTS_NORM and EXPERT_GATING_FUNC model parameters and FFN_EXP_PROBS_B tensor type

* vocab : add DeepSeek V3 pre-tokenizer regexes

* unicode : handle ACCENT_MARK and SYMBOL categories in regex

* llama : add DeepSeek V3 chat template, handle new model parameters and tensor types

---------

Co-authored-by: Stanisław Szymczyk

b4416

04 Jan 16:46
f922a9c
[GGML][RPC] Support for models with non-512-aligned tensors over RPC.…

b4415

04 Jan 15:14
46be942
llama : add support for the cohere2 model architecture (#10900)

b4414

04 Jan 14:47
sync : ggml

b4411

04 Jan 09:39
c31fc8b
fix: Vulkan shader gen binary path (#11037)

b4409

03 Jan 10:16
e7da954
metal : avoid uint (#11019)

b4406

02 Jan 14:41
0da5d86
server : allow using LoRA adapters per-request (#10994)

* slot.can_batch_with

* lora per request

* test: force disable cache prompt

* move can_batch_with check

* fix condition

* add slow test with llama 8b

* update docs

* move lora change task to queue

* Apply suggestions from code review

Co-authored-by: Georgi Gerganov

* lora_base

* remove redundant check

---------

Co-authored-by: Georgi Gerganov <[email protected]>
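
As a rough illustration of what a per-request LoRA selection might look like from a client, the sketch below builds a JSON body for the server's completion endpoint. The `lora` field with `id` and `scale` entries reflects my understanding of the server documentation updated in this change and should be checked against the README for your build. The helper name `build_completion_request` is hypothetical, and nothing is sent over the network.

```python
import json


def build_completion_request(prompt, lora_scales, n_predict=64):
    """Build a request body that selects LoRA adapters for this request only.

    lora_scales maps an adapter id (assumed to be its index in the order the
    adapters were loaded by the server) to the scale to apply.
    Hypothetical helper; field names are assumptions to verify against the docs.
    """
    lora = [{"id": i, "scale": float(s)} for i, s in sorted(lora_scales.items())]
    return {"prompt": prompt, "n_predict": n_predict, "lora": lora}


payload = build_completion_request("Hello", {0: 0.5, 1: 1.1})
print(json.dumps(payload, indent=2))
```

The commit list mentions a `can_batch_with` check on slots, which suggests that requests using different adapter settings may not be batched together, so mixing many configurations could reduce throughput.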

b4404

31 Dec 15:14
0827b2c
ggml : fixes for AVXVNNI instruction set with MSVC and Clang (#11027)

* Fixes for clang AVX VNNI

* enable AVX VNNI and alder lake build for MSVC

* Apply suggestions from code review

---------

Co-authored-by: slaren