Pull requests: ggml-org/llama.cpp
Pull requests list
metal: Change gpuAddress for contents
ggml
changes relating to the ggml tensor library for machine learning
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
#16565
opened Oct 13, 2025 by
jjerphan
extend server/public_simplechat with simple minded interactive browser-client side based toolcalling - base logic
examples
server
#16563
opened Oct 13, 2025 by
hanishkvc
webui: introduce OpenAI-compatible model selector in JSON payload
examples
server
#16562
opened Oct 13, 2025 by
ServeurpersoCom
embeddings: Fix --log-disable should not suppress embedding outputs
examples
#16561
opened Oct 13, 2025 by
cduk
server : dynamic token limit for prompt cache
examples
server
#16560
opened Oct 13, 2025 by
ggerganov
metal: optimise GGML_OP_SUM
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
#16559
opened Oct 13, 2025 by
cern1710
CUDA: use fastdiv + ggml_cuda_mad for mmvf
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16557
opened Oct 13, 2025 by
am17an
Implement and use cuda graph plans
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16548
opened Oct 13, 2025 by
wishstudio
Add experimental ggml-hexagon backend for the Hexagon NPU
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
#16547
opened Oct 13, 2025 by
max-krasnyansky
Draft
CUDA: enable FA for FP32 KV cache
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#16546
opened Oct 12, 2025 by
JohannesGaessler
vulkan: Improve build time for MSVC
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#16545
opened Oct 12, 2025 by
jeffbolznv
tests: increase NMSE threshold for q5_1 MUL_MAT tests
testing
Everything test related
#16544
opened Oct 12, 2025 by
Erics38
vulkan: Support FA with K/V in F32
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#16543
opened Oct 12, 2025 by
jeffbolznv
Add CONV_TRANSPOSE_2D for Metal
ggml
changes relating to the ggml tensor library for machine learning
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
#16542
opened Oct 12, 2025 by
iliailmer
1 task done
embedding: add raw option for --embd-output-format
examples
#16541
opened Oct 12, 2025 by
SamMalayek
chat: add defensive IBM Granite Jinja compatibility (<tool_call> and <|tool_call|> support)
#16537
opened Oct 12, 2025 by
ServeurpersoCom
Draft
Update close-issue.yml
devops
improvements to build systems and github actions
#16535
opened Oct 12, 2025 by
barneysspeedshop
Draft
server: add /slots/status endpoint for secure monitoring
examples
python
python script changes
server
#16534
opened Oct 12, 2025 by
Roshankumarb31
metal: add support for LOG op (f32, f16)
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#16530
opened Oct 12, 2025 by
RD-zhang1234
Leverage the existing GGML_F32_VEC helpers to vectorize ggml_vec_set_f32 for faster fills
ggml
changes relating to the ggml tensor library for machine learning
#16522
opened Oct 11, 2025 by
sirus20x6
CUDA: add fp kernel for larger batch size MoE
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#16512
opened Oct 11, 2025 by
am17an