llama.app : website + unified llama binary
#23875
Replies: 8 comments 5 replies
-
This is terrific! Thanks to everyone who contributes to this great tool!
-
Congrats on the achievement and the launch of the website! If
-
When choosing a model, the size is missing. In my opinion, it would be better to either remove the selector and leave the links to HF, or add a size indication.
-
This looks like a great step forward in terms of usability. Regarding the CLI, are there any plans to adjust arguments for better UX, or is the aim simply to expose existing tools as subcommands with all existing arguments left exactly as-is?
-
Missed opportunity to register llama.cpp
-
|
Site looks good, two critiques:
-
Congrats 👏 🎉
-
Congrats! I noticed a small inconsistency, though: while Qwen's, Gemma's and Step's model tags show "XB MoE · YB active", GPT-OSS's model tag does not, which makes it read more like a dense model such as Qwen3.6-27B. Changing the tag to show that it is MoE would be more accurate. Great work as always!

-
Overview
We are launching an official website for llama.cpp: https://llama.app/
The main goal of the website is to provide a simple way for new users to install and run llama.cpp on their machines. The page has an installation command (one-liner) and links/instructions to popular GGUF models on the hub. The current version is a first iteration of many.
During installation, the cross-platform installer ships a single binary called llama. The binary packs all the user-facing tooling of llama.cpp (llama-server, llama-cli, etc.) behind a single CLI entry point. This mostly follows the git example. Currently we ship binaries for the major operating systems, and we plan to iterate on and improve the packaging pipeline.
The webpage will also provide helpful instructions for running llama in common use cases: chat, agentic coding, etc. These will be combined with current FOTM models from various quantization providers (e.g. Unsloth, Bartowski). There will be guidelines for integration with 3rd-party agents, creating or finding the best configuration for your device, and tips for utilizing advanced llama.cpp features. We will be iterating on this and would love to hear any feedback from the community on how to improve.
The llama app is here: https://github.com/ggml-org/llama.cpp/tree/master/app
HF Team: @ggml-org/hf
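For readers unfamiliar with the git-style layout mentioned above, here is a rough sketch of how a single entry point can route to separate tools. It is not the actual llama implementation: the subcommand names ("server", "cli") and the dispatch function are hypothetical, and only the tool names llama-server and llama-cli come from the post.

```python
# Hypothetical mapping from a subcommand to the existing tool it stands for.
# The tool names appear in the post; the subcommand names are assumptions.
TOOLS = {
    "server": "llama-server",
    "cli": "llama-cli",
}


def dispatch(argv):
    """Resolve `llama <subcommand> [args...]` to (tool name, remaining args)."""
    if not argv:
        raise ValueError("usage: llama <subcommand> [args...]")
    name, rest = argv[0], argv[1:]
    if name not in TOOLS:
        raise ValueError(f"unknown subcommand: {name}")
    # Remaining arguments are passed through untouched to the selected tool.
    return TOOLS[name], rest


# Example: dispatch(["server", "--port", "8080"])
# returns ("llama-server", ["--port", "8080"])
```

In this sketch the arguments after the subcommand are forwarded unchanged, which corresponds to the "left exactly as-is" option raised in the comments; whether the real binary does this is not stated in the post.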