Commit 2255239

Merge branch 'main' into llamacpp-7b8443a

2 parents 45bbf99 + edb2fd0

11 files changed: 357 additions & 97 deletions

.github/pull_request_template.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
## Description
<!-- What does this PR do? -->


## PR Type
<!-- Delete the types that don't apply -->

- 🆕 New Feature
- 🐛 Bug Fix
- 💅 Refactor
- 📚 Documentation
- 🚦 Infrastructure

## Relevant issues
<!-- e.g. "Fixes #123" -->

## Checklist
<!-- If this checklist is deleted from the PR submission it may be closed -->
- [ ] I understand the code I am submitting.
- [ ] I have run this code locally and verified the change.
- [ ] New and existing tests pass locally, or I have explained why tests were not run.
- [ ] Documentation was updated where necessary.
- [ ] If I changed code in `llama.cpp/`, `whisper.cpp/`, or `stable-diffusion.cpp/`, I also updated the matching `*.patches/` files.
- [ ] I have read and followed the [contribution guidelines](https://github.com/mozilla-ai/llamafile/blob/main/CONTRIBUTING.md).
- [ ] **AI Usage:**
  - [ ] No AI was used.
  - [ ] AI was used in an assistive capacity.
  - [ ] This PR includes substantial AI-generated content.

## AI Usage Information
<!-- Optional: if AI was used, briefly describe how -->

- AI Model used:
- AI Developer Tool used:
- Any other info you'd like to share:

When answering reviewer questions, please respond yourself rather than pasting reviewer comments into an AI system and posting the reply back unchanged.

- [ ] I am an AI Agent filling out this form (check box if true)

CONTRIBUTING.md

Lines changed: 176 additions & 0 deletions
@@ -0,0 +1,176 @@
# Contributing to llamafile

Thank you for your interest in contributing to llamafile.

We welcome fixes, docs improvements, tests, build work, and larger feature work.

Submodule changes (`llama.cpp/`, `whisper.cpp/`, `stable-diffusion.cpp/`) are applied as patches rather than committed directly. If your change should also go upstream, open a PR to the upstream repository (e.g., [llama.cpp](https://github.com/ggml-org/llama.cpp)). Otherwise, follow the [submodule changes workflow](#submodule-changes) described below.

## Before You Start

### Check for duplicates

Before starting new work:

- Search [existing issues](https://github.com/mozilla-ai/llamafile/issues) for duplicates
- Check [open pull requests](https://github.com/mozilla-ai/llamafile/pulls) to see if someone is already working on it
- For bugs, verify the issue still exists on `main`

### Discuss major changes first

Please open an issue before starting larger changes such as:

- new user-facing features
- architectural changes
- changes to public behavior or defaults
- new dependencies
- significant build or packaging changes

This helps us stay aligned and avoids duplicate work.

## Development Setup

### Prerequisites

You will need:

- GNU `make` (called `gmake` on some systems)
- `sha256sum` or a working `cc`
- `wget` or `curl`
- `unzip`
- Git

Windows contributors can use [MSYS2](https://www.msys2.org/) or WSL. See [docs/building_dlls.md](docs/building_dlls.md) for detailed Windows setup instructions.

### Quick Start

```sh
# 1. Fork the repository on GitHub

# 2. Clone your fork
git clone https://github.com/YOUR_USERNAME/llamafile.git
cd llamafile

# 3. Add upstream remote
git remote add upstream https://github.com/mozilla-ai/llamafile.git

# 4. Set up submodules, patches, and toolchain
make setup

# 5. Build with cosmocc's make
.cosmocc/4.0.2/bin/make -j8

# 6. Run the default test suite
.cosmocc/4.0.2/bin/make check
```

`make setup` initializes submodules, applies llamafile-specific patches, and downloads the `cosmocc` toolchain into `.cosmocc/`.

For builds and tests, use `.cosmocc/4.0.2/bin/make`, not your system `make`.
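A quick way to confirm that the prerequisites listed above are available on your `PATH` (the tool list follows the Prerequisites section; `wget`/`curl` and `sha256sum`/`cc` are alternative pairs, so only one of each is needed):

```shell
# Report whether each required build tool is installed.
for tool in make git unzip; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: ok"
  else
    echo "$tool: MISSING"
  fi
done
```

On systems where GNU make is installed as `gmake`, substitute it in the list above.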

## Making Changes

### 1. Create a branch

Always work on a branch, not directly on `main`:

```sh
git checkout -b docs/your-change
```

Common branch prefixes:

- `docs/` for documentation
- `fix/` for bug fixes
- `feature/` for new features
- `build/` for build and tooling changes
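
For illustration, here is the branching convention exercised in a throwaway repository (the repository and branch names below are made up for this sketch):

```shell
# Create a scratch repository and a prefixed work branch,
# then print the branch we are on.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git -c user.email=dev@example.com -c user.name=dev \
    commit -q --allow-empty -m "initial commit"
git checkout -q -b fix/server-startup
git branch --show-current
```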

### 2. Make your changes

There are two common workflows in this repo.

#### Core code changes

For changes in directories like:

- `llamafile/`
- `whisperfile/`
- `docs/`
- `tests/`

you can edit files normally, rebuild, test, and commit as usual.

#### Submodule changes

The following directories are submodules:

- `llama.cpp/`
- `whisper.cpp/`
- `stable-diffusion.cpp/`

If you change code inside one of those directories, you also need to save those changes as patches in the matching `*.patches/` directory.

When working inside a submodule, follow that submodule's local coding and contribution guidelines in addition to this repository's workflow.

Example for `llama.cpp`:

```sh
cd llama.cpp
../tools/generate-patches.sh --output-dir ../llama.cpp.patches
```

After generating patches, verify them from a clean state:

```sh
make reset-repo
make setup
.cosmocc/4.0.2/bin/make -j8
.cosmocc/4.0.2/bin/make check
```

For a more detailed walkthrough of the patch-based workflow, see [docs/skills/llamafile/development.md](docs/skills/llamafile/development.md#making-changes-to-a-submodule).
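
The internals of `generate-patches.sh` are not shown in this guide, but the underlying idea of the patch-based workflow (commit inside the submodule, then export the commits as files into a sibling `*.patches/` directory) can be sketched with plain `git format-patch`; the directory and file names below are illustrative, not the script's actual behavior:

```shell
# Sketch: make a commit in a scratch "submodule" and export it
# as a numbered patch file into a sibling *.patches/ directory.
tmp=$(mktemp -d)
cd "$tmp"
mkdir sub.patches
git init -q sub
cd sub
git -c user.email=dev@example.com -c user.name=dev \
    commit -q --allow-empty -m "base"
echo "tweak" > ggml-tweak.c
git add ggml-tweak.c
git -c user.email=dev@example.com -c user.name=dev \
    commit -q -m "add ggml tweak"
git format-patch -o ../sub.patches HEAD~1 >/dev/null
ls ../sub.patches
```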

### 3. Write tests

Please add or update tests whenever your change affects behavior.

- New features should include tests
- Bug fixes should include a regression test when practical
- Docs-only changes usually do not need tests
- Avoid mixing unrelated changes in one pull request

There are also integration tests under [tests/integration/README.md](tests/integration/README.md) if you want to validate changes with a real model.

### 4. Update documentation

If your change affects how developers or users work with llamafile, update the relevant docs in `README.md` or `docs/`.

If you add a new page to `docs/`, also add it to [`docs/SUMMARY.md`](docs/SUMMARY.md) — that file controls the GitBook navigation and is maintained by hand. CI will catch any SUMMARY entries that point to missing files, but it will not catch a new file that was never added to SUMMARY.

### 5. Commit your changes

Use clear commit messages:

```sh
git commit -m "Fix server startup when model path is missing"
git commit -m "Update contributor guide for patch workflow"
```

## Submitting Changes

Before opening a pull request, please make sure:

- the project builds cleanly
- the default test suite passes
- submodule changes have been converted into patch files
- related documentation has been updated
- the change is focused and easy to review
- you are ready to explain and maintain the code you changed

## Useful Docs

- [README.md](README.md)
- [docs/source_installation.md](docs/source_installation.md)
- [docs/running_llamafile.md](docs/running_llamafile.md)
- [docs/creating_llamafiles.md](docs/creating_llamafiles.md)
- [tests/integration/README.md](tests/integration/README.md)

README.md

Lines changed: 16 additions & 16 deletions
@@ -21,18 +21,18 @@ framework that collapses all the complexity of LLMs down to
 a single-file executable (called a "llamafile") that runs
 locally on most operating systems and CPU architectures, with no installation.
 
-llamafile also includes **[whisperfile](docs/whisperfile/index.md)**, a single-file speech-to-text tool built on [whisper.cpp](https://github.com/ggerganov/whisper.cpp) and the same Cosmopolitan packaging. It supports transcription and translation of audio files across all the same platforms, with no installation required.
+llamafile also includes **[whisperfile](https://docs.mozilla.ai/llamafile/whisperfile)**, a single-file speech-to-text tool built on [whisper.cpp](https://github.com/ggerganov/whisper.cpp) and the same Cosmopolitan packaging. It supports transcription and translation of audio files across all the same platforms, with no installation required.
 
 
-## v0.10.0
+## v0.10.*
 
 **llamafile versions starting from 0.10.0 use a new build system**, aimed at keeping our code more easily
 aligned with the latest versions of llama.cpp. This means they support more recent models and functionalities,
 but at the same time they might be missing some of
 the features you were accustomed to (check out [this doc](README_0.10.0.md) for a high-level description of what has been done). If you liked
 the "classic experience" more, you will always be able to access the previous versions from our
 [releases](https://github.com/mozilla-ai/llamafile/releases) page. Our pre-built llamafiles always
-show which version of the server they have been bundled with ([0.9.* example](https://huggingface.co/mozilla-ai/llava-v1.5-7b-llamafile), [0.10.* example](https://huggingface.co/mozilla-ai/llamafile_0.10.0)), so you will always know
+show which version of the server they have been bundled with ([0.9.* example](https://huggingface.co/mozilla-ai/llava-v1.5-7b-llamafile), [0.10.* example](https://huggingface.co/mozilla-ai/llamafile_0.10)), so you will always know
 which version of the software you are downloading.
 
 
@@ -47,7 +47,7 @@ Download and run your first llamafile in minutes:
 
 ```sh
 # Download an example model (Qwen3.5 0.8B)
-curl -LO https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Qwen3.5-0.8B-Q8_0.llamafile
+curl -LO https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-0.8B-Q8_0.llamafile
 
 # Make it executable (macOS/Linux/BSD)
 chmod +x Qwen3.5-0.8B-Q8_0.llamafile
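
The make-executable-and-run flow in this hunk can be tried end to end with a stand-in file; the tiny shell script below substitutes for the real multi-gigabyte llamafile, which would be fetched with the `curl` command shown:

```shell
# Simulate the quick-start flow with a stand-in "llamafile":
# create it, mark it executable, and run it.
tmp=$(mktemp -d)
f="$tmp/Qwen3.5-0.8B-Q8_0.llamafile"
printf '#!/bin/sh\necho "model server would start here"\n' > "$f"
chmod +x "$f"
"$f"
```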
@@ -58,25 +58,25 @@ chmod +x Qwen3.5-0.8B-Q8_0.llamafile
 
 We chose this model because that's the smallest one we have
 built a llamafile for, so most likely to work out-of-the-box for you.
-If you have powerful hardware and/or GPUs, [feel free to choose](docs/example_llamafiles.md)
+If you have powerful hardware and/or GPUs, [feel free to choose](https://docs.mozilla.ai/llamafile/getting-started/example_llamafiles)
 larger and more expressive models which should provide more accurate
 responses.
 
 **Windows users:** Rename the file to add `.exe` extension before running.
 
 ## Documentation
 
-Check the full documentation in the [docs/](docs/) folder, or directly jump into one of the following subsections:
-
-- [Quickstart](docs/quickstart.md)
-- [Example llamafiles](docs/example_llamafiles.md)
-- [Running a llamafile](docs/running_llamafile.md)
-- [Creating llamafiles](docs/creating_llamafiles.md)
-- [Source installation](docs/source_installation.md)
-- [Technical details](docs/technical_details.md)
-- [Supported Systems](docs/support.md)
-- [Troubleshooting](docs/troubleshooting.md)
-- [Whisperfile](docs/whisperfile/index.md)
+Check the full documentation at [docs.mozilla.ai/llamafile](https://docs.mozilla.ai/llamafile), or directly jump into one of the following subsections:
+
+- [Quickstart](https://docs.mozilla.ai/llamafile/getting-started/quickstart)
+- [Example llamafiles](https://docs.mozilla.ai/llamafile/getting-started/example_llamafiles)
+- [Running a llamafile](https://docs.mozilla.ai/llamafile/using-llamafile/running_llamafile)
+- [Creating llamafiles](https://docs.mozilla.ai/llamafile/using-llamafile/creating_llamafiles)
+- [Source installation](https://docs.mozilla.ai/llamafile/using-llamafile/source_installation)
+- [Technical details](https://docs.mozilla.ai/llamafile/reference/technical_details)
+- [Supported Systems](https://docs.mozilla.ai/llamafile/reference/support)
+- [Troubleshooting](https://docs.mozilla.ai/llamafile/reference/troubleshooting)
+- [Whisperfile](https://docs.mozilla.ai/llamafile/whisperfile)
 
 
 ## Licensing

README_0.10.0.md

Lines changed: 1 addition & 1 deletion
@@ -53,7 +53,7 @@ mode) are new.
 [20251218](https://github.com/mozilla-ai/llamafile/discussions/845)
 - added Metal support: GPU on MacOS ARM64 is supported by compiling a small module
   using the Xcode Command Line Tools, which need to be installed. Check our docs at
-  [docs/support.md#gpu-support](docs/support.md#gpu-support) for more info.
+  [our support docs](https://docs.mozilla.ai/llamafile/reference/support#gpu-support) for more info.
 - Metal works both in llamafile (called either as TUI or with the --server flag)
   and in llama-server.
docs/example_llamafiles.md

Lines changed: 14 additions & 14 deletions
@@ -1,24 +1,24 @@
 We provide example llamafiles for a variety of models, so you can easily try out llamafile
 with different kinds of LLMs. The following table lists llamafiles bundled with the latest
-available version of the server (v0.10.0). The smaller the file is, the more easily it will
+available version of the server (v0.10.*). The smaller the file is, the more easily it will
 run on your computer, even if no GPU is present (as a reference, Qwen3.5 0.8B Q8 generates
 text on a Raspberry Pi5 at ~8 tokens/sec).
 
 | Model | Size | License | llamafile |
 | --- | --- | --- | --- |
-| [Qwen3.5 0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) Q8_0 | 1.6 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-0.8B-Q8_0.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Qwen3.5-0.8B-Q8_0.llamafile) |
-| [Qwen3.5 2B](https://huggingface.co/Qwen/Qwen3.5-2B) Q8_0 | 3.2 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-2B-Q8_0.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Qwen3.5-2B-Q8_0.llamafile) |
-| [Ministral 3 3B Instruct 2512](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512) Q4_K_M | 3.4 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Ministral-3-3B-Instruct-2512-Q4_K_M.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Ministral-3-3B-Instruct-2512-Q4_K_M.llamafile) |
-| [Qwen3.5 4B](https://huggingface.co/Qwen/Qwen3.5-4B) Q5_K_S | 4.1 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-4B-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Qwen3.5-4B-Q5_K_S.llamafile) |
-| [llava v1.6 mistral 7b](https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b) Q4_K_M | 5.3 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [llava-v1.6-mistral-7b-Q4_K_M.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/llava-v1.6-mistral-7b-Q4_K_M.llamafile) |
-| [Apertus 8B Instruct 2509](https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509) | 5.9 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Apertus-8B-Instruct-2509.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Apertus-8B-Instruct-2509.llamafile) |
-| [Qwen3.5 9B](https://huggingface.co/Qwen/Qwen3.5-9B) Q5_K_S | 7.4 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-9B-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Qwen3.5-9B-Q5_K_S.llamafile) |
-| [Ministral 3 3B Instruct 2512](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512) BF16 | 7.8 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Ministral-3-3B-Instruct-2512-BF16.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Ministral-3-3B-Instruct-2512-BF16.llamafile) |
-| [llava v1.6 mistral 7b](https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b) Q8_0 | 8.4 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [llava-v1.6-mistral-7b-Q8_0.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/llava-v1.6-mistral-7b-Q8_0.llamafile) |
-| [gpt-oss 20b](https://huggingface.co/openai/gpt-oss-20b) mxfp4 | 12 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [gpt-oss-20b-mxfp4.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/gpt-oss-20b-mxfp4.llamafile) |
-| [gpt-oss 20b](https://huggingface.co/openai/gpt-oss-20b) Q5_K_S | 12 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [gpt-oss-20b-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/gpt-oss-20b-Q5_K_S.llamafile) |
-| [LFM2 24B A2B](https://huggingface.co/LiquidAI/LFM2-24B-A2B) Q5_K_M | 16 GB | [lfm1.0](https://huggingface.co/LiquidAI/LFM2-24B-A2B/blob/main/LICENSE) | [LFM2-24B-A2B-Q5_K_M.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/LFM2-24B-A2B-Q5_K_M.llamafile) |
-| [Qwen3.5 27B](https://huggingface.co/Qwen/Qwen3.5-27B) Q5_K_S | 19 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-27B-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10.0/resolve/main/Qwen3.5-27B-Q5_K_S.llamafile) |
+| [Qwen3.5 0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) Q8_0 | 1.6 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-0.8B-Q8_0.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-0.8B-Q8_0.llamafile) |
+| [Qwen3.5 2B](https://huggingface.co/Qwen/Qwen3.5-2B) Q8_0 | 3.2 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-2B-Q8_0.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-2B-Q8_0.llamafile) |
+| [Ministral 3 3B Instruct 2512](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512) Q4_K_M | 3.4 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Ministral-3-3B-Instruct-2512-Q4_K_M.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Ministral-3-3B-Instruct-2512-Q4_K_M.llamafile) |
+| [Qwen3.5 4B](https://huggingface.co/Qwen/Qwen3.5-4B) Q5_K_S | 4.1 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-4B-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-4B-Q5_K_S.llamafile) |
+| [llava v1.6 mistral 7b](https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b) Q4_K_M | 5.3 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [llava-v1.6-mistral-7b-Q4_K_M.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/llava-v1.6-mistral-7b-Q4_K_M.llamafile) |
+| [Apertus 8B Instruct 2509](https://huggingface.co/swiss-ai/Apertus-8B-Instruct-2509) | 5.9 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Apertus-8B-Instruct-2509.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Apertus-8B-Instruct-2509.llamafile) |
+| [Qwen3.5 9B](https://huggingface.co/Qwen/Qwen3.5-9B) Q5_K_S | 7.4 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-9B-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-9B-Q5_K_S.llamafile) |
+| [Ministral 3 3B Instruct 2512](https://huggingface.co/mistralai/Ministral-3-3B-Instruct-2512) BF16 | 7.8 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Ministral-3-3B-Instruct-2512-BF16.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Ministral-3-3B-Instruct-2512-BF16.llamafile) |
+| [llava v1.6 mistral 7b](https://huggingface.co/liuhaotian/llava-v1.6-mistral-7b) Q8_0 | 8.4 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [llava-v1.6-mistral-7b-Q8_0.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/llava-v1.6-mistral-7b-Q8_0.llamafile) |
+| [gpt-oss 20b](https://huggingface.co/openai/gpt-oss-20b) mxfp4 | 12 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [gpt-oss-20b-mxfp4.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/gpt-oss-20b-mxfp4.llamafile) |
+| [gpt-oss 20b](https://huggingface.co/openai/gpt-oss-20b) Q5_K_S | 12 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [gpt-oss-20b-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/gpt-oss-20b-Q5_K_S.llamafile) |
+| [LFM2 24B A2B](https://huggingface.co/LiquidAI/LFM2-24B-A2B) Q5_K_M | 16 GB | [lfm1.0](https://huggingface.co/LiquidAI/LFM2-24B-A2B/blob/main/LICENSE) | [LFM2-24B-A2B-Q5_K_M.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/LFM2-24B-A2B-Q5_K_M.llamafile) |
+| [Qwen3.5 27B](https://huggingface.co/Qwen/Qwen3.5-27B) Q5_K_S | 19 GB | [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) | [Qwen3.5-27B-Q5_K_S.llamafile](https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-27B-Q5_K_S.llamafile) |
 
 ## Legacy llamafiles
 
0 commit comments