Releases: mostlygeek/llama-swap
v173
This release includes a set of quality-of-life features for configuring and using llama-swap:
- add JSON schema for configuration file (#393)
- build, commit and version information in the UI (#395)
- enable model aliases in v1/models (#400)
- logTimeFormat: enable and set timestamps in the proxy's log output (#401)
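A sketch of how these options might sit together in a config file. Only `logTimeFormat` is named in this release; the schema modeline path, model name, and `aliases` key shape are illustrative assumptions, not confirmed syntax:

```yaml
# yaml-language-server: $schema=<path-to-llama-swap-schema>  # illustrative modeline for editor schema validation

logTimeFormat: "2006-01-02 15:04:05"   # assumed Go-style time layout

models:
  "llama3":
    cmd: llama-server --port ${PORT} -m /models/llama3.gguf
    aliases:
      - "my-alias"   # with #400, aliases also appear in v1/models
```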
Shout out to @ryan-steed-usa and @nint8835 for their contributions to this release.
Changelog
- 86e9b93 proxy,ui: add version endpoint and display version info in UI (#395)
- 3acace8 proxy: add configurable logging timestamp format (#401)
- 554d29e feat: enhance model listing to include aliases (#400)
- 3567b7d Update image in README.md for web UI section
- 3873852 config.example.yaml: add modeline for schema validation
- c0fc858 Add configuration file JSON schema (#393)
- b429349 add /ui/ to wol-proxy polling (#388)
- eab2efd feat: improve llama.cpp base image tag for cpu (#391)
- 6aedbe1 cmd/wol-proxy: show a loading page for / (#381)
- b24467a fix: update containerfile user/group management commands (#379)
v172
v171
This release includes a unique feature that shows model loading progress in the Reasoning content. When enabled in the config, llama-swap streams a small amount of data so there is no silence while waiting for the model to swap and load.
- Add a new global config setting: sendLoadingState: true
- Add a new model override setting: model.sendLoadingState: true to control it on a per-model basis
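A minimal config sketch combining the two settings above. The model name and command are placeholders; only the `sendLoadingState` keys come from this release:

```yaml
# Global default: stream loading progress for all models.
sendLoadingState: true

models:
  "qwen3":   # illustrative model entry
    cmd: llama-server --port ${PORT} -m /models/qwen3.gguf
    sendLoadingState: false   # per-model override disables it here
```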
Demo:
llama-swap-issue-366.mp4
Thanks to @ServeurpersoCom for the very cool idea!
Changelog
v170
v169
This update adds usage tracking for API calls made to POST /upstream/{model}/{api}. Chats in the llama-server UI now show up in the Activities tab, and any request to this endpoint that includes usage or timing info (infill, embeddings, etc.) will appear there as well.
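For example, a request like the following would now be counted in the Activities tab. The host, port, model name, and payload are illustrative; only the /upstream/{model}/{api} path shape comes from the release note:

```shell
# Placeholder model name from your own config; adjust host/port to your setup.
MODEL=qwen3
API=v1/chat/completions
curl -s "http://localhost:8080/upstream/${MODEL}/${API}" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "hello"}]}'
```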
Changelog
v168
v167
This release adds cmd/wol-proxy, a Wake-on-LAN proxy for llama-swap. If llama-swap runs on a server with high idle power draw that suspends after a period of inactivity, wol-proxy will automatically wake the server and then reverse proxy requests to it.
A niche use case, but hopefully it saves a lot of energy otherwise wasted by idle GPUs.
Changelog
v166
This release includes support for TLS certificates from contributor @dwrz!
To use it:
```shell
./llama-swap --tls-cert-file /path/to/cert.pem --tls-key-file /path/to/key.pem ...
```
To generate a self-signed certificate:
```shell
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes
```

