Releases: mostlygeek/llama-swap

v173

17 Nov 18:49
86e9b93

This release includes a set of quality-of-life features for configuring and using llama-swap (a combined config sketch follows the feature list):

  • add JSON schema for configuration file (#393)
  • build, commit and version information in the UI (#395)
  • enable model aliases in v1/models (#400)
  • logTimeFormat: enable and configure timestamps in the proxy's log output (#401)
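
Taken together, a config using these features might look like the sketch below. This is only illustrative: the schema path and the logTimeFormat value are assumptions, so check config.example.yaml in the repo for the authoritative syntax.

    # yaml-language-server: $schema=./config.schema.json   # hypothetical schema path; enables editor validation (#393)
    logTimeFormat: "2006/01/02 15:04:05"                   # assumed Go-style layout string (#401)

    models:
      my-model:
        cmd: llama-server --port ${PORT} -m /models/my-model.gguf
        aliases:
          - my-model-alias   # aliases are now also listed by v1/models (#400)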

Shout out to @ryan-steed-usa and @nint8835 for their contributions to this release.

Changelog

  • 86e9b93 proxy,ui: add version endpoint and display version info in UI (#395)
  • 3acace8 proxy: add configurable logging timestamp format (#401)
  • 554d29e feat: enhance model listing to include aliases (#400)
  • 3567b7d Update image in README.md for web UI section
  • 3873852 config.example.yaml: add modeline for schema validation
  • c0fc858 Add configuration file JSON schema (#393)
  • b429349 add /ui/ to wol-proxy polling (#388)
  • eab2efd feat: improve llama.cpp base image tag for cpu (#391)
  • 6aedbe1 cmd/wol-proxy: show a loading page for / (#381)
  • b24467a fix: update containerfile user/group management commands (#379)

v172

03 Nov 13:33
12b69fb

Changelog

  • 12b69fb proxy: recover from panic in Process.statusUpdate (#378)
  • f91a8b2 refactor: update Containerfile to support non-root user execution and improve security (#368)

v171

29 Oct 07:12
a89b803

This release includes a unique feature that shows model-loading progress in the reasoning content. When enabled in the config, llama-swap streams a small amount of data so there is no silence while waiting for the model to swap and load.

  • Add a new global config setting: sendLoadingState: true
  • Add a new model override setting, model.sendLoadingState: true, to control it on a per-model basis (see the sketch below)
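
A minimal sketch of how the two settings might combine (the model definition around them is an assumption):

    sendLoadingState: true          # global: stream loading progress for all models

    models:
      big-model:
        cmd: llama-server --port ${PORT} -m /models/big-model.gguf
        sendLoadingState: false     # per-model override: disable it for this model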

Demo:

(demo video: llama-swap-issue-366.mp4)

Thanks to @ServeurpersoCom for the very cool idea!

Changelog

  • a89b803 Stream loading state when swapping models (#371)

v170

26 Oct 03:44
f852689

Fixes a bug where a panic() could cause llama-swap to lock up or exit. Updating is recommended.

Changelog

  • f852689 proxy: add panic recovery to Process.ProxyRequest (#363)

v169

26 Oct 00:41
e250e71

This update adds usage tracking for API calls made to POST /upstream/{model}/{api}. Chats in the llama-server UI now show up in the Activities tab, and any request to this endpoint that includes usage or timing information (infill, embeddings, etc.) will appear there as well.

Changelog

  • e250e71 Include metrics from upstream chat requests (#361)
  • d18dc26 cmd/wol-proxy: tweak logs to show what is causing wake ups (#356)

v168

24 Oct 05:25
8357714

Changelog

  • 8357714 ui: fix avg token/sec calculation on models page (#357)

Averages were replaced with percentiles and a histogram:

(screenshot: percentile and histogram view on the models page)

v167

21 Oct 03:57
c07179d

This release adds cmd/wol-proxy, a Wake-on-LAN proxy for llama-swap. If llama-swap lives on a server with high idle power draw that suspends after a period of inactivity, wol-proxy automatically wakes the server and then reverse-proxies requests to it.

A niche use case, but hopefully it saves a lot of energy otherwise wasted on idle GPUs.

Changelog

  • c07179d cmd/wol-proxy: add wol-proxy (#352)
  • 7ff5063 Update README for setup instructions clarity [skip ci]
  • 9fc0431 Clean up and Documentation (#347) [skip ci]

v166

16 Oct 02:35
6516532

This release adds support for TLS certificates, contributed by @dwrz!

To use it:

    ./llama-swap --tls-cert-file /path/to/cert.pem --tls-key-file /path/to/key.pem ...

Generating a self-signed certificate:

    openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes

Changelog

v165

11 Oct 19:19
5392783

Changelog

v164

07 Oct 06:00
00b738c

Changelog