Releases: mostlygeek/llama-swap

v163

05 Oct 03:06
70930e4

This release includes two new features:

  • model macros (#330): macros can now be defined as part of a model's configuration; these take precedence over macros defined at the global level.
  • model metadata (#333): metadata can now be defined in a model's configuration. This is a schema-less object that supports integers, floats, booleans, strings, arrays, and child objects. Metadata fields also support macro substitution. Metadata is exposed only through the v1/models endpoint, under a new JSON key: meta.llamaswap.

Other smaller changes:

  • macro values can now be integers, floats, booleans, or strings. This makes JSON encoding of metadata containing macros behave as expected; previously, macro values could only be strings.
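The two features above can be sketched in a config like the following. The key spellings (a model-level macros: block and a metadata: block) follow the release notes, while the model name, file paths, and metadata fields are illustrative assumptions only:

```yaml
macros:
  # global macro, available to all models
  llama-server: /usr/local/bin/llama-server --port ${PORT}

models:
  "qwen-coder":
    # model-level macros (#330) take precedence over global ones
    macros:
      ctx-size: 32768   # macro values may now be ints, floats, or bools, not just strings
    cmd: |
      ${llama-server}
      -m /models/qwen-coder.gguf
      -c ${ctx-size}
    # schema-less metadata (#333), surfaced by v1/models under "meta.llamaswap";
    # fields support macro substitution
    metadata:
      context_length: ${ctx-size}
      quantization: "Q4_K_M"
      tags: ["coding", "local"]
```

With a config like this, a GET on v1/models would list the model with its metadata object attached under the meta.llamaswap key, with ${ctx-size} substituted as the integer 32768.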

Changelog

  • 70930e4 proxy: add support for user defined metadata in model configs (#333)
  • 1f61791 proxy/config: add model level macros (#330)
  • 216c40b proxy/config: create config package and migrate configuration (#329)

v162

25 Sep 23:50
9e3d491

Changelog

  • 9e3d491 proxyToUpstream: add redirect with trailing slash to upstream endpoint (#322)

v161

25 Sep 04:26
1a84926

Changelog

v160

19 Sep 18:10
fc3bb71

Changelog

v159

13 Sep 20:39
c36986f

Changelog

  • c36986f upstream handler support for model names with forward slash (#298)
  • 558801d Fix nginx proxy buffering for streaming endpoints (#295)
  • b21dee2 Fix #288 Vite hot module reloading creating multiple SSE connections (#290)

v158

06 Sep 21:03
f58c8c8

Changelog

  • f58c8c8 Support llama.cpp's cache_n in timings info (#287)
  • 954e2de Remove cmdStart from README [skip ci]

v157

02 Sep 04:29
a533aec

Changelog

v156

29 Aug 05:09
831a90d

Changelog

  • 831a90d Add different timeout scenarios to Process.checkHealthEndpoint #276 (#278)
  • 977f185 add /completion endpoint (#275)
  • 52b329f Fix #277 race condition in ProcessGroup.ProxyRequest when swap=true

v155

27 Aug 15:41
57803fd

Changelog

  • 57803fd Support llama-server's /infill endpoint (#272)
  • c55d0cc Add docs for model.concurrencyLimit #263 [skip ci]

v154

20 Aug 21:10
7acbaf4

Changelog