Releases: mostlygeek/llama-swap

v0.1.5

10 Dec 03:12
5fbd53c

Changelog

  • 5fbd53c delay TTL check until after all requests are complete (#25)

v0.1.4

09 Dec 05:39
97dae50

Changelog

  • 97dae50 update readme
  • cb978f7 add web interface to /logs
  • 387f0ef use new timings data in server response in run-benchmark.sh

v0.1.3

04 Dec 00:01
18c1346

Changelog

  • 18c1346 Add Access-Control-Allow-Origin CORS header to /v1/models endpoint
  • da2326b add example: optimizing code generation
  • da46545 fix profile example in README

v0.1.2

01 Dec 17:12
04b4760

!! Note breaking change in this commit if you're using profiles !!

  • Model names with a profile have changed from profile/model to profile:model (the / was replaced with a :).
  • For example, coding/qwen-2.5-coder-32B is now coding:qwen-2.5-coder-32B.
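
For clients that still send the old names, the rename above could be applied with a small helper like the following. This is only a sketch: migrateModelName is a hypothetical function, not part of llama-swap, and it assumes the input is known to be a profile-qualified name (a plain model name that happens to contain a / would also be rewritten).

```go
package main

import (
	"fmt"
	"strings"
)

// migrateModelName is a hypothetical helper (not part of llama-swap).
// It rewrites the pre-v0.1.2 "profile/model" form to "profile:model".
// Only the first "/" is treated as the profile separator; names without
// a "/" or with an empty profile or model part are returned unchanged.
func migrateModelName(name string) string {
	profile, model, found := strings.Cut(name, "/")
	if !found || profile == "" || model == "" {
		return name
	}
	return profile + ":" + model
}

func main() {
	fmt.Println(migrateModelName("coding/qwen-2.5-coder-32B"))
	// prints "coding:qwen-2.5-coder-32B"
}
```

Names that already use the : separator pass through untouched, so the helper should be safe to run more than once over the same list.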

Changelog

v0.1.1

30 Nov 23:26
cf82b3c

Changelog

v0.1.0

24 Nov 03:47
73ad85e

Changelog

  • 73ad85e Implement Multi-Process Handling (#7)
  • 533162c add support for automatically unloading a model (#10) (#14)
  • ba39ed4 Add support for legacy v1/completions API (#12)
  • 21f54f9 Merge pull request #13 from mostlygeek/set-content-length
  • 7eec51f Dechunk HTTP requests by default (#11)
  • 5021e0f remove the process handler override
  • c9233d2 use gin instead of standard http lib in main
  • a33ac6f update README
  • 401aa88 move log handlers to separate file
  • e9e88fd rename proxy.go to proxymanager.go
  • c3b4bb1 use gin for http server
  • e5c909d add tests for proxy.Process
  • 36a31f4 add proxy.Process to manage upstream proxy logic
  • a8e5ee1 Add logging with pipes example to README

v0.0.10

10 Nov 04:22
5944a86

Changelog

v0.0.9

02 Nov 17:44
63d4a7d

Changelog

  • 63d4a7d Improve LogMonitor to handle empty writes and ensure buffer immutability

v0.0.8

01 Nov 22:29
f45469f

Changelog

  • f45469f Merge pull request #8 from mostlygeek/improve-upstream-monitoring-issue5
  • 34f9fd7 Improve timeout and exit handling of child processes. fix #3 and #5
  • 8448efa revise health check logic to not error on 5 second timeout

v0.0.7

31 Oct 19:23
8cf2a38

Changelog

  • 8cf2a38 Refactor log implementation
  • 0f133f5 Add /logs endpoint to monitor upstream processes
  • 1510b3f clean up README
  • 0f8a8e7 add header image