Releases: mostlygeek/llama-swap
v77
v76
v75
v0.1.5
v0.1.4
v0.1.3
v0.1.2
!! Note: breaking change in this commit if you're using profiles !!
- Model names with a profile have changed from `profile/model` to `profile:model`. The `/` was swapped to a `:`.
- Example: `coding/qwen-2.5-coder-32B` is now `coding:qwen-2.5-coder-32B`.
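If client scripts still send old-style names, a small helper can rewrite them before each request. The following is a minimal sketch, not part of llama-swap: `migrate_model_name` is a hypothetical function, and it assumes the profile is everything before the first `/` in the name. Apply it only to names you know were profile-qualified.

```python
def migrate_model_name(name: str) -> str:
    """Rewrite an old-style 'profile/model' name as 'profile:model'.

    Hypothetical helper for illustration. Assumes the profile is the
    text before the first '/'. Names without a '/' (or with an empty
    profile or model part) are returned unchanged.
    """
    profile, sep, model = name.partition("/")
    if not sep or not profile or not model:
        return name
    return f"{profile}:{model}"


if __name__ == "__main__":
    # The example from the release note above.
    print(migrate_model_name("coding/qwen-2.5-coder-32B"))
```

Only the first `/` is replaced, so any later slashes in the model part are left as they are.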
Changelog
v0.1.1
v0.1.0
Changelog
- 73ad85e Implement Multi-Process Handling (#7)
- 533162c add support for automatically unloading a model (#10) (#14)
- ba39ed4 Add support for legacy v1/completions API (#12)
- 21f54f9 Merge pull request #13 from mostlygeek/set-content-length
- 7eec51f Dechunk HTTP requests by default (#11)
- 5021e0f remove the process handler override
- c9233d2 use gin instead of standard http lib in main
- a33ac6f update README
- 401aa88 move log handlers to separate file
- e9e88fd rename proxy.go to proxymanager.go
- c3b4bb1 use gin for http server
- e5c909d add tests for proxy.Process
- 36a31f4 add proxy.Process to manage upstream proxy logic
- a8e5ee1 Add logging with pipes example to README