Describe the feature
Since the PRs linked below in llama.cpp, it is possible to hot-swap LoRA adapters. This makes it possible to personalise NPCs and other GenAI game features by fine-tuning adapters that can then be quickly swapped on the same base model in memory.
Since llama.cpp is being updated in #209, it would be nice to check how easily this feature can be integrated.
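For reference, a minimal sketch of how hot-swapping works through the llama.cpp server, based on the PRs linked below: the server is started with one or more `--lora` adapters, and the `/lora-adapters` endpoint is used to change their scales at runtime without reloading the base model. The host/port, adapter paths and payload shape here are assumptions taken from the upstream PRs, not code from this repo.

```python
# Rough sketch of LoRA hot-swapping via the llama.cpp server (assumptions noted above).
# Assumes a server started e.g. with:
#   llama-server -m base_model.gguf --lora npc_guard.gguf --lora npc_merchant.gguf
import requests

SERVER = "http://localhost:8080"  # assumed default llama-server address

# List the adapters the server was started with (id + current scale).
print(requests.get(f"{SERVER}/lora-adapters").json())

# Enable the first adapter and disable the second by setting their scales;
# subsequent /completion requests then use base model + adapter 0.
requests.post(
    f"{SERVER}/lora-adapters",
    json=[{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}],
)
```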
Todo list
- `bin` files for adapters are now deprecated in favour of the new `gguf` files; the documentation should be amended wherever they appear.
- Add to the documentation a link on how to convert adapters to `gguf`.
- Add an example of performing a hot-swap.
- A test using the new `gguf` format should be run (I suspect the new adapters will automatically be used in hot-swapping mode).
- Test using multiple adapters and hot-swapping them, add code if needed, and add an example in the examples folder (see the sketch after this list for a possible starting point).
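As a starting point for the last items, a hedged sketch of what a multi-adapter hot-swap example could look like: two PEFT LoRA adapters are converted to `gguf` with llama.cpp's conversion script (script name and flags as in upstream llama.cpp at the time of the linked PRs; verify against the current repo), the server is launched with both, and each request switches the active adapter by adjusting scales. Adapter names, paths and prompts are purely illustrative.

```python
# Illustrative hot-swap between two LoRA adapters (names/paths are hypothetical).
#
# 1) Convert PEFT adapters to GGUF (command per upstream llama.cpp; verify locally):
#      python convert_lora_to_gguf.py --base <base_hf_model_dir> npc_guard_lora --outfile npc_guard.gguf
#      python convert_lora_to_gguf.py --base <base_hf_model_dir> npc_merchant_lora --outfile npc_merchant.gguf
# 2) Start the server with both adapters loaded but initially disabled:
#      llama-server -m base_model.gguf --lora-scaled npc_guard.gguf 0.0 --lora-scaled npc_merchant.gguf 0.0
import requests

SERVER = "http://localhost:8080"  # assumed default llama-server address


def activate(adapter_id: int, total: int = 2) -> None:
    """Set the chosen adapter's scale to 1.0 and all others to 0.0."""
    payload = [{"id": i, "scale": 1.0 if i == adapter_id else 0.0} for i in range(total)]
    requests.post(f"{SERVER}/lora-adapters", json=payload)


def complete(prompt: str) -> str:
    """Plain completion request against the server."""
    r = requests.post(f"{SERVER}/completion", json={"prompt": prompt, "n_predict": 64})
    return r.json()["content"]


# Swap between the two NPC personas on the same base model in memory.
activate(0)
print("guard:", complete("Player: Who goes there?\nGuard:"))

activate(1)
print("merchant:", complete("Player: What are you selling?\nMerchant:"))
```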
Related links
Hot-swap LoRA PRs in llama.cpp:
- CLI: Refactor lora adapter support ggml-org/llama.cpp#8332
- Server: server : add lora hotswap endpoint ggml-org/llama.cpp#8857
Discord threads: