Describe the feature
Since the PRs linked below in llama.cpp, it is possible to hot-swap LoRA adapters. This makes it possible to personalise NPCs and other GenAI game features by fine-tuning adapters that can then be quickly swapped on the same base model in memory.
Since llama.cpp is being updated in #209, it would be nice to check how easily this feature can be integrated.
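For reference, a minimal sketch of how hot-swapping works through the llama.cpp server, based on the PRs linked below: the server is started with one or more `--lora` adapters, and the `/lora-adapters` endpoint is used to change their scales at runtime without reloading the base model. The host/port, adapter paths and payload shape here are assumptions taken from the upstream PRs, not code from this repo.

```python
# Rough sketch of LoRA hot-swapping via the llama.cpp server (assumptions noted above).
# Assumes a server started e.g. with:
#   llama-server -m base_model.gguf --lora npc_guard.gguf --lora npc_merchant.gguf
import requests

SERVER = "http://localhost:8080"  # assumed default llama-server address

# List the adapters the server was started with (id + current scale).
print(requests.get(f"{SERVER}/lora-adapters").json())

# Enable the first adapter and disable the second by setting their scales;
# subsequent /completion requests then use base model + adapter 0.
requests.post(
    f"{SERVER}/lora-adapters",
    json=[{"id": 0, "scale": 1.0}, {"id": 1, "scale": 0.0}],
)
```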
Todo list
- `bin` files for adapters are now deprecated in favour of the new `gguf` files; the documentation should be amended wherever they appear.
- Add to the documentation a link on how to convert adapters to `gguf`.
- Add an example of performing a hot-swap.
- A test using the new `gguf` format should be run (I suspect the new adapters will automatically be used in hot-swapping mode).
- Test using multiple adapters and hot-swapping them, add code if needed, and add an example in the examples folder (see the sketch after this list for a possible starting point).
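As a starting point for the last items, a hedged sketch of what a multi-adapter hot-swap example could look like: two PEFT LoRA adapters are converted to `gguf` with llama.cpp's conversion script (script name and flags as in upstream llama.cpp at the time of the linked PRs; verify against the current repo), the server is launched with both, and each request switches the active adapter by adjusting scales. Adapter names, paths and prompts are purely illustrative.

```python
# Illustrative hot-swap between two LoRA adapters (names/paths are hypothetical).
#
# 1) Convert PEFT adapters to GGUF (command per upstream llama.cpp; verify locally):
#      python convert_lora_to_gguf.py --base <base_hf_model_dir> npc_guard_lora --outfile npc_guard.gguf
#      python convert_lora_to_gguf.py --base <base_hf_model_dir> npc_merchant_lora --outfile npc_merchant.gguf
# 2) Start the server with both adapters loaded but initially disabled:
#      llama-server -m base_model.gguf --lora-scaled npc_guard.gguf 0.0 --lora-scaled npc_merchant.gguf 0.0
import requests

SERVER = "http://localhost:8080"  # assumed default llama-server address


def activate(adapter_id: int, total: int = 2) -> None:
    """Set the chosen adapter's scale to 1.0 and all others to 0.0."""
    payload = [{"id": i, "scale": 1.0 if i == adapter_id else 0.0} for i in range(total)]
    requests.post(f"{SERVER}/lora-adapters", json=payload)


def complete(prompt: str) -> str:
    """Plain completion request against the server."""
    r = requests.post(f"{SERVER}/completion", json={"prompt": prompt, "n_predict": 64})
    return r.json()["content"]


# Swap between the two NPC personas on the same base model in memory.
activate(0)
print("guard:", complete("Player: Who goes there?\nGuard:"))

activate(1)
print("merchant:", complete("Player: What are you selling?\nMerchant:"))
```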
Related links
Hot-swap LoRA PRs in llama.cpp:
- CLI: Refactor lora adapter support ggml-org/llama.cpp#8332
- Server: server : add lora hotswap endpoint ggml-org/llama.cpp#8857
Discord threads: