
Commit 3d83128

feat(alias): alias llama to llama-cpp, update docs (#1448)

Authored by Ettore Di Giacinto
Signed-off-by: Ettore Di Giacinto <[email protected]>

1 parent 1c286c3 · commit 3d83128

3 files changed: +15 −4 lines changed
docs/content/model-compatibility/_index.en.md

Lines changed: 2 additions & 0 deletions
@@ -50,6 +50,8 @@ Besides llama based models, LocalAI is compatible also with other architectures.
 | `diffusers` | SD,... | no | Image generation | no | no | N/A |
 | `vall-e-x` | Vall-E | no | Audio generation and Voice cloning | no | no | CPU/CUDA |
 | `vllm` | Various GPTs and quantization formats | yes | GPT | no | no | CPU/CUDA |
+| `exllama2` | GPTQ | yes | GPT only | no | no | N/A |
+| `transformers-musicgen` | | no | Audio generation | no | no | N/A |
 
 Note: any backend name listed above can be used in the `backend` field of the model configuration file (See [the advanced section]({{%relref "advanced" %}})).

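To make that note concrete, a model configuration file selecting one of the newly listed backends might look roughly like the sketch below; the `name` and `model` values are hypothetical placeholders, not taken from this commit:

```yaml
# Hypothetical model configuration; only the `backend` value
# comes from the compatibility table above.
name: my-gptq-model
backend: exllama2
parameters:
  # Relative to the models path
  model: my-model-gptq
```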
docs/content/model-compatibility/llama-cpp.md

Lines changed: 3 additions & 3 deletions
@@ -9,7 +9,7 @@ weight = 1
 
 {{% notice note %}}
 
-The `ggml` file format has been deprecated. If you are using `ggml` models and you are configuring your model with a YAML file, use the `llama-stable` backend instead. If you are relying on automatic detection of the model, you should be fine. For `gguf` models, use the `llama` backend.
+The `ggml` file format has been deprecated. If you are using `ggml` models and you are configuring your model with a YAML file, use the `llama-ggml` backend instead. If you are relying on automatic detection of the model, you should be fine. For `gguf` models, use the `llama` backend. The go backend is deprecated as well, but is still available as `go-llama`; it still supports features not available in the mainline backend: speculative sampling and embeddings.
 
 {{% /notice %}}
 
@@ -65,11 +65,11 @@ parameters:
 
 In the example above we specify `llama` as the backend to restrict loading to `gguf` models only.
 
-For instance, to use the `llama-stable` backend for `ggml` models:
+For instance, to use the `llama-ggml` backend for `ggml` models:
 
 ```yaml
 name: llama
-backend: llama-stable
+backend: llama-ggml
 parameters:
   # Relative to the models path
   model: file.ggml.bin
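For reference, the `gguf` configuration that the docs call "the example above" (not shown in this hunk) would look roughly like this sketch, reconstructed from the surrounding text; `file.gguf` is a placeholder filename:

```yaml
name: llama
backend: llama
parameters:
  # Relative to the models path
  model: file.gguf
```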

pkg/model/initializers.go

Lines changed: 10 additions & 1 deletion
@@ -14,6 +14,11 @@ import (
 	"github.com/rs/zerolog/log"
 )
 
+var Aliases map[string]string = map[string]string{
+	"go-llama": GoLlamaBackend,
+	"llama":    LLamaCPP,
+}
+
 const (
 	GoLlamaBackend = "llama"
 	LlamaGGML      = "llama-ggml"
@@ -169,9 +174,13 @@ func (ml *ModelLoader) resolveAddress(addr ModelAddress, parallel bool) (*grpc.C
 func (ml *ModelLoader) BackendLoader(opts ...Option) (client *grpc.Client, err error) {
 	o := NewOptions(opts...)
 
-	log.Debug().Msgf("Loading model %s from %s", o.backendString, o.model)
+	log.Info().Msgf("Loading model '%s' with backend %s", o.model, o.backendString)
 
 	backend := strings.ToLower(o.backendString)
+	if realBackend, exists := Aliases[backend]; exists {
+		log.Debug().Msgf("%s is an alias of %s", backend, realBackend)
+		backend = realBackend
+	}
 
 	if o.singleActiveBackend {
 		ml.mu.Lock()
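Taken together, the new `Aliases` map and the lookup in `BackendLoader` behave like the self-contained sketch below. The value of the `LLamaCPP` constant is not shown in this diff, so `"llama-cpp"` here is an assumption based on the commit title:

```go
package main

import (
	"fmt"
	"strings"
)

// Mirrors the constants in pkg/model/initializers.go. Only GoLlamaBackend
// appears with its value in this diff; the LLamaCPP value is assumed.
const (
	GoLlamaBackend = "llama"
	LLamaCPP       = "llama-cpp" // assumed from the commit title
)

// Mirrors the Aliases map added by this commit.
var Aliases = map[string]string{
	"go-llama": GoLlamaBackend,
	"llama":    LLamaCPP,
}

// resolveBackend reproduces the alias lookup performed in BackendLoader:
// lowercase the requested backend, then substitute its target if aliased.
func resolveBackend(backendString string) string {
	backend := strings.ToLower(backendString)
	if realBackend, exists := Aliases[backend]; exists {
		backend = realBackend
	}
	return backend
}

func main() {
	fmt.Println(resolveBackend("llama"))    // "llama-cpp": gguf models route to the C++ backend
	fmt.Println(resolveBackend("go-llama")) // "llama": the deprecated go backend stays reachable
	fmt.Println(resolveBackend("vllm"))     // "vllm": no alias, passes through unchanged
}
```

In other words, after this commit a config that says `backend: llama` is transparently routed to `llama-cpp`, while `go-llama` remains available for the deprecated go implementation.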
