Description
Describe the bug
Using the latest version, I cannot use my LocalAI endpoints at all.
I first carried over a configuration from an older version, then completely reset the settings and added only my LocalAI endpoint (trying both keeping and deleting the configuration for the other APIs). Whatever I do, I keep getting:
```
ERROR OPENAI_API_KEY required; set the environment variable OPENAI_API_KEY or update mods.yaml through mods --settings.
```
I have tried:
- Actually setting an `OPENAI_API_KEY` via config/settings (it should not be required when using LocalAI).
- Completely deleting the OpenAI API section (this is what my old config looked like).
- Performing a fresh install on another machine that never had mods, adding only my LocalAI endpoint and model, and setting the default model. In this case I tried keeping all APIs in the config, keeping only openai and localai, and keeping only localai.
- Setting a dummy or a real environment variable for `OPENAI_API_KEY`.

The behavior is the same regardless of command. If I go with `mods -M`, I am able to select my model and type a prompt, but am later presented with that error again (see attached GIF).
Setup
- Ubuntu
- zsh or bash
- alacritty or ssh
Source Code
Config file:

```yaml
# Default model (gpt-3.5-turbo, gpt-4, ggml-gpt4all-j...).
default-model: solar
# Text to append when using the -f flag.
format-text:
  markdown: 'Format the response as markdown without enclosing backticks.'
  json: 'Format the response as json without enclosing backticks.'
# List of predefined system messages that can be used as roles.
roles:
  "default": []
  # Example, a role called `shell`:
  # shell:
  #   - you are a shell expert
  #   - you do not explain anything
  #   - you simply output one liners to solve the problems you're asked
  #   - you do not provide any explanation whatsoever, ONLY the command
# Ask for the response to be formatted as markdown unless otherwise set.
format: false
# System role to use.
role: "default"
# Render output as raw text when connected to a TTY.
raw: false
# Quiet mode (hide the spinner while loading and stderr messages for success).
quiet: false
# Temperature (randomness) of results, from 0.0 to 2.0.
temp: 1.0
# TopP, an alternative to temperature that narrows response, from 0.0 to 1.0.
topp: 1.0
# Turn off the client-side limit on the size of the input into the model.
no-limit: false
# Wrap formatted output at specific width (default is 80)
word-wrap: 80
# Include the prompt from the arguments in the response.
include-prompt-args: false
# Include the prompt from the arguments and stdin, truncate stdin to specified number of lines.
include-prompt: 0
# Maximum number of times to retry API calls.
max-retries: 5
# Your desired level of fanciness.
fanciness: 10
# Text to show while generating.
status-text: Generating
# Theme to use in the forms. Valid units are: 'charm', 'catppuccin', 'dracula', and 'base16'
theme: charm
# Default character limit on input to model.
max-input-chars: 12250
# Maximum number of tokens in response.
# max-tokens: 100
# Aliases and endpoints for OpenAI compatible REST API.
apis:
  localai:
    # LocalAI setup instructions: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model
    base-url: http://10.13.37.25:8080
    models:
      solar-10.7b-instruct-v1.0.Q5_K_M.gguf:
        aliases: ["solar"]
        max-input-chars: 12250
        fallback:
```
Alternatively, a config identical to the one above except for the `apis` section:

```yaml
apis:
  openai:
    base-url: https://api.openai.com/v1
    api-key:
    api-key-env: API_KEY_IS_HERE_REDACTED
    models: # https://platform.openai.com/docs/models
      gpt-4o-mini:
        aliases: ["4o-mini"]
        max-input-chars: 392000
        fallback: gpt-4o
      gpt-4o:
        aliases: ["4o"]
        max-input-chars: 392000
        fallback: gpt-4
      gpt-4:
        aliases: ["4"]
        max-input-chars: 24500
        fallback: gpt-3.5-turbo
      gpt-4-1106-preview:
        aliases: ["128k"]
        max-input-chars: 392000
        fallback: gpt-4
      gpt-4-32k:
        aliases: ["32k"]
        max-input-chars: 98000
        fallback: gpt-4
      gpt-3.5-turbo:
        aliases: ["35t"]
        max-input-chars: 12250
        fallback: gpt-3.5
      gpt-3.5-turbo-1106:
        aliases: ["35t-1106"]
        max-input-chars: 12250
        fallback: gpt-3.5-turbo
      gpt-3.5-turbo-16k:
        aliases: ["35t16k"]
        max-input-chars: 44500
        fallback: gpt-3.5
      gpt-3.5:
        aliases: ["35"]
        max-input-chars: 12250
        fallback:
  localai:
    # LocalAI setup instructions: https://github.com/go-skynet/LocalAI#example-use-gpt4all-j-model
    base-url: http://10.13.37.25:8080
    models:
      solar-10.7b-instruct-v1.0.Q5_K_M.gguf:
        aliases: ["solar"]
        max-input-chars: 12250
        fallback:
```
Expected behavior
`OPENAI_API_KEY` should be ignored, whether it is set or not, when using LocalAI as the API.
Screenshots
Behavior as per the second config file:
Additional context
I'm not sure if I am missing something, but even after generating a fresh config and looking at the code, I see two issues:
- There is no case for `localai` (it defaults to `openai`).
- Even when `OPENAI_API_KEY` is set, it is not detected?
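To illustrate the first point, here is a minimal, purely hypothetical Go sketch (not the actual mods source; the function and names are invented) of how a switch over the configured API name with no `localai` case would funnel every unlisted API into the OpenAI client, which is the path that then insists on the key:

```go
package main

import "fmt"

// clientKind is a hypothetical stand-in for mods' API-selection logic.
// Because there is no explicit "localai" case, the default branch is
// taken, and "localai" ends up treated like "openai".
func clientKind(api string) string {
	switch api {
	case "openai":
		return "openai" // this path requires OPENAI_API_KEY
	// note: no `case "localai":` here, so...
	default:
		return "openai" // ...localai falls through to the OpenAI client
	}
}

func main() {
	// Even though the config points at a LocalAI base-url, the selected
	// client is the OpenAI one, which demands OPENAI_API_KEY.
	fmt.Println(clientKind("localai"))
}
```

If this matches what the real code does, it would explain why the error appears regardless of which API is selected in the config.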