Khoj tries to download a model from Hugginface with this local configuration, instead of using the local API.
curl -X POST http://localhost:8080/completion \
-H "Content-Type: application/json" \
-d '{
"prompt": "Napisz puuuuu",
"max_tokens": 32,
"temperature": 0.2
}'
{"index":0,"content":"uuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu","tokens":[],"id_slot":3,"stop":true,"model":"Bielik-4.5B-v3.0-Instruct.Q8_0.gguf","tokens_predicted":32,"tokens_evaluated":8,"generation_settings":{"seed":4294967295,"temperature":0.20000000298023224,"dynatemp_range":0.0,"dynatemp_exponent":1.0,"top_k":40,"top_p":0.949999988079071,"min_p":0.05000000074505806,"top_n_sigma":-1.0,"xtc_probability":0.0,"xtc_threshold":0.10000000149011612,"typical_p":1.0,"repeat_last_n":64,"repeat_penalty":1.0,"presence_penalty":0.0,"frequency_penalty":0.0,"dry_multiplier":0.0,"dry_base":1.75,"dry_allowed_length":2,"dry_penalty_last_n":4096,"dry_sequence_breakers":["\n",":","\"","*"],"mirostat":0,"mirostat_tau":5.0,"mirostat_eta":0.10000000149011612,"stop":[],"max_tokens":32,"n_predict":32,"n_keep":0,"n_discard":0,"ignore_eos":false,"stream":false,"logit_bias":[],"n_probs":0,"min_keep":0,"grammar":"","grammar_lazy":false,"grammar_triggers":[],"preserved_tokens":[],"chat_format":"Content-only","reasoning_format":"deepseek","reasoning_in_content":false,"thinking_forced_open":false,"samplers":["penalties","dry","top_n_sigma","top_k","typ_p","top_p","min_p","xtc","temperature"],"speculative.n_max":16,"speculative.n_min":0,"speculative.p_min":0.75,"timings_per_token":false,"post_sampling_probs":false,"lora":[]},"prompt":"<s>Napisz puuuuu","has_new_line":false,"truncated":false,"stop_type":"limit","stopping_word":"","tokens_cached":39,"timings":{"cache_n":1,"prompt_n":7,"prompt_ms":275.595,"prompt_per_token_ms":39.37071428571429,"prompt_per_second":25.399589978047494,"predicted_n":32,"predicted_ms":5506.741,"predicted_per_token_ms":172.08565625,"predicted_per_second":5.8110595722588005}}%
ValueError: No file found in
[server] | speakleash/Bielik-4.5B-v3.0-Instruct-
[server] | GGUF that match *Q4_K_M.gguf
If I save the same local model with a different name, without the slash /, it also doesn't work with a different error.
ValueError: not enough values to
[server] | unpack (expected 2, got 1)
FileNotFoundError:
[server] | speakleash/Bielik-4.5B-v3.0-Instruct.
[server] | Q8_0.gguf (repository not found)
Khoj should use the localhost:8080 as API instead of downloading the HF model.
Install on arch using podman compose up. Run a separate llama-cpp-server instance (I installed this through AUR) using model https://huggingface.co/speakleash/Bielik-4.5B-v3.0-Instruct-GGUF - llama-cpp-server -m local_path_model.gguf. Copy the open port and add the model to the Khoj configuration.
Server
Clients
OS
Khoj version
latest
Describe the bug
Khoj tries to download a model from Hugginface with this local configuration, instead of using the local API.
Current Behavior
This is the bielik configuration:
I used
adminAPI key because I had no clue what to put here, since it is a local self-hosted llama-cpp-server.I confirmed it works by doing:
Yet Khoj tries to download the online model from HF as evident from the logs:
If I save the same local model with a different name, without the slash /, it also doesn't work with a different error.
I tried again with this name, since it is a perfect match:
speakleash/Bielik-4.5B-v3.0-Instruct.Q8_0.ggufExpected Behavior
Khoj should use the localhost:8080 as API instead of downloading the HF model.
I followed the manual very closely:
Reproduction Steps
Install on arch using podman compose up. Run a separate llama-cpp-server instance (I installed this through AUR) using model https://huggingface.co/speakleash/Bielik-4.5B-v3.0-Instruct-GGUF -
llama-cpp-server -m local_path_model.gguf. Copy the open port and add the model to the Khoj configuration.Possible Workaround
No response
Additional Information
No response
Link to Discord or Github discussion
No response