Description
Hi!
I've run into an issue when trying to use chat mode. Every time I try to chat, no matter how big or small my code base is, it hangs forever at "Processing". Meanwhile, search mode works fine.
The only thing that consistently accompanies the problem is this log output:
llama_init_from_model: n_batch is less than GGML_KQ_MASK_PAD - increasing to 32
llama_init_from_model: n_ctx_per_seq (512) < n_ctx_train (16384) -- the full capacity of the model will not be utilized
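For reference, the second warning suggests the chat backend is being created with the default context of 512 tokens, far below the model's trained context of 16384. Here is a minimal sketch of how I'd expect a larger context to be requested if the backend goes through llama-cpp-python; the model path, parameter values, and the assumption that this project uses llama-cpp-python at all are mine, since I don't know its actual config entry point:

```python
from llama_cpp import Llama

# Hypothetical sketch: raise n_ctx to the model's trained context instead of
# the 512 default, and keep n_batch above GGML_KQ_MASK_PAD so the first
# warning doesn't trigger either.
llm = Llama(
    model_path="path/to/model.gguf",  # placeholder path
    n_ctx=16384,      # match n_ctx_train from the log
    n_batch=512,      # well above GGML_KQ_MASK_PAD
    n_gpu_layers=-1,  # offload all layers to the RTX 3060 if VRAM allows
)

# A short chat-style completion to check that a response actually comes back.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the App class in one sentence."}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```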
To isolate the problem, I created an empty project containing a single Go file with an empty App class. Search mode works fine and responds quickly and correctly, but in chat mode I never get any response and can't understand what exactly is happening.
Environment: Windows 11 with WSL, all libraries installed correctly (including faiss-gpu), RTX 3060, i5-12400F (works very well with Stable Diffusion).