Name and Version
version: 4526 (a94f3b2)
built with cc (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0 for x86_64-linux-gnu
Not sure when this started, but previously, when using llama-cli with --log-disable, I would get the response printed without the other verbose info.
Now, when used with --log-disable, no response is printed in the terminal at all.
Example:
llama-cli -m '/mnt/disk2/LLM_MODELS/models/Phi-3.5-mini-instruct-Q5_K_M.gguf' -p "Write short joke." -ngl 99 -no-cnv --log-disable
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
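A possible workaround, assuming the verbose info goes to stderr while the generated text goes to stdout (I have not confirmed this across builds), is to redirect stderr instead of using --log-disable:
llama-cli -m '/mnt/disk2/LLM_MODELS/models/Phi-3.5-mini-instruct-Q5_K_M.gguf' -p "Write short joke." -ngl 99 -no-cnv 2>/dev/null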
Additionally, interactive mode is always on, even without passing the -i parameter. I need to pass -no-cnv to disable it, as shown in the comparison below. For example, this:
llama-cli -m '/mnt/disk2/LLM_MODELS/models/Phi-3.5-mini-instruct-Q5_K_M.gguf' -p "Write short joke." -ngl 99
puts it in interactive mode ('main: interactive mode on.' is visible in the output).
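For comparison, adding -no-cnv (the short form of --no-conversation, if I read the options correctly) restores the old single-shot behavior, where the response is printed and the program exits:
llama-cli -m '/mnt/disk2/LLM_MODELS/models/Phi-3.5-mini-instruct-Q5_K_M.gguf' -p "Write short joke." -ngl 99 -no-cnv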
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-cli
Command line
Problem description & steps to reproduce
llama-cli -m '/mnt/disk2/LLM_MODELS/models/Phi-3.5-mini-instruct-Q5_K_M.gguf' -p "Write short joke." -ngl 99 -no-cnv --log-disable
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
The LLM response is omitted from the terminal output.
First Bad Commit
No response