I moved the wiki into the repository #1473
oobabooga announced in Announcements
Hey, I load text-generation-webui with the following parameters:

call python server.py --auto-devices --chat --character "Example" --wbits 4 --groupsize 128 --listen --no-stream --model ggml-vicuna-7b-1.1 --verbose

and it does not load the character. Here is an example of how it loads and the first question I ask (what is your name):

Starting the web UI...
Gradio HTTP request redirected to localhost :)
C:\repos\oobabooga-windows-CPU\installer_files\env\lib\site-packages\bitsandbytes\cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
Loading ggml-vicuna-7b-1.1...
llama.cpp weights detected: models\ggml-vicuna-7b-1.1\ggml-vicuna-7b-1.0-uncensored-q4_0.bin
llama.cpp: loading model from models\ggml-vicuna-7b-1.1\ggml-vicuna-7b-1.0-uncensored-q4_0.bin
llama_model_load_internal: format = ggjt v1 (latest)
llama_model_load_internal: n_vocab = 32001
llama_model_load_internal: n_ctx = 2048
llama_model_load_internal: n_embd = 4096
llama_model_load_internal: n_mult = 256
llama_model_load_internal: n_head = 32
llama_model_load_internal: n_layer = 32
llama_model_load_internal: n_rot = 128
llama_model_load_internal: ftype = 4 (mostly Q4_1, some F16)
llama_model_load_internal: n_ff = 11008
llama_model_load_internal: n_parts = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 59.11 KB
llama_model_load_internal: mem required = 5809.33 MB (+ 1026.00 MB per state)
llama_init_from_file: kv self size = 1024.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Loading the extension "gallery"... Ok.
Running on local URL: http://0.0.0.0:7860
To create a public link, set `share=True` in `launch()`.
A chat between a human and an assistant.
### Human: what is your name?
### Assistant:
--------------------
Output generated in 29.55 seconds (6.77 tokens/s, 200 tokens, context 27, seed 1437260233)

And when I try to load a character I get this: what am I doing wrong?
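For context: the prompt in the log above ("A chat between a human and an assistant. ### Human: ...") is the generic default, which suggests the character definition was never applied. In text-generation-webui, a chat character is a YAML file under characters/ with fields such as name, context, and greeting, and loading it means stitching those fields into the prompt sent to the model. Below is a minimal illustrative sketch of that idea — not the webui's actual loader, and the character data here is made up:

```python
# Illustrative sketch only: how a character definition is typically turned
# into a chat prompt. Field names (name, context, greeting) follow the
# example characters shipped with text-generation-webui; this is NOT the
# webui's own code, and the character below is hypothetical.

character = {
    "name": "Example",
    "context": "Example is a friendly assistant who always introduces itself by name.",
    "greeting": "Hi, I'm Example!",
}

def build_prompt(character, user_message, user_name="You"):
    """Prepend the character's context and greeting, then append the user turn."""
    lines = [character["context"].strip(), ""]
    if character.get("greeting"):
        lines.append(f"{character['name']}: {character['greeting']}")
    lines.append(f"{user_name}: {user_message}")
    lines.append(f"{character['name']}:")  # cue the model to answer in character
    return "\n".join(lines)

print(build_prompt(character, "What is your name?"))
```

If the character were loaded correctly, the console would show a prompt shaped like this sketch's output (context first, then the greeting and user turn) rather than the generic "A chat between a human and an assistant." line seen in the log.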
It's located here now and contributions are welcome: https://github.com/oobabooga/text-generation-webui/tree/main/docs