This is a simple WebUI for generating text using Gradio.
It was influenced by Zuntan03's EasyNovelAssistant.
https://github.com/Zuntan03/EasyNovelAssistant
Installation:
I recommend setting up a virtual environment.
Install gradio and llama-cpp-python following their respective official guides.
https://github.com/gradio-app/gradio
https://github.com/abetlen/llama-cpp-python
Usage:
Run the `text_generate_forever.py` script in the console:
`python text_generate_forever.py`
Open the URL displayed:
`Running on local URL: http://127.0.0.1:7860`
- Go to the `Load Model / Settings` tab and specify the path of the gguf to load in `Model Path`.
- You can also set options like `n_gpu_layers`, `n_ctx`, and `n_batch` here; they are the same as those used in llama-cpp-python. `n_gpu_layers` is the number of layers placed in GPU memory; set it to -1 to place all layers on the GPU. `n_ctx` is the context length; longer contexts make replies smarter but consume more memory, and setting it to 0 uses the gguf's default context length. `n_batch` is the batch size; a lower value consumes less memory but takes more time to prepare for inference.
- Write a prompt in the `Generate` tab and press the `Generate Forever` button.
- Text accumulates in `Buffer`. Once `max_tokens` is reached, the content is copied into `Result` and `Buffer` is cleared; this repeats until you press the `Stop` button.
- You can also change settings like `max_tokens` and `temperature` while generating text (`temperature` affects the randomness of generated sentences).
- The generated buffer is also appended to a `Log File`. If you don't want logging, leave it empty.
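As a rough mental model, the `Buffer` / `Result` flow described above can be sketched in plain Python. This is only an illustration, not the script's actual code: `run_buffered` and the stand-in token stream are hypothetical names, and the real app would stream tokens from llama-cpp-python instead.

```python
from typing import Iterable, List

def run_buffered(tokens: Iterable[str], max_tokens: int) -> List[str]:
    """Simulate the Buffer -> Result flow: collect tokens in a buffer and,
    each time max_tokens is reached, flush the buffer into the results."""
    results: List[str] = []
    buffer: List[str] = []
    for tok in tokens:
        buffer.append(tok)
        if len(buffer) >= max_tokens:
            results.append("".join(buffer))  # content is copied into Result
            buffer.clear()                   # and Buffer is cleared
    if buffer:
        # Flush whatever remains, e.g. when Stop is pressed mid-buffer.
        results.append("".join(buffer))
    return results

# Stand-in token stream; a real run would yield model tokens one by one.
chunks = run_buffered(iter("abcdefg"), max_tokens=3)
print(chunks)  # ['abc', 'def', 'g']
```

Each flushed chunk corresponds to one block appended to `Result` (and to the `Log File`, if one is set).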