This project provides a command-line interface (CLI) tool, `ai`, that interacts with a local Llamafile language model server. It allows users to send queries to the LLM and receive streaming markdown responses directly in their terminal.
- Direct LLM Interaction: Send queries to a Llamafile-hosted language model.
- Streaming Output: Responses are streamed back and rendered progressively as markdown (a minimal sketch of this flow follows this list).
- Rich Terminal Output: Utilizes the `rich` library for formatted and readable markdown in the terminal, including syntax-highlighted code blocks.
- Systemd Integration: Includes a `systemd` service file (`llamaserver.service`) to manage the Llamafile server process.
- Automatic Service Management: The `ai` script can check whether `llamaserver.service` is running and attempt to start it if it's not.
- Command-line and Interactive Modes:
  - Pass a query directly as a command-line argument for a quick answer.
  - Run without arguments to enter an interactive chat loop.
- Configurable Model: The Llama model (`MODEL_NAME`) and server URL (`LLAMA_SERVER_URL`) can be configured within the `ai` script.
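The query flow behind these features can be pictured with the short sketch below. It is a minimal illustration, not the actual `ai` implementation: it assumes the Llamafile server exposes the OpenAI-compatible `/v1/chat/completions` endpoint with streamed `data:` chunks, and the `stream_answer` helper name is made up for this example.

```python
# Minimal sketch of a streaming query; not the actual `ai` implementation.
# Assumes the llamafile server's OpenAI-compatible /v1/chat/completions endpoint.
import json

import requests
from rich.console import Console
from rich.live import Live
from rich.markdown import Markdown

LLAMA_SERVER_URL = "http://127.0.0.1:4141"  # default from this README
MODEL_NAME = "gemma3-1b-it-q6k"             # default from this README


def stream_answer(query: str) -> None:
    payload = {
        "model": MODEL_NAME,
        "messages": [{"role": "user", "content": query}],
        "stream": True,
    }
    answer = ""
    console = Console()
    # Re-render the accumulated markdown each time a new chunk arrives.
    with Live(console=console, refresh_per_second=8) as live:
        with requests.post(f"{LLAMA_SERVER_URL}/v1/chat/completions",
                           json=payload, stream=True, timeout=120) as resp:
            resp.raise_for_status()
            for line in resp.iter_lines():
                if not line or not line.startswith(b"data: "):
                    continue
                chunk = line[len(b"data: "):]
                if chunk == b"[DONE]":
                    break
                delta = json.loads(chunk)["choices"][0].get("delta", {})
                answer += delta.get("content", "") or ""
                live.update(Markdown(answer))


if __name__ == "__main__":
    stream_answer("What is the command to list files in Linux?")
```

With the server listening on the URL above, the response renders progressively in the terminal as markdown.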
- Python 3.x
- The `requests` and `rich` Python libraries.
- A Llamafile executable (e.g., `google_gemma-3-1b-it-Q6_K.llamafile`) and a compatible model.
- `systemd` (if you want to use the provided service file).
- `bash` (for the `ExecStart` command in the service file).
- Clone the repository (or download the files):
  `git clone <your-repository-url>`
  `cd <your-repository-directory>`
- Install Python dependencies:
  `pip install requests rich`
- Set up Llamafile:
  - Download your desired Llamafile (e.g., `google_gemma-3-1b-it-Q6_K.llamafile`) from the official Llamafile GitHub repository or other sources.
  - Place the Llamafile executable in a directory, for example, `~/llamafiles/`.
  - Make it executable: `chmod +x ~/llamafiles/google_gemma-3-1b-it-Q6_K.llamafile`.
- Configure the `ai` script:
  - Open the `ai` script and ensure `LLAMA_SERVICE_NAME`, `LLAMA_SERVER_URL`, and `MODEL_NAME` are set according to your setup. By default, it's configured for `gemma3-1b-it-q6k` and `http://127.0.0.1:4141` (a sketch of this configuration block appears after the installation steps).
- Configure and install the systemd service (optional but recommended):
  - Edit `llamaserver.service`:
    - Update `WorkingDirectory` to the directory where your Llamafile executable is located (e.g., `WorkingDirectory=%h/llamafiles/`).
    - Update `ExecStart` to point to your Llamafile executable and desired model, port, and host. The current example is `ExecStart=/bin/bash -c './google_gemma-3-1b-it-Q6_K.llamafile --port 4141 --host 127.0.0.1 --server'`. Ensure the path to the Llamafile is correct and that it is executable.
  - Copy the `llamaserver.service` file to your systemd user directory:
    `mkdir -p ~/.config/systemd/user/`
    `cp llamaserver.service ~/.config/systemd/user/`
  - Reload the systemd user daemon: `systemctl --user daemon-reload`
  - Enable the service to start on boot (optional): `systemctl --user enable llamaserver.service`
  - Start the service: `systemctl --user start llamaserver.service`
  - Check its status: `systemctl --user status llamaserver.service`
- Make sure the `ai` script is executable:
  `chmod +x ai`
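For reference, the configuration block near the top of the `ai` script is expected to look roughly like the following. The variable names come from this README; the values shown are the documented defaults, and the `LLAMA_SERVICE_NAME` value is an assumption for illustration.

```python
# Assumed configuration block near the top of the `ai` script.
# Variable names come from this README; the service name is illustrative.
DEBUG_MODE = False                          # set to True for verbose debugging output
LLAMA_SERVICE_NAME = "llamaserver.service"  # systemd --user unit managed by the script
LLAMA_SERVER_URL = "http://127.0.0.1:4141"  # where the llamafile server listens
MODEL_NAME = "gemma3-1b-it-q6k"             # model id sent with each request
```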
To send a single query and get a response:
`./ai "Your query here"`
For example:
`./ai "What is the command to list files in Linux?"`
To start an interactive chat session:
`./ai`
Then, type your queries at the `>` prompt. Type `exit` or `quit` (or press Ctrl+D) to end the session.
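Internally, the interactive mode amounts to a simple read/query loop. A rough sketch, reusing the hypothetical `stream_answer` helper from the earlier example (the real `ai` script may differ in detail):

```python
# Rough sketch of the interactive loop; not the actual `ai` implementation.
def interactive_loop() -> None:
    while True:
        try:
            query = input("> ").strip()
        except EOFError:          # Ctrl+D ends the session
            print()
            break
        if query.lower() in ("exit", "quit"):
            break
        if query:
            stream_answer(query)  # hypothetical helper from the sketch above
```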
To enable verbose debugging output from the `ai` script, set `DEBUG_MODE = True` at the top of the `ai` file.
If you installed the `llamaserver.service`:
- Start the service: `systemctl --user start llamaserver.service`
- Stop the service: `systemctl --user stop llamaserver.service`
- Check the status: `systemctl --user status llamaserver.service`
- View logs: `journalctl --user -u llamaserver.service` (add `-f` to follow logs in real time)
The `ai` script will attempt to start the service automatically if it's not detected, provided the service file is correctly installed and configured.
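That automatic check can be pictured roughly as follows, using `systemctl --user is-active`/`start` via `subprocess`. The helper name and exact behaviour are illustrative assumptions, not the script's actual code.

```python
# Illustrative sketch of the automatic service check; the real `ai`
# script may implement this differently.
import subprocess

LLAMA_SERVICE_NAME = "llamaserver.service"  # assumed default, see configuration above


def ensure_service_running() -> bool:
    """Return True if the llamafile service is already running or was just started."""
    check = subprocess.run(
        ["systemctl", "--user", "is-active", "--quiet", LLAMA_SERVICE_NAME]
    )
    if check.returncode == 0:
        return True  # already active
    start = subprocess.run(["systemctl", "--user", "start", LLAMA_SERVICE_NAME])
    return start.returncode == 0
```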
Contributions are welcome! Please feel free to submit pull requests or open issues for bugs, feature requests, or improvements.
- Fork the repository.
- Create a new branch (`git checkout -b feature/your-feature-name`).
- Make your changes.
- Commit your changes (`git commit -am 'Add some feature'`).
- Push to the branch (`git push origin feature/your-feature-name`).
- Create a new Pull Request.
This project is open source. Please specify a license if you have one (e.g., MIT, Apache 2.0); if none is specified, the code falls under default restrictive copyright. Consider adding a `LICENSE` file to your repository.