A simple chat client in a single Python script
- Clone this repo
- `pip3 install -U -r requirements.txt`
- Copy `demo_config.json` to `conf/config.json`
- Get your `OPENAI_API_KEY` and put it in `conf/config.json`

```
$ ./gptcli.py -h
usage: gptcli.py [-h] [-c CONFIG]

options:
  -h, --help  show this help message and exit
  -c CONFIG   path to your config.json (default: config.json)
```
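
The `-c` flag above is plain argparse-style option handling. Here is a minimal sketch of equivalent parsing, assuming argparse; it is illustrative, not necessarily gptcli's actual code:

```python
# Minimal argparse sketch matching the usage text above.
# Illustrative only; gptcli's real parser may differ.
import argparse

parser = argparse.ArgumentParser(prog="gptcli.py")
parser.add_argument("-c", dest="config", default="config.json",
                    help="path to your config.json (default: config.json)")
args = parser.parse_args()
print(args.config)
```

For example, `./gptcli.py -c conf/config.json` points the client at the config file created during setup.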

Sample config.json:

```json
{
    "api_key": "sk-xxx",
    "base_url": "https://chat.pppan.net/v1",
    "model": "gemini-2.5-pro-exp-03-25",
    "context": 2,
    "stream": true,
    "stream_render": true,
    "showtokens": false,
    "proxy": "socks5://localhost:1080",
    "prompt": [
        { "role": "system", "content": "You are a helpful assistant" }
    ],
    "model_choices": [
      "gemini-2.5-pro-exp-03-25",
      "gemini-2.0-flash",
      "gemini-2.0-flash-lite",
      "gemini-1.5-flash",
      "gemini-1.5-flash-8b",
      "gemini-1.5-pro"
    ]
}
```

- (required) `api_key`: Your OpenAI API key. Read from the `OPENAI_API_KEY` environment variable if not set;
- (optional) `base_url`: The OpenAI API base URL. Can be set to a reverse proxy server, for example Azure OpenAI Service or chatgptProxyAPI. Defaults to the `OPENAI_API_BASE` environment variable or https://api.openai.com/v1;
- (optional) `model`: LLM chat model, `gpt-3.5-turbo` by default;
- (optional) `context`: Chat session context (see the sketch after this list); choices are:
  - 0: no context is sent with any chat request; costs the fewest tokens, but the AI doesn't know what you said before;
  - 1: only previous user questions are used as context;
  - 2: both previous questions and answers are used as context, which costs more tokens;
- (optional) `stream`: Output in stream mode;
- (optional) `stream_render`: Render Markdown in stream mode; you can disable it to avoid some UI bugs;
- (optional) `showtokens`: Print the tokens used after every chat;
- (optional) `proxy`: Use an http/https/socks4a/socks5 proxy for requests to `base_url`;
- (optional) `prompt`: Customize your prompt. This will be included in every chat request;
- (optional) `model_choices`: List of available models;
 
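As a rough illustration of how these options fit together, here is a minimal sketch of config loading with the documented environment-variable fallbacks and the three `context` levels. Function and variable names are illustrative assumptions, not gptcli's actual code:

```python
# Sketch of config loading and context handling per the options above.
# Illustrative only; names and structure are assumptions.
import json
import os

def load_config(path="config.json"):
    with open(path) as f:
        conf = json.load(f)
    # documented fallbacks: api_key <- OPENAI_API_KEY, base_url <- OPENAI_API_BASE
    conf.setdefault("api_key", os.environ.get("OPENAI_API_KEY"))
    conf.setdefault("base_url",
                    os.environ.get("OPENAI_API_BASE", "https://api.openai.com/v1"))
    conf.setdefault("model", "gpt-3.5-turbo")
    if not conf["api_key"]:
        raise SystemExit("api_key is required (config.json or OPENAI_API_KEY)")
    return conf

def build_messages(conf, history, question):
    # context=0: no history; 1: prior user questions only; 2: full history
    messages = list(conf.get("prompt", []))
    level = conf.get("context", 2)
    if level == 1:
        messages += [m for m in history if m["role"] == "user"]
    elif level == 2:
        messages += history
    messages.append({"role": "user", "content": question})
    return messages
```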
Console help (with tab-complete):

```
gptcli> .help -v
gptcli commands (use '.help -v' for verbose/'.help <topic>' for details):
======================================================================================================
.edit                 Run a text editor and optionally open a file with it
.help                 List available commands or provide detailed help for a specific command
.load                 Load conversation from Markdown/JSON file
.multiline            input multiple lines, end with ctrl-d(Linux/macOS) or ctrl-z(Windows). Cancel
                      with ctrl-c
.prompt               Load different prompts
.quit                 Exit this application
.reset                Reset session, i.e. clear chat history
.save                 Save current conversation to Markdown/JSON file
.set                  Set a settable parameter or show current settings of parameters
.usage                Tokens usage of current session / last N days, or print detail billing info
```
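
The dot-commands above come from an interactive command loop. Below is a minimal sketch of how such a console could be built, using Python's stdlib cmd module for brevity; this is an illustrative assumption, not gptcli's actual implementation (which, judging by `.set` and `.help -v`, may use a richer library such as cmd2):

```python
# Sketch of a dot-command console like the one above, using the stdlib
# cmd module. Illustrative only; command set and behavior are assumptions.
import cmd

class GptCli(cmd.Cmd):
    prompt = "gptcli> "
    intro = "type .help for commands"

    def default(self, line):
        # lines starting with "." are commands; anything else is a chat message
        if line.startswith("."):
            name, _, arg = line[1:].partition(" ")
            handler = getattr(self, "cmd_" + name, None)
            if handler is not None:
                return handler(arg)
            print(f"unknown command: {line}")
        else:
            print(f"(would send to the model): {line}")

    def cmd_reset(self, arg):
        """Reset session, i.e. clear chat history."""
        self.messages = []
        print("chat history cleared")

    def cmd_quit(self, arg):
        """Exit this application."""
        return True  # returning True stops cmdloop()

if __name__ == "__main__":
    GptCli().cmdloop()
```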

Run in Docker:

```
# build
$ docker build -t gptcli:latest .

# run
$ docker run -it --rm -v $PWD/.key:/gptcli/.key gptcli:latest -h

# for host proxy access:
$ docker run --rm -it -v $PWD/config.json:/gptcli/config.json --network host gptcli:latest -c /gptcli/config.json
```

Features:

- Single Python script
- Supports any OpenAI-compatible API and models
- Session based
- Markdown support with code syntax highlighting
- Stream output support
- Proxy support (HTTP/HTTPS/SOCKS4A/SOCKS5; see the sketch after this list)
- Multiline input support (via the `.multiline` command)
- Save and load sessions from file (Markdown/JSON) (via the `.save` and `.load` commands)
- Print token usage in real time, plus token usage for the last N days and billing details (only works for OpenAI)
 
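For illustration, here is a minimal sketch of how `base_url`, a SOCKS proxy, and stream output could be wired together with the openai Python package and httpx (SOCKS support requires `httpx[socks]`). The values and wiring are assumptions, not gptcli's actual code:

```python
# Sketch of base_url + SOCKS proxy + streaming with the openai package.
# Illustrative assumptions; install httpx[socks] for SOCKS support.
import httpx
from openai import OpenAI

client = OpenAI(
    api_key="sk-xxx",
    base_url="https://api.openai.com/v1",
    http_client=httpx.Client(proxy="socks5://localhost:1080"),
)

stream = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "hello"}],
    stream=True,  # the "stream" config option
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()
```
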
This script only supports text models. If you want a more feature-rich client, with features like RAG, image generation, Function Calling, etc., consider other projects, for instance aichat.