An LLM simulator that mimics OpenAI and Anthropic API formats. Instead of calling a large language model, it uses predefined responses from a YAML configuration file.
It is designed for situations where you want deterministic responses for testing, demos, or development.
- OpenAI and Anthropic compatible API endpoints
- Streaming support (character-by-character response streaming)
- Configurable responses via YAML file
- Hot-reloading of response configurations
- Mock token counting
Responses are configured in `responses.yml`. The file has three main sections:

- `responses`: Maps input prompts to predefined responses
- `defaults`: Contains default configurations, such as the unknown response message
- `settings`: Contains server behavior settings, such as network lag simulation
Example `responses.yml`:
responses:
  "write a python function to calculate factorial": "def factorial(n):\n    if n == 0:\n        return 1\n    return n * factorial(n - 1)"
  "what colour is the sky?": "The sky is purple except on Tuesday when it is hue green."
  "what is 2+2?": "2+2 equals 9."

defaults:
  unknown_response: "I don't know the answer to that. This is a mock response."

settings:
  lag_enabled: true
  lag_factor: 10  # Higher values = faster responses (10 = fast, 1 = slow)
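Conceptually, the server matches the incoming prompt against the `responses` map and falls back to `defaults.unknown_response` when nothing matches. A minimal sketch of that lookup (the exact matching rules in mockllm, such as case or whitespace normalization, may differ):

```python
import yaml

def lookup_response(prompt: str, config: dict) -> str:
    """Return the canned response for a prompt, or the configured fallback."""
    # Sketch: assumes an exact-match lookup after stripping and lowercasing;
    # mockllm's real normalization rules may differ.
    return config["responses"].get(
        prompt.strip().lower(),
        config["defaults"]["unknown_response"],
    )

with open("responses.yml") as f:
    config = yaml.safe_load(f)

print(lookup_response("what is 2+2?", config))  # -> "2+2 equals 9."
```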
The server can simulate network latency for more realistic testing scenarios. This is controlled by two settings:
- `lag_enabled`: When true, enables artificial network lag
- `lag_factor`: Controls the speed of responses
  - Higher values (e.g., 10) result in faster responses
  - Lower values (e.g., 1) result in slower responses
  - Affects both streaming and non-streaming responses
For streaming responses, the lag is applied per-character with slight random variations to simulate realistic network conditions.
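As a rough illustration of how such per-character lag could be produced (a sketch with assumed constants, not mockllm's actual implementation):

```python
import asyncio
import random

async def stream_with_lag(text: str, lag_factor: int = 10):
    """Yield one character at a time with simulated network lag.

    Sketch only: the 0.1s base delay and the jitter range are assumptions.
    A higher lag_factor divides the delay, so responses arrive faster.
    """
    base_delay = 0.1 / max(lag_factor, 1)
    for char in text:
        # Slight random variation per character to mimic network jitter.
        await asyncio.sleep(base_delay * random.uniform(0.5, 1.5))
        yield char
```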
The server automatically detects changes to `responses.yml` and reloads the configuration without restarting the server.
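One simple way to implement this kind of hot reload is to compare the file's modification time on each lookup; a sketch only (mockllm's actual mechanism may differ, e.g. a dedicated file watcher):

```python
import os
import yaml

class ResponseConfig:
    """Reload responses.yml whenever its mtime changes (illustrative sketch)."""

    def __init__(self, path: str = "responses.yml"):
        self.path = path
        self._mtime = 0.0
        self._data: dict = {}

    def get(self) -> dict:
        mtime = os.path.getmtime(self.path)
        if mtime != self._mtime:  # file changed since the last load
            with open(self.path) as f:
                self._data = yaml.safe_load(f)
            self._mtime = mtime
        return self._data
```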
Install from PyPI:

pip install mockllm

Or set up from source for development:
- Clone the repository:
git clone https://github.com/stacklok/mockllm.git
cd mockllm
- Install Poetry (if not already installed):
curl -sSL https://install.python-poetry.org | python3 -
- Install dependencies:
poetry install # Install with all dependencies
# or
poetry install --without dev # Install without development dependencies
- Set up the `responses.yml` file:
cp example.responses.yml responses.yml
- Start the server:
poetry run python -m mockllm
Or using uvicorn directly:
poetry run uvicorn mockllm.server:app --reload
The server will start on `http://localhost:8000`.
- Send requests to the API endpoints:
Regular request (OpenAI-compatible endpoint):
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "mock-llm",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
]
}'
Streaming request (OpenAI-compatible endpoint):
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "mock-llm",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
],
"stream": true
}'
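Because the endpoint mirrors the OpenAI chat completions format, you can also point the official `openai` Python client (v1+) at the mock server. A sketch; the `api_key` value is a placeholder, on the assumption that the mock server does not validate it:

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="mock-key")

# Regular request
resp = client.chat.completions.create(
    model="mock-llm",
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(resp.choices[0].message.content)

# Streaming request
for chunk in client.chat.completions.create(
    model="mock-llm",
    messages=[{"role": "user", "content": "what colour is the sky?"}],
    stream=True,
):
    print(chunk.choices[0].delta.content or "", end="")
```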
Regular request (Anthropic-compatible endpoint):
curl -X POST http://localhost:8000/v1/messages \
-H "Content-Type: application/json" \
-d '{
"model": "claude-3-sonnet-20240229",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
]
}'
Streaming request (Anthropic-compatible endpoint):
curl -X POST http://localhost:8000/v1/messages \
-H "Content-Type: application/json" \
-d '{
"model": "claude-3-sonnet-20240229",
"messages": [
{"role": "user", "content": "what colour is the sky?"}
],
"stream": true
}'
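The `anthropic` Python SDK can be pointed at the mock server in the same way (a sketch, assuming the mock implements enough of the Messages API for the SDK's response parsing; the `api_key` is again a placeholder):

```python
from anthropic import Anthropic

client = Anthropic(base_url="http://localhost:8000", api_key="mock-key")

resp = client.messages.create(
    model="claude-3-sonnet-20240229",
    max_tokens=100,
    messages=[{"role": "user", "content": "what colour is the sky?"}],
)
print(resp.content[0].text)
```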
To run the tests:
poetry run pytest
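If you want to test your own client code against a running instance, a minimal example using `httpx` (an assumption for illustration; mockllm's own test suite may be structured differently):

```python
import httpx

def test_known_prompt_returns_configured_response():
    # Assumes the server is running locally with the example responses.yml loaded.
    resp = httpx.post(
        "http://localhost:8000/v1/chat/completions",
        json={
            "model": "mock-llm",
            "messages": [{"role": "user", "content": "what is 2+2?"}],
        },
    )
    assert resp.status_code == 200
    assert "2+2 equals 9." in resp.json()["choices"][0]["message"]["content"]
```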
Contributions are welcome! Please open an issue or submit a PR.
Check out the CodeGate project when you're done here!
This project is licensed under the Apache 2.0 License.