
Commit af3da2b

Add documentation
Introduced a new documentation structure under docs/source, covering configuration, installation (local, Docker, Helm), architecture, and common issues. Updated .env and README.md to clarify model discovery and LLM router configuration. This improves onboarding, setup, and operational guidance for developers and users.

File tree: 15 files changed (+678, −2 lines)

.env

Lines changed: 1 addition & 1 deletion

```diff
@@ -54,7 +54,7 @@ TASK_MODEL=
 LLM_ROUTER_ARCH_BASE_URL=
 
 ## LLM Router Configuration
-# Path to routes policy (JSON array). Defaults to llm-router/routes.chat.json
+# Path to routes policy (JSON array). Required when the router is enabled; must point to a valid JSON file.
 LLM_ROUTER_ROUTES_PATH=
 
 # Model used at the Arch router endpoint for selection
```

README.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -122,7 +122,7 @@ PUBLIC_APP_DATA_SHARING=
 
 ### Models
 
-This build does not use the `MODELS` env var or GGUF discovery. Configure models via `OPENAI_BASE_URL` only; Chat UI will fetch `${OPENAI_BASE_URL}/models` and populate the list automatically. Authorization uses `OPENAI_API_KEY` (preferred). `HF_TOKEN` remains a legacy alias.
+Models are discovered from `${OPENAI_BASE_URL}/models`, and you can optionally override their metadata via the `MODELS` env var (JSON5). Legacy provider-specific integrations and GGUF discovery are removed. Authorization uses `OPENAI_API_KEY` (preferred). `HF_TOKEN` remains a legacy alias.
 
 ### LLM Router (Optional)
```

docs/source/_toctree.yml

Lines changed: 30 additions & 0 deletions

```yaml
- local: index
  title: Chat UI
- title: Installation
  sections:
    - local: installation/local
      title: Local
    - local: installation/docker
      title: Docker
    - local: installation/helm
      title: Helm
- title: Configuration
  sections:
    - local: configuration/overview
      title: Overview
    - local: configuration/theming
      title: Theming
    - local: configuration/open-id
      title: OpenID
    - local: configuration/mcp-tools
      title: MCP Tools
    - local: configuration/llm-router
      title: LLM Router
    - local: configuration/metrics
      title: Metrics
    - local: configuration/common-issues
      title: Common Issues
- title: Developing
  sections:
    - local: developing/architecture
      title: Architecture
```

docs/source/configuration/common-issues.md

Lines changed: 37 additions & 0 deletions

# Common Issues

## 403: You don't have access to this conversation

This usually happens when running Chat UI over HTTP without proper cookie configuration.

**Recommended:** Set up a reverse proxy (NGINX, Caddy) to handle HTTPS.

**Alternative:** If you must run over HTTP, configure cookies:

```ini
COOKIE_SECURE=false
COOKIE_SAMESITE=lax
```

Also ensure `PUBLIC_ORIGIN` matches your actual URL:

```ini
PUBLIC_ORIGIN=http://localhost:5173
```

## Models not loading

If models aren't appearing in the UI:

1. Verify `OPENAI_BASE_URL` is correct and accessible
2. Check that `OPENAI_API_KEY` is valid
3. Ensure the endpoint returns models at `${OPENAI_BASE_URL}/models` (see the quick check below)
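
You can verify all three from outside Chat UI by calling the endpoint directly. A minimal sketch in TypeScript (run with a TS runner such as `tsx`), assuming an OpenAI-compatible response of the form `{ "data": [{ "id": ... }] }`:

```ts
// Hypothetical connectivity check; adjust base URL and key to your setup.
const base = process.env.OPENAI_BASE_URL ?? "http://localhost:8000/v1";
const key = process.env.OPENAI_API_KEY ?? "";

const res = await fetch(`${base}/models`, {
  headers: { Authorization: `Bearer ${key}` },
});
if (!res.ok) throw new Error(`GET ${base}/models failed with status ${res.status}`);

// OpenAI-compatible servers return { data: [{ id, ... }] }.
const { data } = (await res.json()) as { data: { id: string }[] };
console.log(data.map((m) => m.id));
```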

## Database connection errors

For development, you can skip MongoDB entirely; Chat UI will use an embedded database.

For production, verify:

- `MONGODB_URL` is a valid connection string
- Your IP is whitelisted (for MongoDB Atlas)
- The database user has read/write permissions
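
To isolate connection problems from Chat UI itself, a standalone ping with the Node.js MongoDB driver can help (a sketch, assuming the `mongodb` package is installed and `MONGODB_URL` is set):

```ts
import { MongoClient } from "mongodb";

// Quick check run outside Chat UI: connect and ping the default database.
const client = new MongoClient(process.env.MONGODB_URL ?? "");
try {
  await client.connect();
  await client.db().command({ ping: 1 });
  console.log("MongoDB is reachable");
} finally {
  await client.close();
}
```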

docs/source/configuration/llm-router.md

Lines changed: 105 additions & 0 deletions

# LLM Router

Chat UI includes an intelligent routing system that automatically selects the best model for each request. When enabled, users see a virtual "Omni" model that routes to specialized models based on the conversation context.

The router uses [katanemo/Arch-Router-1.5B](https://huggingface.co/katanemo/Arch-Router-1.5B) for route selection.

## Configuration

### Basic Setup

```ini
# Arch router endpoint (OpenAI-compatible)
LLM_ROUTER_ARCH_BASE_URL=https://router.huggingface.co/v1
LLM_ROUTER_ARCH_MODEL=katanemo/Arch-Router-1.5B

# Path to your routes policy JSON
LLM_ROUTER_ROUTES_PATH=./config/routes.json
```

### Routes Policy

Create a JSON file defining your routes. Each route specifies a name, a description the router matches against, a primary model, and optional fallback models:

```json
[
  {
    "name": "coding",
    "description": "Programming, debugging, code review",
    "primary_model": "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "fallback_models": ["meta-llama/Llama-3.3-70B-Instruct"]
  },
  {
    "name": "casual_conversation",
    "description": "General chat, questions, explanations",
    "primary_model": "meta-llama/Llama-3.3-70B-Instruct"
  }
]
```

### Fallback Behavior

```ini
# Route to use when Arch returns "other"
LLM_ROUTER_OTHER_ROUTE=casual_conversation

# Model to use if Arch selection fails entirely
LLM_ROUTER_FALLBACK_MODEL=meta-llama/Llama-3.3-70B-Instruct

# Selection timeout (milliseconds)
LLM_ROUTER_ARCH_TIMEOUT_MS=10000
```

## Multimodal Routing

When a user sends an image, the router can bypass Arch and route directly to a vision model:

```ini
LLM_ROUTER_ENABLE_MULTIMODAL=true
LLM_ROUTER_MULTIMODAL_MODEL=meta-llama/Llama-3.2-90B-Vision-Instruct
```

## Tools Routing

When a user has MCP servers enabled, the router can automatically select a tools-capable model:

```ini
LLM_ROUTER_ENABLE_TOOLS=true
LLM_ROUTER_TOOLS_MODEL=meta-llama/Llama-3.3-70B-Instruct
```
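
Conceptually, both the multimodal and tools shortcuts act as pre-checks before Arch is consulted. A minimal sketch of that precedence (not the actual Chat UI code; the image/tools detection helpers are assumed):

```ts
// Hypothetical helper: decide whether to bypass Arch entirely.
// hasImage / hasTools would come from inspecting the incoming request.
function pickDirectModel(hasImage: boolean, hasTools: boolean): string | undefined {
  if (hasImage && process.env.LLM_ROUTER_ENABLE_MULTIMODAL === "true") {
    return process.env.LLM_ROUTER_MULTIMODAL_MODEL; // vision model, no Arch call
  }
  if (hasTools && process.env.LLM_ROUTER_ENABLE_TOOLS === "true") {
    return process.env.LLM_ROUTER_TOOLS_MODEL; // tools-capable model, no Arch call
  }
  return undefined; // fall through to Arch route selection
}
```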

## UI Customization

Customize how the router appears in the model selector:

```ini
PUBLIC_LLM_ROUTER_ALIAS_ID=omni
PUBLIC_LLM_ROUTER_DISPLAY_NAME=Omni
PUBLIC_LLM_ROUTER_LOGO_URL=https://example.com/logo.png
```

## How It Works

When a user selects Omni:

1. Chat UI sends the conversation context to the Arch router
2. Arch analyzes the content and returns a route name
3. Chat UI maps the route to the corresponding model
4. The request streams from the selected model
5. On errors, fallback models are tried in order

The route selection is displayed in the UI so users can see which model was chosen.
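
As a rough illustration of steps 1-3 and 5, here is a sketch of the selection step (not the actual Chat UI implementation; it assumes the Arch endpoint is OpenAI-compatible and replies with a bare route name):

```ts
type Route = { name: string; primary_model: string; fallback_models?: string[] };

// Returns models to try, in order: primary, route fallbacks, global fallback.
async function selectModels(routes: Route[], context: string): Promise<string[]> {
  const res = await fetch(`${process.env.LLM_ROUTER_ARCH_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    // Give up after LLM_ROUTER_ARCH_TIMEOUT_MS and fall back.
    signal: AbortSignal.timeout(Number(process.env.LLM_ROUTER_ARCH_TIMEOUT_MS ?? 10_000)),
    body: JSON.stringify({
      model: process.env.LLM_ROUTER_ARCH_MODEL,
      messages: [{ role: "user", content: context }],
    }),
  });
  const routeName: string = (await res.json()).choices[0].message.content.trim();

  // Unknown routes and "other" map to LLM_ROUTER_OTHER_ROUTE.
  const route =
    routes.find((r) => r.name === routeName) ??
    routes.find((r) => r.name === process.env.LLM_ROUTER_OTHER_ROUTE);

  const candidates = route ? [route.primary_model, ...(route.fallback_models ?? [])] : [];
  candidates.push(process.env.LLM_ROUTER_FALLBACK_MODEL ?? "");
  return candidates.filter(Boolean);
}
```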

## Message Length Limits

To optimize router performance, message content is trimmed before sending to Arch:

```ini
# Max characters for assistant messages (default: 500)
LLM_ROUTER_MAX_ASSISTANT_LENGTH=500

# Max characters for previous user messages (default: 400)
LLM_ROUTER_MAX_PREV_USER_LENGTH=400
```

The latest user message is never trimmed.
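
A sketch of the trimming rule as described (the message shape is assumed):

```ts
type Msg = { role: "user" | "assistant"; content: string };

// Trim history before route selection; the latest user message stays intact.
function trimForRouter(messages: Msg[], maxAssistant = 500, maxPrevUser = 400): Msg[] {
  const lastUserIdx = messages.map((m) => m.role).lastIndexOf("user");
  return messages.map((m, i) => {
    if (i === lastUserIdx) return m; // never trimmed
    const limit = m.role === "assistant" ? maxAssistant : maxPrevUser;
    return { ...m, content: m.content.slice(0, limit) };
  });
}
```

Note that this trimming applies only to what is sent to Arch for route selection.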

docs/source/configuration/mcp-tools.md

Lines changed: 83 additions & 0 deletions

# MCP Tools

Chat UI supports tool calling via the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/). MCP servers expose tools that models can invoke during conversations.

## Server Types

Chat UI supports two types of MCP servers:

### Base Servers (Admin-configured)

Base servers are configured by the administrator via environment variables. They appear for all users and can be enabled or disabled per user, but not removed.

```ini
MCP_SERVERS=[
  {"name": "Web Search (Exa)", "url": "https://mcp.exa.ai/mcp"},
  {"name": "Hugging Face", "url": "https://hf.co/mcp"}
]
```

Each server entry supports the following fields:

- `name` - Display name shown in the UI
- `url` - MCP server endpoint URL
- `headers` (optional) - Custom headers for authentication

### User Servers (Added from UI)

Users can add their own MCP servers directly from the UI:

1. Open the chat input and click the **+** button (or go to Settings)
2. Select **MCP Servers**
3. Click **Add Server**
4. Enter the server name and URL
5. Run **Health Check** to verify connectivity

User-added servers are stored in the browser and can be removed at any time. They work alongside base servers.

## User Token Forwarding

When users are logged in via Hugging Face, you can forward their access token to MCP servers:

```ini
MCP_FORWARD_HF_USER_TOKEN=true
```

This allows MCP servers to access user-specific resources on their behalf.

## Using Tools

1. Enable the servers you want to use from the MCP Servers panel
2. Start chatting; models will automatically use tools when appropriate

### Model Requirements

Not all models support tool calling. To enable tools for a specific model, add it to your `MODELS` override:

```ini
MODELS=`[
  {
    "id": "meta-llama/Llama-3.3-70B-Instruct",
    "supportsTools": true
  }
]`
```

## Tool Execution Flow

When a model decides to use a tool:

1. The model generates a tool call with parameters
2. Chat UI executes the call against the MCP server
3. Results are displayed in the chat as a collapsible "tool" block
4. Results are fed back to the model for follow-up responses
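
For a feel of what steps 1-2 involve under the hood, here is a standalone sketch using the official TypeScript MCP SDK (`@modelcontextprotocol/sdk`); Chat UI's actual implementation may differ:

```ts
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Connect to one of the base servers from the example above.
const transport = new StreamableHTTPClientTransport(new URL("https://hf.co/mcp"));
const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// Discover the tools the server exposes, then execute a call.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

const result = await client.callTool({
  name: tools[0].name,
  arguments: {}, // in Chat UI these parameters come from the model's tool call
});
console.log(result.content); // shown in the UI, then fed back to the model
await client.close();
```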

## Integration with LLM Router

When using the [LLM Router](./llm-router), you can configure automatic routing to a tools-capable model:

```ini
LLM_ROUTER_ENABLE_TOOLS=true
LLM_ROUTER_TOOLS_MODEL=meta-llama/Llama-3.3-70B-Instruct
```

When a user has MCP servers enabled and selects the Omni model, the router will automatically use the specified tools model.

docs/source/configuration/metrics.md

Lines changed: 9 additions & 0 deletions

# Metrics

The server can expose Prometheus metrics on port `5565`, but they are disabled by default. You can enable the metrics server with `METRICS_ENABLED=true` and change the port with `METRICS_PORT=1234`.

<Tip>

In development with `npm run dev`, the metrics server does not shut down gracefully because SvelteKit provides no hooks for restarts. It's recommended to disable the metrics server in this case.

</Tip>
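
To verify the metrics server is up, you can fetch it directly (a sketch assuming the conventional Prometheus `/metrics` path and the default port):

```ts
// Hypothetical smoke test: print the first few exposed metric lines.
const res = await fetch("http://localhost:5565/metrics");
if (!res.ok) throw new Error(`metrics endpoint returned ${res.status}`);
console.log((await res.text()).split("\n").slice(0, 5).join("\n"));
```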

docs/source/configuration/open-id.md

Lines changed: 57 additions & 0 deletions

# OpenID

By default, users are assigned a unique ID based on their browser session. To authenticate users with OpenID Connect, configure the following:

```ini
OPENID_CLIENT_ID=your_client_id
OPENID_CLIENT_SECRET=your_client_secret
OPENID_SCOPES="openid profile"
```

Use the provider URL for standard OpenID Connect discovery:

```ini
OPENID_PROVIDER_URL=https://your-provider.com
```
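
Standard discovery means the provider must serve a well-known configuration document at a fixed path. You can sanity-check a provider URL like this (a sketch; replace the URL with your provider):

```ts
// Per the OIDC spec, the discovery document lives at this well-known path.
const provider = "https://your-provider.com";
const res = await fetch(`${provider}/.well-known/openid-configuration`);
const conf = await res.json();
console.log(conf.issuer, conf.authorization_endpoint, conf.token_endpoint);
```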

Advanced: you can also provide a client metadata document via `OPENID_CONFIG`. This value must be a JSON/JSON5 object (for example, a CIMD document) and is parsed server-side to populate OpenID settings.

**Redirect URI:** `https://your-domain.com/login/callback`

## Access Control

Restrict access to specific users:

```ini
# Allow only specific email addresses
ALLOWED_USER_EMAILS=["alice@example.com", "bob@example.com"]

# Allow all users from specific domains
ALLOWED_USER_DOMAINS=["example.com", "company.org"]
```
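
The intended semantics: a user passes if their email is listed or their email's domain is listed; with neither variable set, everyone is allowed. A sketch of that check (assumed behavior, not the actual Chat UI code):

```ts
const allowedEmails: string[] = JSON.parse(process.env.ALLOWED_USER_EMAILS ?? "[]");
const allowedDomains: string[] = JSON.parse(process.env.ALLOWED_USER_DOMAINS ?? "[]");

function isAllowed(email: string): boolean {
  // No restrictions configured: allow everyone.
  if (allowedEmails.length === 0 && allowedDomains.length === 0) return true;
  const domain = email.split("@").pop() ?? "";
  return allowedEmails.includes(email) || allowedDomains.includes(domain);
}
```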

## Hugging Face Login

For Hugging Face authentication, you can use automatic client registration:

```ini
OPENID_CLIENT_ID=__CIMD__
```

This creates an OAuth app automatically when deployed. See the [CIMD spec](https://datatracker.ietf.org/doc/draft-ietf-oauth-client-id-metadata-document/) for details.

## User Token Forwarding

When users log in via Hugging Face, you can forward their token for inference:

```ini
USE_USER_TOKEN=true
```

## Auto-Login

Force authentication on all routes:

```ini
AUTOMATIC_LOGIN=true
```
