apps/jan-api-gateway/README.md (41 additions, 0 deletions)
@@ -11,6 +11,7 @@ A comprehensive API gateway for Jan Server that provides OpenAI-compatible endpo
 - **Authentication & Authorization**: JWT-based auth with Google OAuth2 integration and role-based access control
 - **API Key Management**: Secure API key generation and management at organization and project levels with multiple key types (admin, project, organization, service, ephemeral)
 - **Model Registry**: Dynamic model endpoint management with automatic health checking and service discovery
+- **Cache Service**: High-performance caching for inference models using Redis to reduce load times and improve response performance
 - **Streaming Support**: Real-time streaming responses with Server-Sent Events (SSE) and chunked transfer encoding
 - **MCP Integration**: Model Context Protocol support for external tools and resources with JSON-RPC 2.0
 - **Web Search**: Serper API integration for web search capabilities via MCP with webpage fetching
@@ -78,6 +79,7 @@ A comprehensive API gateway for Jan Server that provides OpenAI-compatible endpo
   - Automatic migrations with Atlas
   - Generated query interfaces with GORM Gen
 - **Authentication**: JWT v5.3.0 + Google OAuth2 v3.15.0
+- **Caching**: Redis v9.14.0 for high-performance model caching
 - **API Documentation**: Swagger/OpenAPI v1.16.6
 - **Streaming**: Server-Sent Events (SSE) with chunked transfer
 - **Dependency Injection**: Google Wire v0.6.0
@@ -228,6 +230,45 @@ A comprehensive API gateway for Jan Server that provides OpenAI-compatible endpo
 | `REDIS_PASSWORD` | Redis authentication password | `` (empty for dev) |
+| `REDIS_DB` | Redis database number | `0` |
+
+## 🚀 Redis Caching
+
+The Jan API Gateway uses Redis to cache inference model data. This improves performance by avoiding repeated model loading and by serving identical requests from the cache.
+
+### Redis Features
+- **Model List Caching**: Cache model discovery for 10 minutes
+- **Transparent Integration**: No code changes needed in existing handlers
+- **Centralized Constants**: Redis cache keys defined as constants
+
+### Quick Setup
+
+1. **Deploy Redis Infrastructure**:
+   ```bash
+   helm dependency update charts/umbrella-chart/
+   helm install jan-server charts/umbrella-chart/
+   ```
+
+2. **Environment Variables**:
+   ```bash
+   REDIS_URL=redis://jan-server-redis-master:6379
+   REDIS_PASSWORD=""  # Empty for dev
+   REDIS_DB=0
+   ```
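
As an illustration of how a service might consume these three variables, here is a minimal Go sketch. `RedisConfig` and `LoadRedisConfig` are hypothetical names, not the gateway's actual code, and the validation rules (a required `REDIS_URL`, a non-negative integer `REDIS_DB` that defaults to 0) are assumptions.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// RedisConfig mirrors the three environment variables from the setup step.
type RedisConfig struct {
	URL      string
	Password string
	DB       int
}

// LoadRedisConfig reads the settings through getenv so it can be tested
// without touching the real environment. An empty password is allowed (dev).
func LoadRedisConfig(getenv func(string) string) (RedisConfig, error) {
	cfg := RedisConfig{
		URL:      getenv("REDIS_URL"),
		Password: getenv("REDIS_PASSWORD"),
	}
	if cfg.URL == "" {
		return cfg, fmt.Errorf("REDIS_URL is not set")
	}
	// REDIS_DB is optional here; when unset, DB stays at 0.
	if raw := getenv("REDIS_DB"); raw != "" {
		db, err := strconv.Atoi(raw)
		if err != nil || db < 0 {
			return cfg, fmt.Errorf("REDIS_DB must be a non-negative integer, got %q", raw)
		}
		cfg.DB = db
	}
	return cfg, nil
}

func main() {
	cfg, err := LoadRedisConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("redis url=%s db=%d password set=%t\n", cfg.URL, cfg.DB, cfg.Password != "")
}
```

Failing fast on a missing or malformed value at startup tends to be easier to diagnose than a connection error later.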
+
+3. **Verify Setup**:
+   ```bash
+   # Check Redis connectivity in logs
+   kubectl logs deployment/jan-server-jan-api-gateway | grep "Successfully connected to Redis"
+   ```
+
+### Performance Benefits
+- **Reduced latency** for model discovery calls
+- **Reduced CPU usage** by avoiding repeated model loading
+- **Better scalability** with reduced backend load
+- **Improved user experience** with faster response times