Commit 55814f9

Merge pull request #178 from menloresearch/feat/170/add-redis-support-models-loading
Add caching with backward compatibility for UserService and Model Registry
2 parents: aa7d19e + 6051739

22 files changed

Lines changed: 760 additions & 178 deletions

apps/jan-api-gateway/LOCAL_DEV_SETUP.md

Lines changed: 30 additions & 14 deletions
@@ -18,7 +18,7 @@ jan-api-gateway/
 │ ├── launch.json # Debug and launch configurations
 │ └── tasks.json # Automated tasks (database management)
 ├── docker/ # Docker configuration
-│ ├── docker-compose.yml # PostgreSQL service configuration
+│ ├── docker-compose.yml # PostgreSQL and Valkey cache service configuration
 │ └── init.sql # Database initialization script
 ├── application/ # Go application code
 │ ├── cmd/server/ # Main server entry point
@@ -41,7 +41,8 @@ jan-api-gateway/
 1. **Press `F5`** or **Run → Start Debugging**
 2. **Select "Launch Jan API Gateway (Debug)"** from the dropdown
 3. **Wait for automatic setup:**
-- Database starts automatically
+- PostgreSQL database starts automatically
+- Valkey cache service starts automatically (Redis-compatible)
 - Environment variables are set
 - Application launches with debugger attached
 
@@ -52,7 +53,7 @@ That's it! Your development environment is ready. 🎉
 ### 1. **Launch Jan API Gateway (Debug)** *Recommended*
 - **Purpose**: Full development environment with debugging
 - **What it does**:
-- Automatically starts PostgreSQL database
+- Automatically starts PostgreSQL database and Valkey cache (Redis-compatible)
 - Sets all required environment variables
 - Launches the application with debugger attached
 - Opens integrated terminal for logs
@@ -68,7 +69,7 @@ That's it! Your development environment is ready. 🎉
 ### 3. **Launch Tests**
 - **Purpose**: Debug unit tests
 - **What it does**:
-- Starts database for testing
+- Starts database and Valkey cache for testing
 - Runs tests with debugging enabled
 - Allows setting breakpoints in test code
 - **When to use**: Debugging test failures or test logic
@@ -112,29 +113,34 @@ While the launch configurations handle the database automatically, you can also
 1. **Press `Ctrl+Shift+P` (Windows/Linux) or `Cmd+Shift+P` (macOS)**
 2. **Type "Tasks: Run Task"**
 3. **Select one of:**
-- **Start Database** - Start PostgreSQL
-- **Stop Database** - Stop PostgreSQL
+- **Start Database** - Start PostgreSQL and Valkey cache
+- **Stop Database** - Stop PostgreSQL and Valkey cache
 - **Wait for Database** - Check if database is ready
+- **Wait for Cache** - Check if Valkey cache is ready
 - **Build Application** - Build the Go application
 - **Run Tests** - Run all tests
 
 ### Using Terminal
 ```bash
-# Start database
-docker-compose -f docker/docker-compose.yml up -d postgres
+# Start database and Valkey cache (primary cache service)
+docker-compose -f docker/docker-compose.yml up -d postgres valkey
 
-# Stop database
+# Stop all services
 docker-compose -f docker/docker-compose.yml down
 
-# Reset database (removes all data)
+# Reset database and Valkey cache (removes all data)
 docker-compose -f docker/docker-compose.yml down -v
-docker-compose -f docker/docker-compose.yml up -d postgres
+docker-compose -f docker/docker-compose.yml up -d postgres valkey
 
 # View logs
 docker-compose -f docker/docker-compose.yml logs postgres
+docker-compose -f docker/docker-compose.yml logs valkey
 
 # Connect to database
 docker-compose -f docker/docker-compose.yml exec postgres psql -U jan_user -d jan_api_gateway
+
+# Connect to Valkey cache
+docker-compose -f docker/docker-compose.yml exec valkey valkey-cli
 ```
 
 ## ⚙️ Environment Variables
@@ -153,16 +159,26 @@ The following environment variables are **automatically configured** in the laun
 | `OAUTH2_GOOGLE_CLIENT_ID` | Google OAuth2 client ID | `your-google-client-id` |
 | `OAUTH2_GOOGLE_CLIENT_SECRET` | Google OAuth2 client secret | `your-google-client-secret` |
 | `OAUTH2_GOOGLE_REDIRECT_URL` | Google OAuth2 redirect URL | `http://localhost:8080/auth/google/callback` |
+| `REDIS_URL` | Redis connection URL | `redis://localhost:6379` |
+| `REDIS_PASSWORD` | Redis authentication password | `` (empty for dev) |
+| `REDIS_DB` | Redis database number | `0` |
+
+**📝 Redis Cache Notes:**
+- **Redis** is used for caching inference models and improving performance
+- Cache keys are automatically managed by the application
+- Redis connection is required for optimal performance
 
 **Note**: You can modify these values in `.vscode/launch.json` if needed for your environment.
 
 ## 🐛 Troubleshooting
 
-### Database Connection Issues
+### Database & Redis Connection Issues
 1. **Check Docker**: Ensure Docker Desktop is running
-2. **Check Port**: Make sure port 5432 is available
+2. **Check Ports**: Make sure ports 5432 (PostgreSQL) and 6379 (Redis) are available
 3. **View Database Status**: Use Command Palette → "Tasks: Run Task" → "Wait for Database"
-4. **View Logs**: Check the integrated terminal for database startup logs
+4. **View Redis Status**: Use Command Palette → "Tasks: Run Task" → "Wait for Redis"
+5. **View Logs**: Check the integrated terminal for database and Redis startup logs
+6. **Redis Connection**: Ensure Redis is running and accessible on the configured port
 
 ### Go Extension Issues
 1. **Install Go Extension**: VS Code/Cursor should prompt you automatically
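
Reviewer sketch (not part of the commit): the hunk above documents `REDIS_URL`, `REDIS_PASSWORD`, and `REDIS_DB`, and the troubleshooting steps ask the reader to check ports 5432 and 6379. The Go program below is one way those two ideas could fit together, using only the standard library. `RedisConfig`, `ParseRedisConfig`, and `PortReachable` are hypothetical names chosen for illustration. The gateway's real configuration loader is not shown in this diff and may behave differently.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"os"
	"strconv"
	"time"
)

// RedisConfig is a hypothetical holder for the three documented variables.
type RedisConfig struct {
	Addr     string // host:port derived from REDIS_URL
	Password string // REDIS_PASSWORD, empty in local development
	DB       int    // REDIS_DB, defaults to 0
}

// ParseRedisConfig reads the documented variables through getenv, so the
// lookup can be replaced in tests. The defaults mirror the documented examples.
func ParseRedisConfig(getenv func(string) string) (RedisConfig, error) {
	raw := getenv("REDIS_URL")
	if raw == "" {
		raw = "redis://localhost:6379"
	}
	u, err := url.Parse(raw)
	if err != nil {
		return RedisConfig{}, fmt.Errorf("parse REDIS_URL: %w", err)
	}
	if u.Scheme != "redis" && u.Scheme != "rediss" {
		return RedisConfig{}, fmt.Errorf("REDIS_URL: unsupported scheme %q", u.Scheme)
	}
	port := u.Port()
	if port == "" {
		port = "6379"
	}
	db := 0
	if s := getenv("REDIS_DB"); s != "" {
		if db, err = strconv.Atoi(s); err != nil {
			return RedisConfig{}, fmt.Errorf("parse REDIS_DB: %w", err)
		}
	}
	return RedisConfig{
		Addr:     net.JoinHostPort(u.Hostname(), port),
		Password: getenv("REDIS_PASSWORD"),
		DB:       db,
	}, nil
}

// PortReachable reports whether a TCP connection to addr succeeds within timeout.
func PortReachable(addr string, timeout time.Duration) bool {
	conn, err := net.DialTimeout("tcp", addr, timeout)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	cfg, err := ParseRedisConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		os.Exit(1)
	}
	// localhost:5432 is the PostgreSQL port named in the troubleshooting list.
	targets := map[string]string{"PostgreSQL": "localhost:5432", "Redis/Valkey": cfg.Addr}
	for name, addr := range targets {
		fmt.Printf("%s at %s reachable: %v\n", name, addr, PortReachable(addr, time.Second))
	}
}
```

A successful TCP dial only shows that something is listening on the port. It does not confirm that the listener is PostgreSQL or Valkey, so the "Wait for Database" and "Wait for Cache" tasks remain the more reliable check.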

apps/jan-api-gateway/README.md

Lines changed: 41 additions & 0 deletions
@@ -11,6 +11,7 @@ A comprehensive API gateway for Jan Server that provides OpenAI-compatible endpo
 - **Authentication & Authorization**: JWT-based auth with Google OAuth2 integration and role-based access control
 - **API Key Management**: Secure API key generation and management at organization and project levels with multiple key types (admin, project, organization, service, ephemeral)
 - **Model Registry**: Dynamic model endpoint management with automatic health checking and service discovery
+- **Cache Service**: High-performance caching for inference models using Redis to reduce load times and improve response performance
 - **Streaming Support**: Real-time streaming responses with Server-Sent Events (SSE) and chunked transfer encoding
 - **MCP Integration**: Model Context Protocol support for external tools and resources with JSON-RPC 2.0
 - **Web Search**: Serper API integration for web search capabilities via MCP with webpage fetching
@@ -78,6 +79,7 @@ A comprehensive API gateway for Jan Server that provides OpenAI-compatible endpo
 - Automatic migrations with Atlas
 - Generated query interfaces with GORM Gen
 - **Authentication**: JWT v5.3.0 + Google OAuth2 v3.15.0
+- **Caching**: Redis v9.14.0 for high-performance model caching
 - **API Documentation**: Swagger/OpenAPI v1.16.6
 - **Streaming**: Server-Sent Events (SSE) with chunked transfer
 - **Dependency Injection**: Google Wire v0.6.0
@@ -228,6 +230,45 @@ A comprehensive API gateway for Jan Server that provides OpenAI-compatible endpo
 | `SMTP_PASSWORD` | SMTP password | `your-smtp-password` |
 | `SMTP_SENDER_EMAIL` | Default sender email address | `noreply@yourdomain.com` |
 | `INVITE_REDIRECT_URL` | Redirect URL for invitation acceptance | `http://localhost:8080/invite/accept` |
+| `REDIS_URL` | Redis connection URL | `redis://localhost:6379` |
+| `REDIS_PASSWORD` | Redis authentication password | `` (empty for dev) |
+| `REDIS_DB` | Redis database number | `0` |
+
+## 🚀 Redis Caching
+
+The Jan API Gateway includes Redis caching for inference models to significantly improve performance by avoiding repeated model loading and caching identical requests.
+
+### Redis Features
+- **Model List Caching**: Cache model discovery for 10 minutes
+- **Transparent Integration**: No code changes needed in existing handlers
+- **Centralized Constants**: Redis cache keys defined as constants
+
+### Quick Setup
+
+1. **Deploy Redis Infrastructure**:
+```bash
+helm dependency update charts/umbrella-chart/
+helm install jan-server charts/umbrella-chart/
+```
+
+2. **Environment Variables**:
+```bash
+REDIS_URL=redis://jan-server-redis-master:6379
+REDIS_PASSWORD="" # Empty for dev
+REDIS_DB=0
+```
+
+3. **Verify Setup**:
+```bash
+# Check Redis connectivity in logs
+kubectl logs deployment/jan-server-jan-api-gateway | grep "Successfully connected to Redis"
+```
+
+### Performance Benefits
+- **Reduced latency** for model discovery calls
+- **Reduced CPU usage** by avoiding repeated model loading
+- **Better scalability** with reduced backend load
+- **Improved user experience** with faster response times
 
 ## 📚 API Usage Examples
 

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+package cron
+
+import (
+	"context"
+
+	"github.com/mileusna/crontab"
+	inference_model_registry "menlo.ai/jan-api-gateway/app/domain/inference_model_registry"
+	janinference "menlo.ai/jan-api-gateway/app/utils/httpclients/jan_inference"
+	"menlo.ai/jan-api-gateway/config/environment_variables"
+)
+
+type CronService struct {
+	JanInferenceClient     *janinference.JanInferenceClient
+	InferenceModelRegistry *inference_model_registry.InferenceModelRegistry
+}
+
+func NewService(janInferenceClient *janinference.JanInferenceClient, registry *inference_model_registry.InferenceModelRegistry) *CronService {
+	return &CronService{
+		JanInferenceClient:     janInferenceClient,
+		InferenceModelRegistry: registry,
+	}
+}
+
+func (cs *CronService) Start(ctx context.Context, ctab *crontab.Crontab) {
+	// Run initial check
+	cs.InferenceModelRegistry.CheckInferenceModels(ctx)
+
+	ctab.AddJob("* * * * *", func() {
+		cs.InferenceModelRegistry.CheckInferenceModels(ctx)
+
+		// Reload environment variables
+		environment_variables.EnvironmentVariables.LoadFromEnv()
+	})
+}

apps/jan-api-gateway/application/app/domain/healthcheck/healthcheck_service.go

Lines changed: 0 additions & 49 deletions
This file was deleted.
