Commit d903ed7

[RHIDP-13061] Add Rate Limiting for Lightspeed Plugin (#3531)
* feat: add rate limiting for lightspeed and notebooks
* update lightspeed-backend readme with rate limiting info
* update changeset to only backend
* extend express request type with rate limit
* handle negative and float values for rate limit
* refactor(notebooks): move permission check to route level to preserve rate limit -> permission check ordering to reduce redundant permission checks
* migrate from lightspeed.* to intelligent-assistant-*
* add example block to lightspeed-backend app-config for rate limiting

Signed-off-by: Jordan Dubrick <jdubrick@redhat.com>
1 parent 33d8bc7 commit d903ed7

14 files changed

Lines changed: 657 additions & 30 deletions

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+'@red-hat-developer-hub/backstage-plugin-lightspeed-backend': minor
+---
+
+add rate limiting to lightspeed and notebooks

workspaces/lightspeed/plugins/lightspeed-backend/README.md

Lines changed: 37 additions & 0 deletions
````diff
@@ -65,6 +65,43 @@ intelligent-assistant:
   mcpServers: # Optional - one or more MCP servers
     - name: <mcp server name> # must match the name configured in LCS
       token: ${MCP_TOKEN}
+  rateLimit: # Optional - per-user request rate limits (defaults apply if omitted)
+    expensive:
+      max: 25 # Max requests per minute per user for expensive endpoints (default: 25). Set to 0 to disable.
+    general:
+      max: 200 # Max requests per minute per user for other authenticated endpoints (default: 200). Set to 0 to disable.
+```
+
+#### Rate limiting
+
+The backend applies per-user rate limits to authenticated endpoints as an abuse
+prevention measure. Limits are keyed by the authenticated user's entity ref and
+use a fixed 1-minute window.
+
+**Tiers**:
+
+- **Expensive** (default: 25 requests/minute per user): `POST /v1/query`, and
+  (when Notebooks is enabled) notebook document uploads and RAG queries.
+- **General** (default: 200 requests/minute per user): all other authenticated
+  endpoints, including conversation listing, MCP server management, feedback,
+  and notebook session CRUD.
+- **Excluded**: `/health` and `/notebooks/health` are not rate limited.
+
+When a limit is exceeded, the API returns `429 Too Many Requests` with a
+`Retry-After` header and a JSON error body (`RateLimitExceeded`).
+
+Set `max: 0` on a tier to disable rate limiting for that tier. If the entire
+`rateLimit` block is omitted, the defaults above apply.
+
+**Example** — tighter limits for a small deployment:
+
+```yaml
+intelligent-assistant:
+  rateLimit:
+    expensive:
+      max: 10
+    general:
+      max: 100
 ```
 
 #### MCP servers settings endpoints
````
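The tiering described in the README change above can be pictured as a small lookup from request to tier. The following is a minimal sketch, not the plugin's routing code: `classifyRoute` and `RouteTier` are hypothetical names, and only the paths that the README spells out are included. Notebook upload and RAG query routes also belong to the expensive tier, but their paths are not listed in the README text, so the sketch leaves them out.

```typescript
type RouteTier = 'expensive' | 'general' | 'excluded';

// Health endpoints named in the README as not rate limited.
const EXCLUDED_PATHS = new Set<string>(['/health', '/notebooks/health']);

// Hypothetical helper: maps a request to the tier the README describes.
function classifyRoute(method: string, path: string): RouteTier {
  if (EXCLUDED_PATHS.has(path)) {
    return 'excluded';
  }
  if (method.toUpperCase() === 'POST' && path === '/v1/query') {
    return 'expensive';
  }
  // Every other authenticated endpoint falls into the general tier.
  return 'general';
}

const queryTier = classifyRoute('POST', '/v1/query');
const healthTier = classifyRoute('GET', '/health');
```

In the commit itself the tier appears to be chosen where the middleware is attached to a route (the tests call `createRateLimitMiddleware(config, tier)` per route), so a lookup function like this is only a way to read the table, not a description of the implementation.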

workspaces/lightspeed/plugins/lightspeed-backend/app-config.yaml

Lines changed: 9 additions & 4 deletions
@@ -1,18 +1,23 @@
 # OPTIONAL: Backend-only configurations
-#intelligent-assistant:
+# intelligent-assistant:
 #   servicePort: 8080 # OPTIONAL: Port for lightspeed-core service (default: 8080)
 #   systemPrompt: <custom_system_prompt> # OPTIONAL: Override default RHDH system prompt
-#
+#   # Optional: Per-user request rate limits (defaults apply if omitted)
+#   rateLimit:
+#     expensive:
+#       max: 10
+#     general:
+#       max: 100
 #   # AI Notebooks (Developer Preview) - Disabled by default
 #   notebooks:
 #     enabled: false # Set to true to enable AI Notebooks feature
-#
+
 #     # REQUIRED when enabled: Query defaults for RAG queries
 #     # Both model and provider_id must be configured together
 #     queryDefaults:
 #       provider_id: ollama # AI provider for query model (e.g., ollama, vllm)
 #       model: llama3.1-8b-instruct # Model to use for answering queries
-#
+
 #     # OPTIONAL: Chunking strategy for document processing
 #     chunkingStrategy:
 #       type: auto # 'auto' (default) or 'static'

workspaces/lightspeed/plugins/lightspeed-backend/config.d.ts

Lines changed: 32 additions & 0 deletions
@@ -49,6 +49,38 @@ export interface Config {
        */
       token?: string;
     }>;
+    /**
+     * Per-user rate limiting for Lightspeed API endpoints.
+     * @visibility backend
+     */
+    rateLimit?: {
+      /**
+       * Limits for expensive endpoints (LLM queries, document uploads).
+       * @visibility backend
+       */
+      expensive?: {
+        /**
+         * Maximum requests per minute per user.
+         * Set to 0 to disable rate limiting for this tier.
+         * @default 25
+         * @visibility backend
+         */
+        max?: number;
+      };
+      /**
+       * Limits for all other authenticated endpoints.
+       * @visibility backend
+       */
+      general?: {
+        /**
+         * Maximum requests per minute per user.
+         * Set to 0 to disable rate limiting for this tier.
+         * @default 200
+         * @visibility backend
+         */
+        max?: number;
+      };
+    };
     /**
      * Configuration for AI Notebooks (Developer Preview)
      */
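The `max` semantics in this schema, together with the behavior exercised by the `getRateLimitMax` tests later in this commit (defaults when the block is omitted, negative values treated as 0, decimals floored), could be sketched roughly as below. `resolveRateLimitMax`, `LimitTier`, and the plain-number input are hypothetical stand-ins: the real `getRateLimitMax` reads from the Backstage config object, and its implementation is not part of this excerpt.

```typescript
type LimitTier = 'expensive' | 'general';

// Defaults mirror the values documented in the schema (@default 25 / 200).
const DEFAULT_MAX: Record<LimitTier, number> = { expensive: 25, general: 200 };

// Hypothetical helper: turns a raw configured value into a usable limit.
function resolveRateLimitMax(
  configured: number | undefined,
  tier: LimitTier,
): number {
  if (configured === undefined) {
    return DEFAULT_MAX[tier]; // value omitted: fall back to the tier default
  }
  if (configured < 0) {
    return 0; // negative values are treated as "disabled"
  }
  return Math.floor(configured); // fractional values are floored
}

const omitted = resolveRateLimitMax(undefined, 'expensive');
const floored = resolveRateLimitMax(10.7, 'expensive');
const negative = resolveRateLimitMax(-1, 'general');
```

Treating a negative value as 0 rather than rejecting it keeps a misconfigured deployment running, at the cost of silently disabling the limit for that tier.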

workspaces/lightspeed/plugins/lightspeed-backend/package.json

Lines changed: 1 addition & 0 deletions
@@ -52,6 +52,7 @@
     "@langchain/openai": "^0.6.0",
     "@red-hat-developer-hub/backstage-plugin-lightspeed-common": "workspace:^",
     "express": "^4.21.1",
+    "express-rate-limit": "^8.2.2",
     "form-data": "^4.0.5",
     "htmlparser2": "^9.1.0",
     "http-proxy-middleware": "^3.0.2",

workspaces/lightspeed/plugins/lightspeed-backend/src/service/constant.ts

Lines changed: 7 additions & 0 deletions
@@ -25,6 +25,13 @@ export const DEFAULT_LIGHTSPEED_SERVICE_HOST = '127.0.0.1'; // Lightspeed core s
 export const DEFAULT_LIGHTSPEED_SERVICE_PORT = 8080; // Lightspeed service port
 export const DEFAULT_MAX_FILE_SIZE_MB = 20 * 1024 * 1024; // 20MB
 
+/**
+ * Rate limiting defaults (window is fixed at 1 minute)
+ */
+export const RATE_LIMIT_WINDOW_MS = 60000;
+export const DEFAULT_EXPENSIVE_RATE_LIMIT_MAX = 25;
+export const DEFAULT_GENERAL_RATE_LIMIT_MAX = 200;
+
 /**
  * Input validation limits for query endpoints
  */
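As a rough illustration of how a fixed 1-minute window keyed by user could behave with the constants above, here is a minimal in-memory sketch. `FixedWindowLimiter`, its `check` method, the `Decision` shape, and the injectable clock are hypothetical names chosen for illustration. The commit itself adds the `express-rate-limit` dependency, and this sketch does not reproduce that library's internals.

```typescript
const WINDOW_MS = 60000; // same value as RATE_LIMIT_WINDOW_MS above

interface WindowState {
  count: number;
  resetAt: number; // epoch milliseconds at which the current window ends
}

interface Decision {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when the request is allowed
}

// Hypothetical in-memory limiter keyed by user entity ref.
class FixedWindowLimiter {
  private readonly windows = new Map<string, WindowState>();
  private readonly max: number;
  private readonly now: () => number;

  constructor(max: number, now: () => number = () => Date.now()) {
    this.max = max;
    this.now = now;
  }

  check(userEntityRef: string): Decision {
    if (this.max === 0) {
      return { allowed: true, retryAfterSeconds: 0 }; // max 0 disables the tier
    }
    const t = this.now();
    let state = this.windows.get(userEntityRef);
    if (state === undefined || t >= state.resetAt) {
      // First request from this user, or the previous window has expired.
      state = { count: 0, resetAt: t + WINDOW_MS };
      this.windows.set(userEntityRef, state);
    }
    state.count += 1;
    if (state.count > this.max) {
      return {
        allowed: false,
        retryAfterSeconds: Math.ceil((state.resetAt - t) / 1000),
      };
    }
    return { allowed: true, retryAfterSeconds: 0 };
  }
}
```

A caller would translate a denied `Decision` into a `429 Too Many Requests` response with a `Retry-After` header. Passing the clock as a parameter is a design choice for this sketch so the window expiry can be tested without waiting a minute.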

workspaces/lightspeed/plugins/lightspeed-backend/src/service/express.d.ts

Lines changed: 3 additions & 0 deletions
@@ -16,10 +16,13 @@
 
 import type { BackstageCredentials } from '@backstage/backend-plugin-api';
 
+import type { RateLimitInfo } from 'express-rate-limit';
+
 // Populated by the identity middleware for use in route handlers.
 declare module 'express-serve-static-core' {
   interface Request {
     credentials?: BackstageCredentials;
     userEntityRef?: string;
+    rateLimit?: RateLimitInfo;
   }
 }
Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
+/*
+ * Copyright Red Hat, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import { mockServices } from '@backstage/backend-test-utils';
+
+import express from 'express';
+import request from 'supertest';
+
+import {
+  DEFAULT_EXPENSIVE_RATE_LIMIT_MAX,
+  DEFAULT_GENERAL_RATE_LIMIT_MAX,
+} from '../constant';
+import {
+  createRateLimitMiddleware,
+  getRateLimitMax,
+} from './createRateLimitMiddleware';
+
+describe('getRateLimitMax', () => {
+  it('returns default expensive max when config is omitted', () => {
+    const config = mockServices.rootConfig({ data: {} });
+    expect(getRateLimitMax(config, 'expensive')).toBe(
+      DEFAULT_EXPENSIVE_RATE_LIMIT_MAX,
+    );
+  });
+
+  it('returns default general max when config is omitted', () => {
+    const config = mockServices.rootConfig({ data: {} });
+    expect(getRateLimitMax(config, 'general')).toBe(
+      DEFAULT_GENERAL_RATE_LIMIT_MAX,
+    );
+  });
+
+  it('returns configured max values when provided', () => {
+    const config = mockServices.rootConfig({
+      data: {
+        'intelligent-assistant': {
+          rateLimit: {
+            expensive: { max: 10 },
+            general: { max: 50 },
+          },
+        },
+      },
+    });
+
+    expect(getRateLimitMax(config, 'expensive')).toBe(10);
+    expect(getRateLimitMax(config, 'general')).toBe(50);
+  });
+
+  it('treats negative values as disabled (0)', () => {
+    const config = mockServices.rootConfig({
+      data: {
+        'intelligent-assistant': {
+          rateLimit: {
+            expensive: { max: -1 },
+            general: { max: -5 },
+          },
+        },
+      },
+    });
+
+    expect(getRateLimitMax(config, 'expensive')).toBe(0);
+    expect(getRateLimitMax(config, 'general')).toBe(0);
+  });
+
+  it('floors decimal values to integers', () => {
+    const config = mockServices.rootConfig({
+      data: {
+        'intelligent-assistant': {
+          rateLimit: {
+            expensive: { max: 10.7 },
+            general: { max: 50.3 },
+          },
+        },
+      },
+    });
+
+    expect(getRateLimitMax(config, 'expensive')).toBe(10);
+    expect(getRateLimitMax(config, 'general')).toBe(50);
+  });
+});
+
+describe('createRateLimitMiddleware', () => {
+  function createTestApp(
+    max: number,
+    tier: 'expensive' | 'general' = 'general',
+  ) {
+    const app = express();
+    const config = mockServices.rootConfig({
+      data: {
+        'intelligent-assistant': {
+          rateLimit: {
+            [tier]: { max },
+          },
+        },
+      },
+    });
+
+    app.use((req, _res, next) => {
+      req.credentials = { $$type: '@backstage/BackstageCredentials' } as any;
+      req.userEntityRef = 'user:default/test-user';
+      next();
+    });
+    app.get('/test', createRateLimitMiddleware(config, tier), (_req, res) => {
+      res.json({ ok: true });
+    });
+
+    return app;
+  }
+
+  it('allows requests up to the configured max', async () => {
+    const app = createTestApp(1);
+
+    const first = await request(app).get('/test');
+    expect(first.status).toBe(200);
+    expect(first.body).toEqual({ ok: true });
+  });
+
+  it('returns 429 with Retry-After when limit is exceeded', async () => {
+    const app = createTestApp(1);
+
+    await request(app).get('/test');
+    const second = await request(app).get('/test');
+
+    expect(second.status).toBe(429);
+    expect(second.headers['retry-after']).toBeDefined();
+    expect(second.body).toEqual({
+      error: {
+        name: 'RateLimitExceeded',
+        message: 'Too many requests. Please try again later.',
+        retryAfter: expect.any(Number),
+      },
+    });
+  });
+
+  it('does not rate limit when max is 0', async () => {
+    const app = createTestApp(0);
+
+    const first = await request(app).get('/test');
+    const second = await request(app).get('/test');
+
+    expect(first.status).toBe(200);
+    expect(second.status).toBe(200);
+  });
+
+  it('tracks limits independently per user', async () => {
+    const config = mockServices.rootConfig({
+      data: {
+        'intelligent-assistant': {
+          rateLimit: {
+            general: { max: 1 },
+          },
+        },
+      },
+    });
+    const app = express();
+
+    app.use((req, _res, next) => {
+      req.credentials = { $$type: '@backstage/BackstageCredentials' } as any;
+      req.userEntityRef =
+        req.headers['x-test-user']?.toString() ?? 'user:default/user-a';
+      next();
+    });
+    app.get(
+      '/test',
+      createRateLimitMiddleware(config, 'general'),
+      (_req, res) => {
+        res.json({ ok: true });
+      },
+    );
+
+    await request(app).get('/test').set('x-test-user', 'user:default/user-a');
+    const blockedForA = await request(app)
+      .get('/test')
+      .set('x-test-user', 'user:default/user-a');
+    const allowedForB = await request(app)
+      .get('/test')
+      .set('x-test-user', 'user:default/user-b');
+
+    expect(blockedForA.status).toBe(429);
+    expect(allowedForB.status).toBe(200);
+  });
+});
