+- **Prefix**: Every route file uses `APIRouter(prefix="/api/v1")`. Individual route decorators use relative paths (e.g., `@router.post("/recommend")`), **not** full paths.
+- **Health check exception**: `/health` stays at root with no prefix (standard for load balancer probes). This is the only endpoint outside `/api/v1/`.
+- **Versioning**: All endpoints are under `/api/v1/`. When a v2 is needed, add new route files with `prefix="/api/v2"`.
+- **Naming**: Use kebab-case for multi-word paths (e.g., `/deploy-to-cluster`, `/ranked-recommend-from-spec`).
+- **When adding a new route file**: Set `prefix="/api/v1"` on the `APIRouter` and use relative paths in all decorators. Register the router in `backend/src/api/routes/__init__.py` and include it in `backend/src/api/app.py`.
+
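The prefix and naming conventions above can be checked mechanically. The following is a minimal sketch using only the standard library; `is_relative_kebab_path` and `full_path` are hypothetical helpers made up for illustration and are not part of the repository. It assumes decorator paths contain no path parameters such as `/{id}`, and it does not cover the root-level `/health` exception.

```python
import re

# Prefix taken from the conventions above; supplied by the router, not the decorator.
API_PREFIX = "/api/v1"

# Assumed definition of kebab-case: lowercase alphanumeric words joined by single hyphens.
_KEBAB_SEGMENT = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_relative_kebab_path(path: str) -> bool:
    """Return True if `path` is a relative, kebab-case decorator path.

    Hypothetical helper: rejects paths that already include the API prefix
    and paths whose segments are not kebab-case.
    """
    if not path.startswith("/") or path.startswith(API_PREFIX):
        return False
    segments = path.strip("/").split("/")
    return all(_KEBAB_SEGMENT.match(segment) for segment in segments)


def full_path(relative: str) -> str:
    """Show the URL a client would call once the router prefix is applied."""
    return API_PREFIX + relative
```

A check like this could run in a unit test over the decorator paths of each route file, so a full path such as `/api/v1/recommend` in a decorator is caught before review.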
 ### Common Editing Patterns
 
 **Adding a new use case template**:
@@ -214,6 +241,13 @@ The recommendation engine uses **multi-criteria scoring** to rank configurations
 5. Update dashboard example if applicable
 6. Update docs/architecture-diagram.md data model ERD
 
+**Adding a new API endpoint**:
+1. Add the route to the appropriate file in `backend/src/api/routes/` (or create a new route file)
+2. Use a relative path in the decorator (e.g., `@router.get("/my-endpoint")`) — the `/api/v1` prefix comes from the router
+3. If creating a new route file, set `APIRouter(prefix="/api/v1")` and register it in `routes/__init__.py` and `app.py`
+4. Update `ui/app.py` if the UI calls the new endpoint
+5. Update documentation (docs/DEVELOPER_GUIDE.md, docs/ARCHITECTUREv2.md) with the new endpoint
+
 **Adding a new component**:
 1. Add numbered section to docs/ARCHITECTURE.md (maintain sequential numbering)
 2. Update "Architecture Components" count in Overview
@@ -294,7 +328,7 @@ The system now supports two deployment modes:
 - **Purpose**: GPU-free development and testing on local machines
 - **Location**: `simulator/` directory contains the vLLM simulator service
 - **Docker Image**: `vllm-simulator:latest` (single image for all models)
-- **Configuration**: Set `DeploymentGenerator(simulator_mode=True)` in `backend/src/api/routes.py`
+- **Configuration**: Set `DeploymentGenerator(simulator_mode=True)` in `backend/src/api/dependencies.py`
 - **Benefits**:
   - No GPU hardware required
   - Fast deployment (~10-15 seconds to Ready)
@@ -304,7 +338,7 @@ The system now supports two deployment modes:
 
 ### Real vLLM Mode (Production)
 - **Purpose**: Actual model inference with GPUs
-- **Configuration**: Set `DeploymentGenerator(simulator_mode=False)` in `backend/src/api/routes.py`
+- **Configuration**: Set `DeploymentGenerator(simulator_mode=False)` in `backend/src/api/dependencies.py`
 - **Requirements**:
   - GPU-enabled Kubernetes cluster
   - NVIDIA GPU Operator installed
@@ -332,7 +366,7 @@ The system now supports two deployment modes:
 
 ### Technical Details
 
-The deployment template (`backend/src/deployment/templates/kserve-inferenceservice.yaml.j2`) uses Jinja2 conditionals:
+The deployment template (`backend/src/configuration/templates/kserve-inferenceservice.yaml.j2`) uses Jinja2 conditionals:
 - `{% if simulator_mode %}` - Uses `vllm-simulator:latest`, no GPU resources, fast health checks
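
To make the effect of the `simulator_mode` flag concrete, the branch described above can be mirrored in plain Python. This is only an illustrative sketch, not the template or the project's `DeploymentGenerator`: `container_settings` is a hypothetical function, the real-mode image is a placeholder because the text does not name it, and requesting `nvidia.com/gpu` in real mode is an assumption based on the stated GPU requirements. Health check settings are left out.

```python
from typing import Any, Dict

SIMULATOR_IMAGE = "vllm-simulator:latest"     # image named in the text above
REAL_IMAGE_PLACEHOLDER = "<real-vllm-image>"  # placeholder: actual image not given in the text


def container_settings(simulator_mode: bool, gpu_count: int = 1) -> Dict[str, Any]:
    """Hypothetical stand-in for the template's simulator_mode branch."""
    if simulator_mode:
        # Simulator: single image for all models, no GPU resources requested.
        return {"image": SIMULATOR_IMAGE, "resources": {}}
    # Real mode (assumed): request GPUs via the standard NVIDIA resource name.
    return {
        "image": REAL_IMAGE_PLACEHOLDER,
        "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
    }
```

Calling `container_settings(True)` yields a spec with no GPU limits, which is why the simulator can run on a local machine without GPU hardware.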