Last Updated: Task 1.3 - Fast Lane Processing (Dec 2024)

Token Compaction: 24,627 tokens (102,707 chars)
```
RAGBase/
├── apps/
│   ├── backend/                          # Node.js + Fastify API
│   │   ├── src/
│   │   │   ├── app.ts                    # Fastify initialization
│   │   │   ├── middleware/
│   │   │   │   └── auth-middleware.ts    # Timing-safe API key validation
│   │   │   ├── services/
│   │   │   │   ├── database.ts           # Prisma singleton (NEW: Phase 04)
│   │   │   │   ├── hash-service.ts       # MD5 hashing
│   │   │   │   └── embedding-service.ts  # Vector embeddings
│   │   │   └── routes/
│   │   │       ├── documents/
│   │   │       │   ├── upload-route.ts   # File upload + path traversal protection
│   │   │       │   ├── status-route.ts   # Document status with Prisma singleton
│   │   │       │   └── list-route.ts     # List documents with SafeParse validation
│   │   │       ├── query/
│   │   │       │   └── search-route.ts   # Vector search with parameterized queries
│   │   │       └── health-route.ts       # Health check endpoint
│   │   ├── prisma/
│   │   │   └── schema.prisma             # Database schema
│   │   └── vitest.config.ts              # Test configuration
│   └── ai-worker/                        # Python worker (Phase 07)
├── tests/
│   ├── helpers/
│   │   └── api.ts                        # API test utilities
│   ├── integration/
│   │   ├── middleware/
│   │   │   └── auth-middleware.test.ts
│   │   └── routes/
│   │       ├── search-route.test.ts      # SQL injection prevention tests
│   │       ├── upload-route.test.ts      # Path traversal tests
│   │       └── status-route.test.ts
│   ├── setup/
│   │   └── global-setup.ts               # Test environment setup
│   └── fixtures/                         # Test data
├── docker/                               # Dockerfiles
├── docs/                                 # Documentation (this directory)
└── plans/                                # Implementation plans
```
File: apps/backend/src/services/database.ts
Implements the Prisma Client singleton pattern to prevent connection pool exhaustion:
```typescript
import { PrismaClient } from '@prisma/client';

// Module-level cache: created on first use, then shared by every caller.
let prismaInstance: PrismaClient | null = null;

export function getPrismaClient(): PrismaClient {
  if (!prismaInstance) {
    prismaInstance = new PrismaClient({
      log: process.env.NODE_ENV === 'development'
        ? ['query', 'warn', 'error']
        : ['error'],
    });
  }
  return prismaInstance;
}
```
Why: Each PrismaClient instance opens its own connection pool, so creating several of them can exhaust the database's available connections. The singleton ensures:
- Single connection pool across app
- Clean shutdown via `disconnectPrisma()`
- Environment-aware logging

Adopted by:
- `status-route.ts` (document queries)
- `upload-route.ts` (duplicate check)
- `search-route.ts` (vector search)
- `list-route.ts` (document listing)
File: apps/backend/src/services/hash-service.ts
Provides MD5 hashing for file deduplication. Used in upload-route.ts for:
- Detecting duplicate files
- Generating unique storage paths (prevents filename collisions)
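The two uses above can be sketched as follows, assuming Node's built-in `crypto` module. The names `md5Hex`, `registerUpload`, and the in-memory `seenDigests` set are hypothetical stand-ins: the real service's API is not shown here, and the real duplicate check goes through the database.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hex MD5 digest of a file's content.
function md5Hex(content: string | Uint8Array): string {
  return createHash("md5").update(content).digest("hex");
}

// Hypothetical in-memory stand-in for the unique md5Hash index in the database.
const seenDigests = new Set<string>();

// Returns the digest for new content, or null when identical bytes were already stored.
function registerUpload(content: string | Uint8Array): string | null {
  const digest = md5Hex(content);
  if (seenDigests.has(digest)) {
    return null; // duplicate: same bytes, whatever the filename was
  }
  seenDigests.add(digest);
  return digest; // can double as the storage filename
}
```

Because the digest depends only on the bytes, two uploads with different filenames but identical content map to the same digest, which is what makes the duplicate check work.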
File: apps/backend/src/services/embedding-service.ts
Generates vector embeddings using fastembed (self-hosted ONNX-based):
- Model: sentence-transformers/all-MiniLM-L6-v2 (384-dim)
- Methods:
  - `embed(text)` - Single text → vector
  - `embedBatch(texts)` - Batch texts → vectors (parallel processing)
  - `cosineSimilarity(vec1, vec2)` - Compute similarity score
  - `findSimilar(queryEmbedding, candidates, topK)` - Find top-K similar vectors
- Features: Lazy initialization (singleton pattern), batch processing with generators
- Used by: Fast lane processing (upload-route), vector search (search-route)
NEW (Task 1.3): Full batch embedding support for fast lane processing.
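The similarity helpers listed above can be sketched in plain TypeScript. This is an illustration of the math only, not the service's actual code: the return shape of `findSimilar` (an index plus a score) is an assumption, and the model call itself is left out.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 if either has zero norm.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Assumed result shape: position of the candidate plus its score.
interface ScoredCandidate {
  index: number;
  score: number;
}

// Scores every candidate against the query and keeps the topK highest scores.
function findSimilar(query: number[], candidates: number[][], topK: number): ScoredCandidate[] {
  return candidates
    .map((candidate, index) => ({ index, score: cosineSimilarity(query, candidate) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

This brute-force scan is fine for small candidate sets; for search over stored chunks the document relies on pgvector instead.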
File: apps/backend/src/services/chunker-service.ts
Text chunking using LangChain MarkdownTextSplitter:
- Config: 1000-char chunks with 200-char overlap
- Returns: Chunks with metadata (charStart, charEnd, heading, page)
- Heading extraction: Parses markdown headers from chunk content
- Position tracking: Maintains character positions in original text
- Used by: Fast lane processing (upload-route), quality validation
NEW (Task 1.3): Core component of fast lane processing pipeline.
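To make the size, overlap, and position-tracking settings concrete, here is a minimal fixed-window splitter. It is deliberately simpler than LangChain's MarkdownTextSplitter, which prefers to split on markdown-aware separators; the names `chunkText` and `TextChunk` are hypothetical, and treating `charEnd` as exclusive is a choice made for this sketch.

```typescript
interface TextChunk {
  content: string;
  charStart: number; // inclusive offset in the original text
  charEnd: number;   // exclusive offset in the original text (sketch's convention)
}

// Slides a window of chunkSize characters, advancing by chunkSize - overlap each step.
function chunkText(text: string, chunkSize = 1000, overlap = 200): TextChunk[] {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const chunks: TextChunk[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push({ content: text.slice(start, end), charStart: start, charEnd: end });
    if (end === text.length) {
      break; // the final window reached the end of the text
    }
  }
  return chunks;
}
```

With the defaults, consecutive chunks start 800 characters apart, so each chunk repeats the last 200 characters of the previous one.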
File: apps/backend/src/services/fast-lane-processor.ts
High-level orchestrator for immediate JSON/TXT/MD processing:
- Flow: Chunk → Quality Gate → Embed → Store
- Quality gate: Validates text length and noise ratio before processing
- Database: Uses raw SQL INSERT for chunk storage (pgvector compatibility)
- Error handling: Marks documents as FAILED with reason codes
- Status: Updates from PENDING → PROCESSING → COMPLETED/FAILED
Implementation note: upload-route.ts currently inlines the fast lane logic for tighter control. The service provides a reusable pipeline for a potential future queue-based fast lane.
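A quality gate of the kind described above might look like the sketch below. Everything specific in it is an assumption: the thresholds, the reason codes, and the definition of "noise" (characters outside ASCII letters, digits, whitespace, and common punctuation) are invented for illustration, since this document does not give the real values.

```typescript
interface QualityResult {
  ok: boolean;
  reason?: "TEXT_TOO_SHORT" | "TOO_NOISY"; // hypothetical reason codes
}

// Hypothetical thresholds; the real values are not given in this document.
const MIN_TEXT_LENGTH = 50;
const MAX_NOISE_RATIO = 0.3;

// Fraction of characters that are not ASCII letters, digits, whitespace, or common punctuation.
function noiseRatio(text: string): number {
  if (text.length === 0) {
    return 1;
  }
  const noisyCount = text.replace(/[A-Za-z0-9\s.,;:!?'"()-]/g, "").length;
  return noisyCount / text.length;
}

// Rejects text that is too short or mostly noise before any chunking or embedding happens.
function checkQuality(text: string): QualityResult {
  const trimmed = text.trim();
  if (trimmed.length < MIN_TEXT_LENGTH) {
    return { ok: false, reason: "TEXT_TOO_SHORT" };
  }
  if (noiseRatio(trimmed) > MAX_NOISE_RATIO) {
    return { ok: false, reason: "TOO_NOISY" };
  }
  return { ok: true };
}
```

Running the gate before embedding keeps failed documents cheap: a rejected file can be marked FAILED with its reason code without ever loading the model.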
File: apps/backend/src/middleware/auth-middleware.ts
Prevents timing attacks on API key comparison:
```typescript
// Constant-time comparison using crypto.timingSafeEqual.
// timingSafeEqual throws when the buffers differ in length, so lengths are
// checked first; its boolean return value is the actual comparison result.
if (apiKeyBuffer.length === expectedKeyBuffer.length) {
  try {
    isValid = timingSafeEqual(apiKeyBuffer, expectedKeyBuffer);
  } catch {
    isValid = false;
  }
}
```
Public Routes (no auth required):
- `/health` - Health check
- `/internal/callback` - Worker callback endpoint
All other routes require X-API-Key header.
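Put together, the check can be sketched as two small functions using Node's built-in `crypto.timingSafeEqual`. The function names and the exact-match handling of public paths are assumptions for illustration; the real middleware is wired into Fastify and is not reproduced here.

```typescript
import { timingSafeEqual } from "node:crypto";

// Routes the document lists as public (matched exactly in this sketch).
const PUBLIC_ROUTES = new Set<string>(["/health", "/internal/callback"]);

// Compares the provided key to the expected one in constant time.
function isValidApiKey(provided: string | undefined, expected: string): boolean {
  if (provided === undefined) {
    return false;
  }
  const providedBuffer = Buffer.from(provided, "utf8");
  const expectedBuffer = Buffer.from(expected, "utf8");
  // timingSafeEqual throws on unequal lengths, so guard first.
  if (providedBuffer.length !== expectedBuffer.length) {
    return false;
  }
  // The return value is the comparison result and must not be ignored.
  return timingSafeEqual(providedBuffer, expectedBuffer);
}

// Public routes bypass the key check; everything else needs a valid X-API-Key value.
function isAuthorized(routePath: string, apiKey: string | undefined, expected: string): boolean {
  return PUBLIC_ROUTES.has(routePath) || isValidApiKey(apiKey, expected);
}
```

The early length check does reveal whether the lengths match. If that matters, one common alternative is to hash both values to a fixed size before comparing.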
File: apps/backend/src/routes/documents/upload-route.ts
Prevents directory traversal attacks:
```typescript
// Validate filename with basename() + length check
const sanitizedFilename = basename(filename);
if (sanitizedFilename !== filename || sanitizedFilename.length === 0 || sanitizedFilename.length > 255) {
  return reply.status(400).send({
    error: 'INVALID_FILENAME',
    message: 'Filename contains invalid characters or exceeds length limit',
  });
}
// Store using MD5 hash only (prevents path traversal)
const filePath = path.join(UPLOAD_DIR, md5Hash);
```
Why:
- `basename()` strips directory components, so a filename that changes under it contained a path separator and is rejected
- MD5 hash storage prevents arbitrary filesystem paths
- Length limit (255 chars) prevents filesystem issues
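The same two-layer defense can be written as standalone functions, which makes the rules easy to unit test. The names `validateFilename` and `storagePath` are hypothetical; only `basename` and `join` from Node's `path` module are real APIs.

```typescript
import { basename, join } from "node:path";

interface FilenameCheck {
  ok: boolean;
  error?: "INVALID_FILENAME";
}

// Layer 1: reject any filename that basename() would change, plus empty or overlong names.
function validateFilename(filename: string): FilenameCheck {
  const sanitized = basename(filename);
  if (sanitized !== filename || sanitized.length === 0 || sanitized.length > 255) {
    return { ok: false, error: "INVALID_FILENAME" };
  }
  return { ok: true };
}

// Layer 2: the stored path is built from the content hash alone, never from the client filename.
function storagePath(uploadDir: string, md5Hash: string): string {
  return join(uploadDir, md5Hash);
}
```

Note that a bare `..` passes the first check, because `basename('..')` returns `..` unchanged. That is harmless here only because the second layer never puts the client filename into the path.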
File: apps/backend/src/routes/query/search-route.ts
Prevents SQL injection in pgvector queries:
```typescript
const results = await prisma.$queryRaw<...>`
  SELECT ... FROM chunks c
  ORDER BY c.embedding <=> ${JSON.stringify(queryEmbedding)}::vector
  LIMIT ${topK}
`;
```
Why: Prisma's `$queryRaw` tagged template sends each interpolated value as a bound parameter instead of splicing it into the SQL text, so user input is never concatenated into the query.
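The mechanism behind this is the tagged template itself: the tag function receives the literal SQL fragments and the interpolated values as separate arguments. The toy `sql` tag below illustrates that separation. It is a hypothetical teaching aid, not Prisma's implementation.

```typescript
interface ParameterizedQuery {
  text: string;      // SQL with $1, $2, ... placeholders
  values: unknown[]; // values sent separately from the SQL text
}

// Hypothetical tag function: joins the literal fragments with numbered placeholders
// and returns the interpolated values untouched.
function sql(strings: TemplateStringsArray, ...values: unknown[]): ParameterizedQuery {
  let text = strings[0];
  for (let i = 0; i < values.length; i++) {
    text += "$" + (i + 1) + strings[i + 1];
  }
  return { text, values };
}

// A hostile value never reaches the SQL text; it only appears in the values array.
const hostileInput = "5; DROP TABLE chunks";
const builtQuery = sql`SELECT id FROM chunks LIMIT ${hostileInput}`;
```

Here `builtQuery.text` contains only a `$1` placeholder, and the hostile string travels as data that the database driver binds separately.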
Route: POST /api/documents
File: apps/backend/src/routes/documents/upload-route.ts
Flow:
- Validate file size (50MB max)
- Detect format (pdf, docx, xlsx, json, txt, md, csv)
- Validate filename (no path traversal)
- Calculate MD5 hash
- Check for duplicates via Prisma singleton
- Save file using MD5 hash path
- Create document record with file I/O error handling + rollback
- FAST LANE (NEW Task 1.3): Process JSON/TXT/MD files immediately:
  - Read file content
  - Chunk text using MarkdownTextSplitter (LangChain)
  - Generate embeddings via fastembed (self-hosted ONNX)
  - Store chunks + vectors directly in PostgreSQL/pgvector
  - Mark document as COMPLETED
- HEAVY LANE: Queue PDF/DOCX for Python worker processing
Fast Lane Processing (Task 1.3):
- Supported formats: JSON, TXT, MD
- Chunking: LangChain MarkdownTextSplitter (1000 char chunks, 200 char overlap)
- Embeddings: fastembed all-MiniLM-L6-v2 (384-dim vectors)
- Storage: Batch insert chunks with raw SQL + pgvector type cast
- Status flow: PENDING → (immediate processing) → COMPLETED/FAILED
- Error handling: Quality gate validation, proper error propagation
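The lane decision described above reduces to a lookup by format. The sketch below is an assumption-laden illustration: the extension-based `detectFormat` is hypothetical (the real route may also inspect MIME types), and routing xlsx and csv to the heavy lane is a guess, since the document only names PDF/DOCX for that lane.

```typescript
type Lane = "FAST" | "HEAVY";
type Format = "pdf" | "docx" | "xlsx" | "json" | "txt" | "md" | "csv";

// JSON/TXT/MD go to the fast lane per the document.
// Assumption: every other supported format is queued for the heavy lane.
const LANE_BY_FORMAT: Record<Format, Lane> = {
  json: "FAST",
  txt: "FAST",
  md: "FAST",
  pdf: "HEAVY",
  docx: "HEAVY",
  xlsx: "HEAVY", // guess: not stated in the document
  csv: "HEAVY",  // guess: not stated in the document
};

// Hypothetical detection by file extension; returns null for unsupported files.
function detectFormat(filename: string): Format | null {
  const dot = filename.lastIndexOf(".");
  if (dot < 0) {
    return null;
  }
  const ext = filename.slice(dot + 1).toLowerCase();
  return Object.prototype.hasOwnProperty.call(LANE_BY_FORMAT, ext) ? (ext as Format) : null;
}

function selectLane(format: Format): Lane {
  return LANE_BY_FORMAT[format];
}
```

Keeping the mapping in one table means adding a format to the fast lane later is a one-line change.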
Error Handling:
- 400: Invalid file, unsupported format, path traversal attempt
- 409: Duplicate file detected
- 413: File exceeds 50MB (Payload Too Large)
- 500: Storage error (with DB cleanup on failure), fast lane processing errors
Features (Phase 04+):
- File I/O rollback on DB failure (cleanup written file)
- Path traversal protection via `basename()` + MD5 hash
- Prisma singleton for connection efficiency
- NEW (Task 1.3): Immediate fast lane processing with embeddings
- NEW (Task 1.3): pgvector integration for semantic search readiness
Route: GET /api/documents/:id
File: apps/backend/src/routes/documents/status-route.ts
Returns document metadata including chunk count (when completed).
New Features (Phase 04):
- SafeParse validation for UUID format
- Prisma singleton for queries
- Proper 400 vs 404 error codes
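The 400 versus 404 distinction comes down to the order of two checks: is the id well-formed, and does a document with that id exist. In the sketch below a regular expression stands in for the Zod UUID check and a `Map` stands in for the Prisma lookup; both substitutions, and the name `lookupStatus`, are assumptions made to keep the example self-contained.

```typescript
// Stand-in for Zod's UUID validation.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

interface StoredDocument {
  id: string;
  status: "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";
}

interface StatusResponse {
  code: 200 | 400 | 404;
  body: unknown;
}

// Malformed ids are a client error (400); well-formed but unknown ids are 404.
function lookupStatus(id: string, store: Map<string, StoredDocument>): StatusResponse {
  if (!UUID_PATTERN.test(id)) {
    return { code: 400, body: { error: "VALIDATION_ERROR", message: "id must be a UUID" } };
  }
  const found = store.get(id);
  if (!found) {
    return { code: 404, body: { error: "NOT_FOUND", message: "Document not found" } };
  }
  return { code: 200, body: found };
}
```

Validating first also means a malformed id never reaches the database.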
Route: GET /api/documents
File: apps/backend/src/routes/documents/list-route.ts
Lists all documents with pagination support.
New Features (Phase 04):
- SafeParse validation for query parameters
- Proper error handling (400 for validation, 500 for server errors)
Route: POST /api/query
File: apps/backend/src/routes/query/search-route.ts
Semantic search across document chunks.
Request Body:
```json
{
  "query": "search text",
  "topK": 5
}
```
Response:
```json
{
  "results": [
    {
      "content": "chunk text",
      "score": 0.85,
      "documentId": "uuid",
      "metadata": {
        "charStart": 0,
        "charEnd": 100,
        "page": 1,
        "heading": "Section Title"
      }
    }
  ]
}
```
New Features (Phase 04):
- SQL injection prevention via Prisma parameter binding
- Proper 400 vs 503 error codes (validation vs service errors)
Route: GET /health
File: apps/backend/src/routes/health-route.ts
No authentication required. Returns {"status":"ok"}.
File: apps/backend/src/validators/index.ts (via Zod)
- File size ≤ 50MB
- Supported formats: pdf, docx, xlsx, json, txt, md, csv
- Mime type matching
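```typescript
const QuerySchema = z.object({
  // trim() comes first so that whitespace-only queries fail min(1)
  query: z.string().trim().min(1).max(1000),
  topK: z.number().int().min(1).max(100).default(5),
});
```
All routes use Zod's `safeParse()` so that validation failures return 400 with detailed messages instead of throwing.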
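To show what these rules accept and reject without pulling in Zod, here is a hand-written stand-in with a `safeParse`-like result. It is not the project's validator: `parseQuery` and its error strings are hypothetical, and trimming before the length check is this sketch's choice.

```typescript
interface SearchQuery {
  query: string;
  topK: number;
}

interface ParseOutcome {
  success: boolean;
  data?: SearchQuery;
  error?: string;
}

// Mirrors the schema's rules: query is 1 to 1000 chars after trimming, topK is an integer 1 to 100 (default 5).
function parseQuery(input: unknown): ParseOutcome {
  if (typeof input !== "object" || input === null) {
    return { success: false, error: "body must be an object" };
  }
  const body = input as Record<string, unknown>;
  if (typeof body.query !== "string") {
    return { success: false, error: "query must be a string" };
  }
  const query = body.query.trim();
  if (query.length < 1 || query.length > 1000) {
    return { success: false, error: "query must be 1 to 1000 characters" };
  }
  const topK = body.topK === undefined ? 5 : body.topK;
  if (typeof topK !== "number" || !Number.isInteger(topK) || topK < 1 || topK > 100) {
    return { success: false, error: "topK must be an integer from 1 to 100" };
  }
  return { success: true, data: { query, topK } };
}
```

Returning a result object rather than throwing is what lets a route map a failure directly to a 400 response.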
File: apps/backend/prisma/schema.prisma
Document
- `id` (UUID, PK)
- `filename` (String)
- `mimeType` (String)
- `fileSize` (Int)
- `format` (Enum: pdf, docx, xlsx, json, txt, md, csv)
- `lane` (Enum: FAST, HEAVY)
- `status` (Enum: PENDING, PROCESSING, COMPLETED, FAILED)
- `filePath` (String) - MD5-hashed path
- `md5Hash` (String, unique index) - Deduplication
- `retryCount` (Int, default: 0)
- `failReason` (String, nullable)
- `createdAt`, `updatedAt` (DateTime)
- Relations: `chunks` (1-to-many)

Chunk
- `id` (UUID, PK)
- `documentId` (UUID, FK)
- `content` (String) - Chunk text
- `embedding` (Vector 384d) - pgvector type
- `charStart`, `charEnd` (Int) - Character position in original
- `page` (Int, nullable) - Page number
- `heading` (String, nullable) - Markdown heading
- Relations: `document` (many-to-1)
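For readers working in TypeScript, the two models translate roughly into the interfaces below. These are hand-written approximations, not the types Prisma generates, and `toSearchResult` is a hypothetical helper showing how chunk fields could map onto the search response shape; passing nullable `page` and `heading` through unchanged is an assumption.

```typescript
interface DocumentRecord {
  id: string;
  filename: string;
  mimeType: string;
  fileSize: number;
  format: "pdf" | "docx" | "xlsx" | "json" | "txt" | "md" | "csv";
  lane: "FAST" | "HEAVY";
  status: "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";
  filePath: string;
  md5Hash: string;
  retryCount: number;
  failReason: string | null;
  createdAt: Date;
  updatedAt: Date;
}

interface ChunkRecord {
  id: string;
  documentId: string;
  content: string;
  embedding: number[]; // 384 values for the model named in this document
  charStart: number;
  charEnd: number;
  page: number | null;
  heading: string | null;
}

// Hypothetical mapping from a stored chunk to one entry of the search response.
function toSearchResult(chunk: ChunkRecord, score: number) {
  return {
    content: chunk.content,
    score,
    documentId: chunk.documentId,
    metadata: {
      charStart: chunk.charStart,
      charEnd: chunk.charEnd,
      page: chunk.page,
      heading: chunk.heading,
    },
  };
}
```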
File: tests/helpers/api.ts
Provides utilities for:
- Setting up test server
- Making authenticated API requests
- Mocking worker responses
File: tests/setup/global-setup.ts
Initializes:
- Testcontainers (PostgreSQL + Redis)
- Database migrations
- Test environment variables
Path Traversal Tests (upload-route.test.ts):
- Reject filenames with path separators
- Reject filenames exceeding 255 chars
SQL Injection Tests (search-route.test.ts):
- Verify parameterized query execution
- Test pgvector query safety
Timing Attack Tests (auth-middleware.test.ts):
- Verify constant-time comparison
- Test public route bypass
Database:
- `DATABASE_URL` - PostgreSQL connection string

File Storage:
- `UPLOAD_DIR` - Directory for file uploads (default: `/tmp/uploads`)

Security:
- `API_KEY` - Shared secret for API authentication
- `NODE_ENV` - development or production (affects logging)

Processing:
- `REDIS_URL` - Redis connection (Phase 05+)
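One way to read these variables at startup is a small loader that fails fast on missing values. This is a sketch under assumptions: `loadConfig` and `AppConfig` are hypothetical names, treating `DATABASE_URL` and `API_KEY` as required is a guess, and so is defaulting `NODE_ENV` to production. Only the `/tmp/uploads` default comes from the list above.

```typescript
interface AppConfig {
  databaseUrl: string;
  uploadDir: string;
  apiKey: string;
  nodeEnv: "development" | "production";
  redisUrl?: string;
}

// Takes the environment as a parameter so it can be tested without touching process.env.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const required = (key: string): string => {
    const value = env[key];
    if (!value) {
      throw new Error("Missing required environment variable: " + key);
    }
    return value;
  };
  return {
    databaseUrl: required("DATABASE_URL"),      // assumed required
    uploadDir: env.UPLOAD_DIR || "/tmp/uploads", // default from the document
    apiKey: required("API_KEY"),                // assumed required
    nodeEnv: env.NODE_ENV === "development" ? "development" : "production",
    redisUrl: env.REDIS_URL,                    // optional until Phase 05
  };
}
```

In the application this would be called once as `loadConfig(process.env)` during startup.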
```json
{
  "error": "VALIDATION_ERROR",
  "message": "Detailed validation issue"
}
```
```json
{
  "error": "UNAUTHORIZED",
  "message": "Invalid or missing API key"
}
```
```json
{
  "error": "NOT_FOUND",
  "message": "Document not found"
}
```
```json
{
  "error": "STORAGE_ERROR",
  "message": "Failed to save file: ..."
}
```
```json
{
  "error": "EMBEDDING_SERVICE_ERROR",
  "message": "Failed to generate query embedding: ..."
}
```
```bash
# All tests
pnpm test
# Watch mode
pnpm test:watch
# Integration only
pnpm test:integration
# Coverage
pnpm test:coverage
```
```bash
# Generate Prisma client
pnpm --filter @ragbase/backend db:generate
# Push schema to DB
pnpm --filter @ragbase/backend db:push
# Create migration
pnpm --filter @ragbase/backend db:migrate
```
```bash
# Start services (Docker required)
docker compose up -d
# Run server
pnpm dev
# Verify health
curl http://localhost:3000/health
```

| Decision | Rationale | Implementation |
|---|---|---|
| Prisma Singleton | Prevent connection pool exhaustion | services/database.ts |
| Timing-Safe Auth | Prevent timing attacks on API key | crypto.timingSafeEqual() |
| Path Traversal Protection | Prevent directory escape attacks | basename() + MD5 hash storage |
| SQL Injection Prevention | Use parameterized queries | Prisma $queryRaw with template literals |
| File I/O Rollback | Maintain consistency if DB fails | Cleanup written files on DB errors |
| SafeParse Validation | Proper error codes (400 vs 500) | Zod safeParse() in all routes |
| MD5 Hash Storage | Content-derived paths that avoid filename collisions (MD5 is not resistant to deliberate collisions) | HashService.md5() for filenames |
| Fast Lane Processing (Task 1.3) | Immediate response for simple formats | Inline chunking + embedding in upload-route |
| Self-Hosted Embeddings (Task 1.3) | No external API dependency | fastembed ONNX model + batch processing |
| Raw SQL for Chunks (Task 1.3) | pgvector type compatibility | Prisma $executeRaw with ::vector cast |
| Dual Lane Architecture (Task 1.3) | Optimize for different file types | Fast lane (JSON/TXT/MD) vs Heavy lane (PDF/DOCX) |
- Phase 05: Queue integration (BullMQ) with proper job retry logic & callback handling
- Phase 06: E2E pipeline testing with Docling (Python worker) processing for PDF/DOCX
- Phase 07: Python AI Worker deployment (Docling → markdown extraction)
- Phase 08: Frontend UI (React + Vite) with upload + search interface
- Phase 09: Production hardening & scaling (monitoring, alerts, load testing)