Farid Vatani edited this page May 13, 2025 · 4 revisions

Project Wiki — Smart URL‐Based Search

This document provides a deep-dive into the technical architecture and rationale behind the Smart Search implementation using Next.js, MongoDB Atlas, OpenAI embeddings, and modern search UX patterns.

📐 Architecture Overview

This project implements a modular, scalable, AI-augmented search system using:

  • URL-based state — Shareable, bookmarkable search results via query params
  • Hybrid Search — Combines traditional full-text and fuzzy search with vector semantic search
  • Vector Embeddings — Search beyond keywords using meaning (via OpenAI + MongoDB vector index)
  • Streaming UI — Server Components + Suspense for smooth UX without unnecessary hydration
  • Auto-complete — Type-ahead experience with MongoDB Atlas' autocomplete pipeline

🔧 MongoDB Atlas Setup

1. Create a Free Cluster

  • Go to: https://cloud.mongodb.com/
  • Create an M0 cluster (free tier)
  • Name your DB (e.g., mydb), and create a collection (e.g., recipes)

2. Create Database User

  • Go to Database Access
  • Add a user with read/write access
  • Put the cluster connection string, including this user's credentials, in your .env as DATABASE_URL

3. Network Access

  • Add your IP address to the access list (0.0.0.0/0 allows connections from any address, so use it only for short-lived local testing)

🧮 Index Configuration

Go to Search → Indexes in MongoDB Atlas

🅰️ Default Full-Text Search

  • Create a Search Index on recipes
  • Default mappings: title, description as strings

🅱️ Autocomplete Index

  • Edit your default index
  • Set titleAutocomplete
  • Rebuild the index

🆎 Vector Search Index

  • Create a new Vector Search Index
  • Collection: recipes
  • Field: embeddings
  • Dimensions: 1536 (for text-embedding-3-small)
  • Similarity: Euclidean or Cosine
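
As a rough illustration, the settings above correspond to an index definition like the one below, written as a TypeScript constant so it can be serialized and pasted into the Atlas JSON editor. The constant names are hypothetical, and picking cosine is an assumption, since the list above allows either similarity function.

```typescript
// Sketch of an Atlas Vector Search index definition for the recipes collection.
// Assumption: "cosine" is chosen here; "euclidean" is the other option listed above.
interface VectorIndexField {
  type: "vector";
  path: string;
  numDimensions: number;
  similarity: "cosine" | "euclidean" | "dotProduct";
}

const recipeVectorIndex: { fields: VectorIndexField[] } = {
  fields: [
    {
      type: "vector",
      path: "embeddings", // field that stores the OpenAI vector
      numDimensions: 1536, // output size of text-embedding-3-small
      similarity: "cosine",
    },
  ],
};

// Serialize for the Atlas JSON editor.
const recipeVectorIndexJson = JSON.stringify(recipeVectorIndex, null, 2);
```

The dimension count has to match the embedding model exactly; a mismatch between stored vectors and the index definition is a common reason for empty vector results.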

🌐 High-Level Overview

User Input
   ↓
<searchBar.tsx> (Client Component)
   ↓
URL updates via useRouter().replace()
   ↓
<page.tsx> (Server Component)
   ↓
<recipeList.tsx> (Server Component w/ Suspense)
   ↓
MongoDB Search (Text, Fuzzy, or Vector via Prisma queries or a raw aggregate command)
   ↓
Render Results
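
The URL-update step in this flow can be sketched as a small pure helper. This is a minimal sketch, not the project's actual searchBar.tsx code: the function name buildSearchUrl and the query parameter name "q" are assumptions made for illustration.

```typescript
// Hypothetical helper: builds the URL that searchBar.tsx could pass to
// useRouter().replace(). The "q" parameter name is an assumption.
function buildSearchUrl(pathname: string, currentSearch: string, query: string): string {
  // Start from the existing query string so unrelated params survive.
  const params = new URLSearchParams(currentSearch);
  const trimmed = query.trim();
  if (trimmed) {
    params.set("q", trimmed);
  } else {
    params.delete("q"); // clearing the input removes the param entirely
  }
  const qs = params.toString();
  return qs ? `${pathname}?${qs}` : pathname;
}
```

Inside the client component, the result would typically be handed to router.replace(...), with the pathname and current params coming from usePathname() and useSearchParams() in next/navigation. Keeping the string logic in a pure function makes it easy to test without rendering anything.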

🧭 Directory Structure

src/
├── app/                 # App Router structure (Next.js 15+)
│   ├── api/autocomplete/route.ts  # Suggestion API
│   ├── globals.css      # Global styles
│   ├── layout.tsx       # Shared layout
│   ├── page.tsx         # Root page – manages query param extraction
│
├── components/          # Reusable UI components
│   ├── AutoCompleteBox.tsx
│   ├── loading.tsx      # Suspense fallback
│   ├── recipeList.tsx   # Renders search results (Server Component)
│   └── searchBar.tsx    # Handles input, debounce, and query state  
│
├── hooks/               # Custom React hooks
│   ├── useClickOutside.ts
│   └── useDebounce.ts
│
├── lib/                 # Shared utilities
│   ├── db.ts            # Prisma client instance
│   └── embeddings.ts    # OpenAI vector generation
│
├── prisma/
│   ├── schema.prisma    # MongoDB schema definition for Prisma
│   └── seed.ts          # Initial recipe seeding
│
├── scripts/
│   └── generate-embeddings.ts  # Populate vector embeddings

🔍 Search Query Execution

1. Full-Text Search

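// Case-insensitive substring match on description.
// This is a plain Prisma string filter; it does not use an Atlas Search index.
const results = await db.recipe.findMany({
  where: {
    description: {
      contains: query,
      mode: "insensitive",
    },
  },
});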

2. Fuzzy Full-Text (Atlas)

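// Fuzzy text search through the Atlas Search "default" index.
// The collection name must match @@map("recipes") in the Prisma schema.
await db.$runCommandRaw({
  aggregate: "recipes",
  pipeline: [
    {
      $search: {
        index: "default",
        text: {
          query,
          path: ["title", "description"],
          fuzzy: { maxEdits: 2 }, // tolerate up to two character edits
        },
      },
    },
  ],
  cursor: {}, // the raw aggregate command requires a cursor option
});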

3. Vector Search (OpenAI)

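// Embed the query text, then find the nearest stored recipe vectors.
const embedding = await getEmbedding(query);
await db.$runCommandRaw({
  aggregate: "recipes", // must match @@map("recipes") in the Prisma schema
  pipeline: [
    {
      $vectorSearch: {
        index: "vector-index",
        path: "embeddings",
        queryVector: embedding,
        numCandidates: 100, // candidates considered before ranking
        limit: 10, // results returned
      },
    },
  ],
  cursor: {}, // the raw aggregate command requires a cursor option
});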

🧠 Vector Embeddings

This project uses OpenAI's Embedding API to convert recipe descriptions and user queries into vectors.

🔄 Workflow

  1. Seed recipes (pnpm seed)
  2. Generate embeddings (pnpm embed)
     • Each document gets an embeddings: number[] field
     • It is stored alongside the title and description
  3. On search:
     • The query is sent to OpenAI to generate a query vector
     • MongoDB runs $vectorSearch to retrieve the most relevant recipes
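
The embedding call used in both the ingestion and search steps can be sketched as follows. This is a minimal sketch, not the contents of embeddings.ts: the project's own getEmbedding takes only the query, whereas here the API key and fetch implementation are passed in as extra parameters (an assumption) so the example has no hidden dependencies. All helper names are hypothetical.

```typescript
// Minimal shape of the fetch function this sketch needs (the global fetch fits it).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const EMBEDDING_MODEL = "text-embedding-3-small";
const EMBEDDING_DIMENSIONS = 1536;

// Builds the POST request for the OpenAI embeddings endpoint.
function buildEmbeddingRequest(text: string, apiKey: string) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: EMBEDDING_MODEL, input: text }),
  };
}

// Extracts the first vector from the response and checks its length.
function parseEmbeddingResponse(payload: unknown): number[] {
  const data = (payload as { data?: { embedding?: unknown }[] } | null)?.data;
  const vector = data?.[0]?.embedding;
  if (!Array.isArray(vector) || vector.length !== EMBEDDING_DIMENSIONS) {
    throw new Error("Unexpected embedding response shape");
  }
  return vector as number[];
}

async function getEmbedding(text: string, apiKey: string, fetchImpl: FetchLike): Promise<number[]> {
  const res = await fetchImpl(
    "https://api.openai.com/v1/embeddings",
    buildEmbeddingRequest(text, apiKey)
  );
  if (!res.ok) throw new Error(`Embedding request failed with status ${res.status}`);
  return parseEmbeddingResponse(await res.json());
}
```

Checking the vector length up front catches a model mismatch before a wrongly sized vector reaches the database or the $vectorSearch query.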

🔍 Search Types

| Feature       | Tech Used                   | Fallback / Notes                 |
|---------------|-----------------------------|----------------------------------|
| Full-text     | $search on MongoDB          | Prisma string filter (if needed) |
| Fuzzy Match   | $search + fuzzy             | Misspell-tolerant                |
| Autocomplete  | $search: autocomplete       | Debounced via useDebounce        |
| Vector Search | $vectorSearch on embeddings | Optional fallback to text search |
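
The optional fallback from vector search to text search might look like the sketch below. Whether the project falls back on errors, on empty results, or on both is not stated, so handling both here is an assumption. The two search functions are injected and hypothetical; in the project they would wrap the queries shown under Search Query Execution.

```typescript
// Sketch of "vector first, text as fallback".
type RecipeHit = { title: string; description: string };
type SearchFn = (query: string) => Promise<RecipeHit[]>;

async function searchWithFallback(
  query: string,
  vectorSearch: SearchFn,
  textSearch: SearchFn
): Promise<{ mode: "vector" | "text"; hits: RecipeHit[] }> {
  try {
    const hits = await vectorSearch(query);
    if (hits.length > 0) return { mode: "vector", hits };
  } catch {
    // Embedding or vector query failed; fall through to text search.
  }
  return { mode: "text", hits: await textSearch(query) };
}
```

Returning the mode alongside the hits lets the UI or logs show which path served a given query, which helps when tuning the two indexes.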

🧰 Tools & Hooks

  • useDebounce.ts — Prevents rapid-fire API calls from keystrokes
  • useClickOutside.ts — Closes the autocomplete dropdown when the user clicks outside the autocomplete box
  • db.ts — Ensures a single Prisma client instance is reused in development
  • embeddings.ts — Thin wrapper around OpenAI embedding endpoint
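
The timing idea behind useDebounce can be illustrated without React. The sketch below is a framework-free debounce function, not the contents of useDebounce.ts; the function name and the returned call/cancel pair are assumptions made for the example.

```typescript
// Framework-free debounce: only the last call within `delayMs` actually runs.
function debounce<Args extends unknown[]>(
  fn: (...args: Args) => void,
  delayMs: number
): { call: (...args: Args) => void; cancel: () => void } {
  let timer: ReturnType<typeof setTimeout> | undefined;

  const cancel = () => {
    if (timer !== undefined) clearTimeout(timer);
    timer = undefined;
  };

  const call = (...args: Args) => {
    cancel(); // a newer keystroke replaces the pending call
    timer = setTimeout(() => {
      timer = undefined;
      fn(...args);
    }, delayMs);
  };

  return { call, cancel };
}
```

A React hook version would usually wrap the same timer in useEffect and call cancel in the cleanup function, so a pending request is dropped when the component unmounts.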

⚙️ Prisma Schema (MongoDB)

model Recipe {
  id          String   @id @default(auto()) @map("_id") @db.ObjectId
  title       String 
  description String
  embeddings  Float[] // Store embeddings as an array of floats 
  createdAt   DateTime @default(now())
  updatedAt   DateTime @updatedAt

  @@map("recipes")
}

🔄 API Route: /api/autocomplete

Handles GET requests like:

/api/autocomplete?q=chick

Uses the autocomplete operator inside a $search stage to return recipe title suggestions based on partial input.
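
A pipeline for this route could be built roughly as follows. This is a sketch under assumptions: the "default" index is taken to map title as an autocomplete field, the limit of five suggestions is arbitrary, and buildAutocompletePipeline is a hypothetical helper name.

```typescript
// Builds the aggregation pipeline for title suggestions.
// Returns null for blank input so the caller can skip the database round trip.
function buildAutocompletePipeline(
  q: string,
  limit = 5
): Record<string, unknown>[] | null {
  const query = q.trim();
  if (!query) return null;
  return [
    // Match partial input against the autocomplete-mapped title field.
    { $search: { index: "default", autocomplete: { query, path: "title" } } },
    { $limit: limit },
    // Return only the title to keep the response small.
    { $project: { _id: 0, title: 1 } },
  ];
}
```

In route.ts the returned array would be passed as the pipeline of a raw aggregate command against the recipes collection, and a null result would map to an empty suggestion list.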

🔌 External Services

OpenAI Embeddings

  • Model: text-embedding-3-small
  • Used for:
    • Embedding descriptions on ingestion (generate-embeddings.ts)
    • Embedding user queries at runtime (in recipeList.tsx)

MongoDB Atlas

  • Search Index (default): full-text and autocomplete
  • Vector Index: built on the embeddings field

📌 Edge Cases Handled

  • Clearing the input updates the URL and resets the results
  • Debounced autocomplete avoids request flooding
  • Suspense fallback handles real-world loading times
  • A key prop derived from the query re-triggers the Suspense fallback and Server Component rendering when the query changes

✅ Benefits of URL-based State

| Benefit        | Description                                |
|----------------|--------------------------------------------|
| Shareable      | Users can copy/paste exact search queries  |
| Bookmarkable   | Come back to same state instantly          |
| Consistent SSR | Works naturally with server-side rendering |
| SEO-friendly   | Each query is its own routeable resource   |

📎 Deployment Notes

  • MongoDB Atlas connection string must be secured in .env
  • OpenAI API keys must stay server-side, and the calls that use them should be rate-limited
  • Production deployments (e.g., Vercel) must have the required environment variables configured