# **Building an Intelligent Documentation Assistant with MongoDB-RAG**

### *by Michael Lynn, Developer Advocate @ MongoDB*

📌 [GitHub](https://github.com/mrlynn) | 🛠️ [MongoDB-RAG Docs](https://mongodb.github.io/mongo-rag/)

---
## **📖 TL;DR**

Ever wished your documentation could just *answer questions* directly instead of forcing users to sift through endless pages? That’s exactly what we built with the **MongoDB-RAG Documentation Assistant**. In this article, I’ll walk you through the **why, what, and how** of building a chatbot that retrieves precise, relevant information from MongoDB-RAG’s own documentation.
### **🤔 Why Build a Documentation Assistant?**

Traditional documentation search is useful, but it often leaves users with *more questions than answers*. Developers need to read through entire pages, infer context, and piece together solutions. Instead, we wanted something:

- **Conversational** – Answers questions in natural language
- **Context-aware** – Finds relevant documentation snippets instead of just keyword matches
- **Fast & Accurate** – Uses vector search to surface precise answers
- **Transparent** – Links to original sources so users can verify answers
- **Flexible** – Works with multiple LLM providers, including **OpenAI** and **Ollama**

Our solution? **A chatbot powered by MongoDB-RAG**, showcasing exactly what our tool was built for: **retrieval-augmented generation (RAG)** using **MongoDB Atlas Vector Search**.

---
## **🛠️ How We Built It**

We structured the assistant around four core components:

### **1️⃣ Document Ingestion**

To make documentation *searchable*, we need to process it into vector embeddings. We use **semantic chunking** to break long docs into meaningful pieces before ingestion.
```javascript
// `rag` is an initialized MongoRAG instance; Chunker and loadMarkdownFiles
// are part of our ingestion tooling.
const chunker = new Chunker({
  strategy: 'semantic', // split on meaning rather than fixed character counts
  maxChunkSize: 500,    // upper bound on chunk size
  overlap: 50           // overlap between neighboring chunks preserves context
});

const documents = await loadMarkdownFiles('./docs');
const chunks = await Promise.all(
  documents.map(doc => chunker.chunkDocument(doc))
);

await rag.ingestBatch(chunks.flat());
```
> 📝 **Why Semantic Chunking?** Instead of blindly splitting text, we preserve contextual integrity by overlapping related sections.

---
### **2️⃣ Vector Search with MongoDB Atlas**

Once the docs are ingested, we use **MongoDB Atlas Vector Search** to find the most relevant documentation snippets based on a user’s query.
```javascript
const searchResults = await rag.search(query, {
  maxResults: 6,
  filter: { 'metadata.type': 'documentation' }
});
```

MongoDB’s **$vectorSearch** operator ensures we retrieve the closest matching content, ranked by relevance.
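
Under the hood, a call like this maps onto an aggregation pipeline built around the `$vectorSearch` stage. Here is a rough sketch of what such a pipeline looks like; the index name, field paths, and the `getEmbedding` helper are illustrative assumptions, not MongoDB-RAG internals:

```javascript
// Sketch of a $vectorSearch aggregation; names are illustrative assumptions.
const queryVector = await getEmbedding(query); // embed the user's question

const results = await collection.aggregate([
  {
    $vectorSearch: {
      index: 'vector_index',  // name of the Atlas Vector Search index
      path: 'embedding',      // document field holding the stored vectors
      queryVector,            // embedding of the query text
      numCandidates: 100,     // breadth of the approximate nearest-neighbor scan
      limit: 6,               // top matches to return
      filter: { 'metadata.type': 'documentation' }
    }
  },
  {
    $project: {
      content: 1,
      'metadata.source': 1,
      score: { $meta: 'vectorSearchScore' } // relevance score for ranking
    }
  }
]).toArray();
```

The `numCandidates` knob trades recall for speed: a larger candidate pool brings the approximate search closer to an exact nearest-neighbor scan.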
---

### **3️⃣ Streaming Responses for a Real Chat Experience**

To improve user experience, we stream responses incrementally as they’re generated.
```javascript
router.post('/chat', async (req, res) => {
  const { query, history = [], stream = true } = req.body;

  // Retrieve relevant documentation before generating an answer
  const context = await ragService.search(query);

  if (stream) {
    // Server-Sent Events headers keep the connection open for incremental output
    res.writeHead(200, {
      'Content-Type': 'text/event-stream',
      'Cache-Control': 'no-cache',
      'Connection': 'keep-alive'
    });

    await llmService.generateResponse(query, context, history, res);
  } else {
    const answer = await llmService.generateResponse(query, context, history);
    res.json({ answer, sources: context });
  }
});
```
With this approach:

- Responses appear **in real time** instead of waiting for full generation 🚀
- Developers can get **partial answers** quickly while longer responses load

The streaming half of `llmService.generateResponse` is sketched below.
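
This is a hypothetical sketch rather than the actual mongodb-rag implementation: it assumes the LLM client exposes the completion as an async-iterable stream of tokens, and forwards each token to the browser as a Server-Sent Event.

```javascript
// Hypothetical sketch: forward LLM tokens to the client as Server-Sent Events.
// `tokenStream` is assumed to be an async iterable of text tokens.
async function streamToClient(tokenStream, res) {
  for await (const token of tokenStream) {
    // Each SSE message is a "data:" line followed by a blank line
    res.write(`data: ${JSON.stringify({ token })}\n\n`);
  }
  res.write('data: [DONE]\n\n'); // end-of-stream marker, following OpenAI's convention
  res.end();
}
```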
---

### **4️⃣ Multi-Provider LLM Support**

The assistant supports **multiple embedding providers**, including **OpenAI** and **self-hosted Ollama**.
```javascript
const config = {
  embedding: {
    provider: process.env.EMBEDDING_PROVIDER || 'openai',
    model: process.env.EMBEDDING_MODEL || 'text-embedding-3-small',
    baseUrl: process.env.OLLAMA_BASE_URL // For local deployment
  }
};
```

This allows users to **switch providers** easily, optimizing for performance, privacy, or cost.
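
For example, pointing the same config shape at a local Ollama instance is just a matter of changing the values. The model name here is an illustrative choice, not a requirement:

```javascript
// Example: the same config shape aimed at a local Ollama instance.
// nomic-embed-text is one common local embedding model; 11434 is
// Ollama's default port.
const ollamaConfig = {
  embedding: {
    provider: 'ollama',
    model: 'nomic-embed-text',
    baseUrl: 'http://localhost:11434'
  }
};
```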
---

## **💡 Key Features**

### 🔍 **Real-time Context Retrieval**
Instead of guessing, the chatbot **searches first** and then generates answers.

### 🔗 **Source Attribution**
Each response includes a **link to the documentation**, letting users verify answers.

### ⚡ **Streaming Responses**
No waiting! Answers **stream in real time**, improving responsiveness.

### ⚙️ **Multi-Provider LLM Support**
Deploy with **OpenAI for scale** or **Ollama for private, local hosting**.

### 🤖 **Fallback Handling**
If the documentation doesn’t contain an answer, the chatbot **transparently explains the limitation** instead of fabricating a response.
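
One simple way to implement this kind of fallback is a relevance gate. This is a sketch; the threshold and the `score` field on search results are assumptions, not part of the mongodb-rag API:

```javascript
// Sketch: refuse to answer when retrieval confidence is low.
const MIN_RELEVANCE = 0.75; // illustrative threshold

async function answerWithFallback(query) {
  const results = await rag.search(query);
  const best = results[0];

  if (!best || best.score < MIN_RELEVANCE) {
    // Be explicit about the gap instead of letting the LLM improvise
    return "I couldn't find this in the MongoDB-RAG docs. Try rephrasing your question.";
  }

  return llmService.generateResponse(query, results, []); // empty chat history
}
```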
---

## **🚀 Try It Yourself**

Want to build a **MongoDB-RAG-powered assistant**? Here’s how to get started:

### **1️⃣ Install MongoDB-RAG**

```bash
npm install mongodb-rag
```
### **2️⃣ Configure Your Environment**

```env
MONGODB_URI=your_atlas_connection_string
EMBEDDING_PROVIDER=openai
EMBEDDING_API_KEY=your_api_key
EMBEDDING_MODEL=text-embedding-3-small
```
### **3️⃣ Initialize the Chatbot**

```javascript
import { MongoRAG } from 'mongodb-rag';
import express from 'express';

const rag = new MongoRAG(config);
const app = express();
app.use(express.json()); // parse JSON request bodies

app.post('/api/chat', async (req, res) => {
  const { query } = req.body;
  // This minimal version returns raw search results; wire in an LLM
  // to turn them into a generated answer.
  const results = await rag.search(query);
  res.json({ answer: results });
});

app.listen(3000);
```
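
A quick smoke test against that endpoint, assuming Node 18+ (for the built-in `fetch`) and the server above listening on port 3000:

```javascript
// Send a question to the chat endpoint and print the raw search results.
const response = await fetch('http://localhost:3000/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ query: 'How do I configure semantic chunking?' })
});

console.log(await response.json());
```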
---

## **🌩️ Production Considerations**

### **Where to Host?**

We deployed our assistant on **Vercel** for:

- **Serverless scalability**
- **Fast global CDN**
- **Easy Git-based deployments**

### **Which LLM for Production?**

- **OpenAI** – Best for reliability & speed
- **Ollama** – Best for **privacy-first** self-hosted setups

```env
EMBEDDING_PROVIDER=openai
EMBEDDING_MODEL=text-embedding-3-small
```
---

## **🔮 What’s Next?**

Future improvements include:

- **Better query reformulation** to improve retrieval accuracy
- **User feedback integration** to refine responses over time
- **Conversation memory** for context-aware follow-ups
---
187+
188+
## **🎬 Conclusion**
189+
By combining **MongoDB Atlas Vector Search** with **modern LLMs**, we built an assistant that **transforms documentation into an interactive experience**.
190+
191+
Try it out in our docs, and let us know what you think! 🚀
192+
193+
### 🔗 **Resources**
194+
📘 [MongoDB-RAG Docs](https://mongodb.github.io/mongo-rag/)
195+
🔗 [GitHub Repository](https://github.com/mongodb-developer/mongodb-rag)
196+
📦 [NPM Package](https://www.npmjs.com/package/mongodb-rag)
