Description
name: Feature Request
title: "[FEAT] - RAG Knowledge Base R&D & Evaluation"
labels: feature, backlog
assignees: @Jrodrigo06 @mdeekshita @SHarg9876
Summary
Research and evaluate different RAG configurations — chunking strategies, embedding models, and retrieval approaches — to determine the optimal setup for the ingredient suggestion pipeline. Produce metrics and visualizations to back decisions.
Motivation
The vector store infrastructure and similarity search are already built. Before locking in a RAG configuration for production, we should systematically evaluate our options and have data to back our decisions. This ticket is a research spike — the outputs directly inform the RAG tagging service prompt and configuration.
Requirements
Acceptance Criteria
Sub-task 1: Document Sourcing & Seeding
- Identify and collect candidate source documents for our use case
- Store raw documents in `backend/data/raw/`
- Seed the `knowledge_chunks` table via `backend/scripts/seed_knowledge.py` (I will hopefully have an endpoint soon to add and process PDFs directly, so don't worry about this for now)
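Seeding implies splitting the raw documents into chunks before they land in `knowledge_chunks`. As a minimal sketch, here is one candidate strategy for Sub-task 2: fixed-size character chunks with overlap (function name and defaults are my own, not from the codebase):

```python
def chunk_fixed(text, size=500, overlap=50):
    """Split text into fixed-size character chunks, overlapping by `overlap`
    so sentences cut at a boundary still appear whole in one chunk."""
    chunks = []
    step = size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # last chunk already reached the end of the text
    return chunks
```

Sentence-based and heading-aware splitting are the obvious alternatives to benchmark against this baseline.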
Sub-task 2: Chunking Strategy Experimentation (Before starting, you will need to come up with a good benchmark prompt for getting food tags or ingredients; I'd recommend the tags. This is a lot, so ask questions!)
- Research and implement at least 3 different chunking strategies — document tradeoffs of each before implementing
- Re-seed the vector store for each strategy
- Run a fixed set of food queries against each and record retrieval results
- Plot precision@k and MRR across strategies
- Document which strategy performs best for autoimmune-specific queries
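Both metrics above take only a few lines of plain Python; a minimal sketch (function names are my own, not from the codebase):

```python
def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved chunks that are in the relevant set."""
    top = retrieved_ids[:k]
    return sum(1 for doc in top if doc in relevant_ids) / k

def mean_reciprocal_rank(all_retrieved, all_relevant):
    """Average of 1/rank of the first relevant chunk per query (0 if none)."""
    total = 0.0
    for retrieved, relevant in zip(all_retrieved, all_relevant):
        for rank, doc in enumerate(retrieved, start=1):
            if doc in relevant:
                total += 1.0 / rank
                break
    return total / len(all_retrieved)
```

Run the same fixed query set against each strategy's index and feed the per-query results into these before plotting.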
Sub-task 3: Embedding Model Comparison
- Research and select at least 3 embedding models to benchmark — document why each was chosen
- Embed the same knowledge base chunks with each model
- Run the same fixed query set and score top-k results against hand-labeled ground truth
- Produce comparison table and plot of scores per model
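Scoring top-k results per model needs a retrieval step; a dependency-light NumPy sketch (toy vectors stand in for real model embeddings, and the function name is my own):

```python
import numpy as np

def top_k_by_cosine(query_vec, chunk_matrix, k=3):
    """Return indices of the k chunks most cosine-similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    chunks = chunk_matrix / np.linalg.norm(chunk_matrix, axis=1, keepdims=True)
    sims = chunks @ q              # cosine similarity per chunk
    return np.argsort(-sims)[:k]   # best-first chunk indices
```

Embed the same chunks with each candidate model, run the fixed query set through this, and compare the returned indices against the hand-labeled ground truth (e.g. with the precision@k metric from Sub-task 2).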
Sub-task 4: Embedding Space Exploration (Maybe skip if it's too much, but some real cool work here ngl!)
- Extract all chunk embeddings from pgvector
- Reduce to 2D using UMAP and t-SNE (Other dimensionality reduction techniques may work too!)
- Visualize and label clusters by trigger category
- Analyze whether autoimmune trigger categories naturally separate in embedding space
- Document findings — clean clusters = good model fit, messy = retrieval likely struggling
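UMAP and t-SNE are the tools named above; since other dimensionality reduction techniques are fair game, here is a dependency-free PCA baseline in plain NumPy to sanity-check the pipeline before reaching for `umap-learn` (function name is my own):

```python
import numpy as np

def reduce_to_2d(embeddings):
    """Project embeddings onto their top two principal components (PCA)."""
    X = np.asarray(embeddings, dtype=float)
    X = X - X.mean(axis=0)            # center before SVD
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T               # (n_chunks, 2) coordinates for plotting
```

Scatter-plot the result colored by trigger category; if categories don't separate even roughly under PCA, that's an early signal before the heavier UMAP/t-SNE runs.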
Sub-task 5: Findings Write-up & Slide Deck (MUST HAVE!)
- Short slide deck summarizing: sources chosen, chunking comparison, embedding model comparison, cluster visualizations, and final recommendations
- Recommendations feed directly into RAG tagging service configuration
Out of Scope
- RAG tagging service implementation — separate ticket
- Frontend
Technical Approach
Affected Areas
- `backend/scripts/seed_knowledge.py`
- `backend/data/raw/` (new)
- `backend/scripts/evaluate_retrieval.py` (new): runs the query eval harness
- `backend/notebooks/` (new): Jupyter notebooks for plots and visualizations
Dependencies
- Depends on [FEAT] LLM Rag pipeline for Recommendation #52
- Branch off `feat/llm-rag-pipeline`; the `KnowledgeChunk` model, pgvector, and similarity search live there
- Add `sentence-transformers`, `pypdf`, `umap-learn`, `matplotlib` via `uv add`
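The dependency step above is a single command (assuming a uv-managed project):

```shell
# add the evaluation and visualization dependencies listed above
uv add sentence-transformers pypdf umap-learn matplotlib
```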
Testing Notes
PLEASE TEST AND SHOW IT WORKS. We don't have much time, so testing is crucial to prove things work and to prevent lingering bugs.