RAG and LLM Evaluation

3.0 Content update explanation video

  • 3.4 used to be "Ranking evaluation: vector search", but you will now do it as homework
  • Videos 3.5 - 3.8 used to be a part of module 4, but now they are a part of the evaluation module (module 4 focuses only on monitoring)
  • The data files for retrieval evaluation are in the search_evaluation folder
  • The data files for RAG evaluation are in the rag_evaluation folder
    • We kept the old data files - the ones generated using this old code
    • In the new notebook, you have minsearch instead of elasticsearch
  • Also install the sentence-transformers library, which we use to generate embeddings in some of the videos:
    pip install sentence-transformers

3.1 Introduction

Plan for the section:

  • Why do we need evaluation
  • Evaluation metrics
  • Ground truth / gold standard data
  • Generating ground truth with LLM
  • Evaluating the search results

Note: in the 2025 edition, we use Qdrant for vector search (not Elasticsearch).

For more details, see Module 2.

3.2 Getting ground truth data

  • Approaches for getting evaluation data
  • Using OpenAI to generate evaluation data
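
As a rough illustration of LLM-generated ground truth, the sketch below builds a prompt from one FAQ record and parses the model's reply into (question, document id) rows. The prompt wording, the record fields, and the helper names `build_prompt` and `parse_questions` are assumptions made for this example, not the code from the video; the reply is a made-up string standing in for a real API response.

```python
import json

# Hypothetical prompt template; the wording used in the course may differ.
PROMPT_TEMPLATE = """
You emulate a student taking our course.
Formulate {n} questions this student might ask based on the FAQ record below.
The record should contain the answer to each question.
Return a JSON list of strings and nothing else.

section: {section}
question: {question}
answer: {text}
""".strip()


def build_prompt(record, n=5):
    """Fill the template with the fields of one FAQ record."""
    # Extra keys in the record (such as "id") are ignored by str.format.
    return PROMPT_TEMPLATE.format(n=n, **record)


def parse_questions(llm_output, doc_id):
    """Turn the model's JSON reply into ground-truth rows."""
    questions = json.loads(llm_output)
    return [{"question": q, "document": doc_id} for q in questions]


# Illustrative record; the field names are assumptions.
record = {
    "id": "doc-1",
    "section": "General",
    "question": "Can I still join the course after it started?",
    "text": "Yes, you can still submit the homework.",
}
prompt = build_prompt(record)  # send this to the LLM of your choice

# A made-up reply, standing in for a real API response.
fake_reply = '["Is late enrollment allowed?", "Can I join after the start date?"]'
ground_truth = parse_questions(fake_reply, record["id"])
```

Each generated question is paired with the id of the document it came from, so a search result can later be checked against a known correct answer.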

Links:

3.3 Ranking evaluation: text search

  • Elasticsearch with text results
  • minsearch
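
This outline does not name the ranking metrics, so the sketch below assumes two common ones: hit rate and mean reciprocal rank (MRR). It takes a relevance matrix with one row per query, where each entry says whether the result at that rank is the expected document. The function names and the toy data are illustrative only.

```python
def hit_rate(relevance):
    """Share of queries with at least one relevant result."""
    hits = sum(1 for row in relevance if any(row))
    return hits / len(relevance)


def mrr(relevance):
    """Mean reciprocal rank of the first relevant result (0 for a miss)."""
    total = 0.0
    for row in relevance:
        for rank, is_relevant in enumerate(row, start=1):
            if is_relevant:
                total += 1.0 / rank
                break  # only the first relevant result counts
    return total / len(relevance)


# Toy relevance matrix: three queries, top-3 results each.
relevance = [
    [True, False, False],   # expected document at rank 1
    [False, False, True],   # expected document at rank 3
    [False, False, False],  # expected document not retrieved
]

print(f"hit rate: {hit_rate(relevance):.3f}")
print(f"MRR: {mrr(relevance):.3f}")
```

Hit rate only asks whether the expected document appears at all, while MRR also rewards placing it near the top, so the two can rank search configurations differently.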

Links:

3.4 Evaluating Vector Search

This topic is covered in the homework.

3.5 Offline vs Online (RAG) evaluation

  • Modules recap
  • Online vs offline evaluation
  • Offline evaluation metrics

3.6 Generating data for offline RAG evaluation

Note: the video talks about using Elasticsearch, but that part is from the 2024 edition. Skip to 03:40.

When following the video, use the new code in the notebook.

Links:

3.7 Offline RAG evaluation: cosine similarity

Content

  • A->Q->A' cosine similarity
  • Evaluating gpt-4o
  • Evaluating gpt-3.5-turbo
  • Evaluating gpt-4o-mini
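
The A->Q->A' idea compares the original answer A with the answer A' that the model produces for a question generated from A. A minimal sketch of the comparison step is below, assuming both answers have already been turned into embedding vectors (for example with a sentence-transformers model; the choice of model is not specified here). The toy vectors are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Made-up embeddings standing in for the vectors of A and A'.
v_original = [0.2, 0.7, 0.1]
v_generated = [0.25, 0.6, 0.2]

score = cosine_similarity(v_original, v_generated)
print(f"similarity: {score:.3f}")
```

A score close to 1 means the generated answer points in nearly the same direction as the original in embedding space. Averaging the score over the whole evaluation set gives one number per model, which makes it possible to compare the models listed above.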

Links:

3.8 Offline RAG evaluation: LLM as a judge

  • LLM as a judge
  • A->Q->A' evaluation
  • Q->A evaluation
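
For the Q->A case, a judge model sees only the question and the generated answer and returns a verdict. The sketch below shows one way to build such a prompt and validate the reply. The prompt wording, the three labels, the JSON keys, and the helper names are assumptions for illustration; the reply is a made-up string rather than a real model response.

```python
import json

# Hypothetical label set for the judge's verdict.
ALLOWED_LABELS = {"NON_RELEVANT", "PARTLY_RELEVANT", "RELEVANT"}

# Hypothetical judge prompt; the wording used in the course may differ.
JUDGE_TEMPLATE = """
You are an expert evaluator for a RAG system.
Classify how relevant the generated answer is to the question.
Use one of: NON_RELEVANT, PARTLY_RELEVANT, RELEVANT.

Question: {question}
Generated answer: {answer}

Reply with JSON only, using the keys "Relevance" and "Explanation".
""".strip()


def build_judge_prompt(question, answer):
    """Fill the judge template for one question/answer pair."""
    return JUDGE_TEMPLATE.format(question=question, answer=answer)


def parse_verdict(llm_output):
    """Parse the judge's JSON reply and reject unknown labels."""
    verdict = json.loads(llm_output)
    if verdict.get("Relevance") not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {verdict.get('Relevance')!r}")
    return verdict


judge_prompt = build_judge_prompt(
    "Can I join the course late?",
    "Yes, you can still submit the homework after the start date.",
)

# A made-up reply, standing in for a real API response.
fake_reply = '{"Relevance": "RELEVANT", "Explanation": "Directly answers it."}'
verdict = parse_verdict(fake_reply)
```

The A->Q->A' variant would follow the same pattern with the original answer added to the prompt. Validating the label before storing it keeps malformed replies out of the aggregated results.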

Links:

Homework

See here

Notes

Cohort 2025 | Study notes and FAQ: LLM Evaluation

  • Did you take notes? Add them above this line (Send a PR with links to your notes)