LucaBEMan/rag-evaluation

RAG Evaluation Project

This project evaluates different Retrieval-Augmented Generation (RAG) approaches on biomedical datasets (BioASQ, PubMedQA) using the OpenWebUI deployment of the University of Freiburg.

Prerequisites

  • Python 3.11
  • Uni Freiburg VPN: An active VPN connection to the University of Freiburg is required to access the OpenWebUI deployment (https://openwebui.uni-freiburg.de).
  • Docker: Required to run the Graph RAG pipeline (Neo4j).

Installation

  1. Clone the repository.
  2. Create and activate a virtual environment:
    python3.11 -m venv venv
    source venv/bin/activate
  3. Install dependencies:
    pip install -r requirements.txt

Configuration (config.yaml)

All settings are controlled centrally through the config.yaml file. In it you can:

  • Select Dataset: Change active_dataset to bioasq or pubmedqa.
  • Configure Experiments: Adjust paths and parameters for the various RAG pipelines.

Example:

active_dataset: "bioasq"
# ...

The scripts automatically load the configuration regardless of the directory from which they are started.
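
One way to get this directory-independent behaviour is to search upward from the script's own location until config.yaml is found, and then check active_dataset against the two supported values. The following is a minimal sketch under that assumption, not the project's actual loader; the names find_config, check_active_dataset and VALID_DATASETS are hypothetical.

```python
from pathlib import Path

# The two dataset names mentioned in this README.
VALID_DATASETS = {"bioasq", "pubmedqa"}


def find_config(start: Path, name: str = "config.yaml") -> Path:
    """Walk upward from `start` until a file called `name` is found.

    `start` may be a directory or a file path such as Path(__file__).
    """
    start = start.resolve()
    for folder in (start, *start.parents):
        candidate = folder / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{name} not found in {start} or any parent directory")


def check_active_dataset(config: dict) -> str:
    """Return the configured dataset name, or raise if it is missing or unknown."""
    value = config.get("active_dataset")
    if value not in VALID_DATASETS:
        raise ValueError(
            f"active_dataset must be one of {sorted(VALID_DATASETS)}, got {value!r}"
        )
    return value
```

A script would call find_config(Path(__file__)), parse the returned file (for example with PyYAML's yaml.safe_load, assuming PyYAML is among the installed dependencies), and pass the resulting dictionary to check_active_dataset before starting an experiment.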

Graph RAG Setup (Neo4j)

A Neo4j database is required for the Graph RAG pipeline (scripts in the graph_rag/ folder). Start it with Docker using the following command to enable the necessary APOC plugins:

docker run \
    -p 7474:7474 -p 7687:7687 \
    -v $PWD/data:/data -v $PWD/plugins:/plugins \
    --name neo4j-apoc \
    -e NEO4J_apoc_export_file_enabled=true \
    -e NEO4J_apoc_import_file_enabled=true \
    -e NEO4J_apoc_import_file_use__neo4j__config=true \
    -e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
    neo4j:latest

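Note: The command above does not set a password, so Neo4j starts with its default credentials (user neo4j, password neo4j) and asks for a new password at first login in the browser UI at http://localhost:7474. Set the password to mygraph12345, which is the password used for this setup, and use it when connecting.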
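
The container needs some time to start before it accepts connections. As a convenience, a small standard-library helper can wait until the Bolt port (7687, as mapped in the docker command) is reachable before the Graph RAG scripts are launched. This is a sketch and not part of the repository; wait_for_port is a hypothetical name, and an open port only shows that the server accepts TCP connections, not that the database is fully initialised.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll host:port until a TCP connection succeeds or `timeout` seconds pass.

    Returns True as soon as a connection is accepted, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # Each attempt waits at most `interval` seconds for the connection.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, wait_for_port("localhost", 7687) could be called once after docker run and before python graph_rag/run_graph_rag.py.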

This setup is based on the LlamaIndex GraphRAG v2 Cookbook.

Usage

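Run the evaluation scripts you need from the list below. Before running them, make sure config.yaml selects the intended dataset and contains the correct paths and parameters for the pipeline.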

  • Baseline (No RAG): python run_ai_without_rag.py
  • Vector RAG: python run_vector_rag_bioasq.py
  • Graph RAG:
    • Extraction/Construction: python graph_rag/run_graph_rag.py
    • Community Detection: python graph_rag/run_community_local.py
  • Plotting: python plotting/plot_retrieval_metrics.py
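
The commands above can also be collected in a small runner so that the multi-step Graph RAG pipeline is started with one call. The sketch below is hypothetical and not part of the repository: the pipeline keys and the function names are made up for illustration, and it assumes that the Graph RAG scripts are meant to run in the order listed (extraction first, then community detection) from the repository root.

```python
import subprocess
import sys

# Script paths are taken from the list above; the dictionary keys are invented labels.
PIPELINES = {
    "baseline": ["run_ai_without_rag.py"],
    "vector": ["run_vector_rag_bioasq.py"],
    "graph": ["graph_rag/run_graph_rag.py", "graph_rag/run_community_local.py"],
    "plots": ["plotting/plot_retrieval_metrics.py"],
}


def commands_for(pipeline: str, python: str = sys.executable) -> list[list[str]]:
    """Return the argv lists for one pipeline, in the order they should run."""
    if pipeline not in PIPELINES:
        raise KeyError(f"unknown pipeline {pipeline!r}, choose from {sorted(PIPELINES)}")
    return [[python, script] for script in PIPELINES[pipeline]]


def run_pipeline(pipeline: str) -> None:
    """Run each script of the pipeline in turn and stop at the first failure."""
    for argv in commands_for(pipeline):
        subprocess.run(argv, check=True)  # raises CalledProcessError on a non-zero exit
```

Calling run_pipeline("graph") would then run both Graph RAG steps with the interpreter of the active virtual environment.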

Code Attribution

The files graph_rag/graph_extractor.py and graph_rag/graph_rag_store.py were adapted from the LlamaIndex GraphRAG v2 Cookbook to work with this project and the centralized configuration.
