RAG Evaluation Project

This project evaluates different Retrieval-Augmented Generation (RAG) approaches on biomedical datasets (BioASQ, PubMedQA) using the OpenWebUI deployment of the University of Freiburg.

Prerequisites

- Python 3.11
- Uni Freiburg VPN: An active VPN connection to the University of Freiburg is required to access the OpenWebUI deployment (https://openwebui.uni-freiburg.de).
- Docker: Required to run the Graph RAG pipeline (Neo4j).

Installation

Clone the repository.

Create and activate a virtual environment:

```
python3.11 -m venv venv
source venv/bin/activate
```

Install dependencies:
```
pip install -r requirements.txt
```

Configuration (`config.yaml`)

The entire setup is centrally controlled via the config.yaml file. Here you can:

- Select Dataset: Change `active_dataset` to `bioasq` or `pubmedqa`.
- Configure Experiments: Adjust paths and parameters for the various RAG pipelines.

Example:

```
active_dataset: "bioasq"
# ...
```

The scripts automatically load the configuration regardless of the directory from which they are started.
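One way to get this directory-independent behaviour is to resolve `config.yaml` relative to the script file instead of the current working directory. The sketch below is only an illustration and is not taken from the project's code: the helper names `find_config` and `check_active_dataset` are hypothetical, and it assumes that `config.yaml` sits in the repository root above folders such as `graph_rag/`.

```python
from pathlib import Path

CONFIG_NAME = "config.yaml"
# Dataset names mentioned in this README; the set itself is an assumption.
KNOWN_DATASETS = {"bioasq", "pubmedqa"}


def find_config(start: Path) -> Path:
    """Walk upward from `start` until a directory containing config.yaml is found."""
    start = Path(start).resolve()
    first = start if start.is_dir() else start.parent
    for directory in (first, *first.parents):
        candidate = directory / CONFIG_NAME
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"{CONFIG_NAME} not found above {start}")


def check_active_dataset(config: dict) -> str:
    """Return the configured dataset name, or raise if it is not a known one."""
    name = config.get("active_dataset")
    if name not in KNOWN_DATASETS:
        raise ValueError(f"active_dataset must be one of {sorted(KNOWN_DATASETS)}, got {name!r}")
    return name
```

A script would typically call `find_config(Path(__file__))`, parse the returned file with a YAML library (for example PyYAML's `yaml.safe_load`), and pass the resulting dictionary to `check_active_dataset`.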

Graph RAG Setup (Neo4j)

A Neo4j database is required for the Graph RAG pipeline (scripts in the graph_rag/ folder). Start it with Docker using the following command to enable the necessary APOC plugins:

```
docker run \
    -p 7474:7474 -p 7687:7687 \
    -v $PWD/data:/data -v $PWD/plugins:/plugins \
    --name neo4j-apoc \
    -e NEO4J_apoc_export_file_enabled=true \
    -e NEO4J_apoc_import_file_enabled=true \
    -e NEO4J_apoc_import_file_use__neo4j__config=true \
    -e NEO4JLABS_PLUGINS=\[\"apoc\"\] \
    neo4j:latest
```

Note: The password used for this setup is mygraph12345. Please use this password when connecting (default user is usually neo4j).

This setup is based on the LlamaIndex GraphRAG v2 Cookbook.
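Before starting the Graph RAG scripts, it can be useful to confirm that the container is actually listening on the two published ports (7474 for the HTTP browser interface, 7687 for Bolt). The following is a minimal sketch using only the Python standard library. It is not part of the project, the function names are hypothetical, and it checks only TCP reachability, not credentials or the APOC plugin.

```python
import socket


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def neo4j_ports_status(host: str = "localhost") -> dict:
    """Check the two ports published by the docker command (HTTP 7474, Bolt 7687)."""
    return {port: port_open(host, port) for port in (7474, 7687)}
```

Calling `neo4j_ports_status()` returns a dictionary such as `{7474: True, 7687: True}` once the container is up. A `False` entry suggests the container is not running or the port mapping differs.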

Usage

Execute the desired evaluation scripts. Ensure that config.yaml is correctly configured.

- Baseline (No RAG): `python run_ai_without_rag.py`
- Vector RAG: `python run_vector_rag_bioasq.py`
- Graph RAG:
  - Extraction/Construction: `python graph_rag/run_graph_rag.py`
  - Community Detection: `python graph_rag/run_community_local.py`
- Plotting: `python plotting/plot_retrieval_metrics.py`
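If you prefer to start the scripts from a single entry point, the commands above could be wrapped in a small helper. This is a hypothetical convenience sketch, not something the repository provides: the stage names (`baseline`, `vector`, `graph`, `plot`) are made up for illustration, and it assumes that the two Graph RAG scripts are meant to run in the order listed.

```python
import subprocess
import sys
from pathlib import Path

# Script paths as listed in this README; the stage labels are hypothetical.
STAGES = {
    "baseline": ["run_ai_without_rag.py"],
    "vector": ["run_vector_rag_bioasq.py"],
    "graph": ["graph_rag/run_graph_rag.py", "graph_rag/run_community_local.py"],
    "plot": ["plotting/plot_retrieval_metrics.py"],
}


def build_commands(stage: str, repo_root: str = ".") -> list:
    """Return one command (argument list) per script belonging to `stage`."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage {stage!r}, expected one of {sorted(STAGES)}")
    root = Path(repo_root)
    # sys.executable keeps the scripts inside the active virtual environment.
    return [[sys.executable, str(root / script)] for script in STAGES[stage]]


def run_stage(stage: str, repo_root: str = ".") -> None:
    """Run the scripts of a stage in order, stopping at the first failure."""
    for command in build_commands(stage, repo_root):
        subprocess.run(command, check=True)
```

For example, `run_stage("graph")` would run graph extraction followed by community detection, and `check=True` stops the sequence if the first script exits with an error.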

Code Attribution

The files graph_rag/graph_extractor.py and graph_rag/graph_rag_store.py were adapted from the LlamaIndex GraphRAG v2 Cookbook to work with this project and the centralized configuration.
