VERGE: Verification-Enhanced Generation of Multi-Hop Datasets for Evaluating Task-Specific RAG

Figure: VERGE dataset generation process

Overview

This repository contains the implementation of VERGE, a verification-enhanced methodology for generating multi-hop datasets to evaluate Retrieval-Augmented Generation (RAG) systems. VERGE addresses methodological gaps in existing RAG evaluation frameworks by generating task-specific, multi-hop reasoning datasets.

🌟 Key Features

VERGE: Implements a novel verification agent that ensures generated questions necessitate genuine multi-hop reasoning and maintain factual consistency
Hierarchical Error Taxonomy: Provides structured analysis of RAG system failure patterns specifically in multi-hop reasoning contexts

Repository Structure

Chunker/: Scripts for chunking documents
Data/: Scripts for downloading the datasets
ExamProcesser: Scripts for processing generated exams
Solver/: Scripts for solving the generated exams
categorise_errors.py: Script for categorising error types
generate_exam.py: Script for generating an exam
prompt_templates.py: Prompting templates for question generation, verification, and evaluation
retriever.py: Retriever class

🚀 Quick Start

Installation

pip install -r requirements.txt

Usage

Download data

python src/Data/long_bench_downloader.py
python src/Data/download_documents_sec_filings.py

Chunk, Embed and Store the data

python src/Chunker/document_chunker.py
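
To illustrate what the chunking step does conceptually, here is a minimal sketch of fixed-size chunking with overlap. It is not the code in `src/Chunker/document_chunker.py`: the function name `chunk_text`, the word-based splitting, and the `chunk_size` and `overlap` parameters are all assumptions made for illustration, and the embedding and storage steps are left out.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `chunk_size` words.

    Consecutive windows share `overlap` words. Hypothetical helper,
    not the repository's actual chunker.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some words twice.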

Generate Multi-hop Datasets with Verification Agent

python src/generate_exam.py

Solve the exam

python src/Solver/solve_exam_rag.py

Categorise Error Patterns

python src/categorise_errors.py
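
The steps above can also be chained in one small driver script. The sketch below is not part of the repository: `PIPELINE_STEPS` and `run_pipeline` are hypothetical names, and it assumes the scripts are run from the repository root, take no command-line arguments (none are shown above), and that both download scripts are wanted.

```python
import subprocess
import sys

# Script paths exactly as listed in the Usage section, in order.
PIPELINE_STEPS = [
    "src/Data/long_bench_downloader.py",
    "src/Data/download_documents_sec_filings.py",
    "src/Chunker/document_chunker.py",
    "src/generate_exam.py",
    "src/Solver/solve_exam_rag.py",
    "src/categorise_errors.py",
]


def run_pipeline(steps=PIPELINE_STEPS, runner=subprocess.run):
    """Run each script in order with the current Python interpreter.

    With the default runner, check=True raises CalledProcessError on the
    first failing step, so later steps never run on incomplete output.
    """
    for script in steps:
        cmd = [sys.executable, script]
        print("Running:", " ".join(cmd))
        runner(cmd, check=True)
```

Calling `run_pipeline()` from the repository root runs all six steps; passing a shorter list, for example `run_pipeline(PIPELINE_STEPS[3:])`, resumes from exam generation once the data has been downloaded and chunked.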

License

MIT License
