Please refer to our paper
Auto-GDA: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-augmented Generation by Tobias Leemann, Periklis Petridis, Giuseppe Vietri, Dionysis Manousakas, Aaron Roth, and Sergul Aydore
for more details about the method.
git clone [email protected]:amazon-science/AutoGDA-Efficient-Grounding-Verification-in-RAG.git
cd AutoGDA-Efficient-Grounding-Verification-in-RAG
Install mysql server which is used to manage hyperparameter configuration and venv or conda (if not already installed)
sudo apt-get install mysql-server pkg-config default-libmysqlclient-dev build-essential python3.10-venv
In the project folder (DomainAdaptationForFactualConsistencyDetection), run
python3 -m venv autogda
source autogda/bin/activate
or using conda run
conda create -n autogda python=3.10
conda activate autogda
Verify that the nvidia-gpus are recognized by running
nvidia-smi
before installation of requirements
if not you can try installing the proper Nvidia Driver like
sudo apt install nvidia-driver-535
Then install die remaining requirements using pip (preferred) or poetry
pip install -r requirements.txt
Optionally, activate the enviorment for usage with Jupyter by running
python -m ipykernel install --user --name autogda
Install mysql server via apt get as described above. Then log into the mysql console:
sudo mysql -u root
and run the comments
CREATE USER optuna@"%";
CREATE DATABASE optuna;
GRANT ALL ON optuna.* TO optuna@"%";
Datasets. First, download the datasets by exectuing the script ./src/scripts/download_datasets.sh
from the main directory as working directory.
LLM APIs. For using the LLM APIs either make sure you have valid credentials for Amazon Bedrock in your ~/.aws
folder or set the OpenAI key in data/openai.json
.
We provide a simple example for running Auto-GDA on the LFQA-Verification dataset.
We recommend first generating the initial data using the notebook notebooks/GenerateSync.ipynb
. Second you can just launch a run of Auto-GDA by running
export PYTHONPATH="."; python3 ./src/sync_data/generation_loop.py config_files/simple_config.json -d lfqa-veri -g all
Please confer experiment_reproduction.md
for additional etails on how to use the method and to run experiments.
Please cite our work if you use this codebase, for instance using the following BibTeX-Entry:
@inproceedings{
leemann2025autogda,
title={Auto-{GDA}: Automatic Domain Adaptation for Efficient Grounding Verification in Retrieval-augmented Generation},
author={Tobias Leemann and Periklis Petridis and Giuseppe Vietri and Dionysis Manousakas and Aaron Roth and Sergul Aydore},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=w5ZtXOzMeJ}
}
This project is licensed under the Apache-2.0 License.