GitHub - uhh-hcds/g4kmu-paper

T²-RAGBench

T²-RAGBench is a realistic and rigorous benchmark for evaluating Retrieval-Augmented Generation (RAG) systems on financial documents that combine text and tables. It contains 23,088 question-context-answer triples from 7,318 real-world financial reports, focusing on numerical reasoning and retrieval robustness. The dataset has been downloaded over 12k times on Hugging Face.

Benchmark Subsets

The benchmark comprises three subsets derived from financial datasets:

Subset	Domain	# Documents	# QA Pairs	Avg. Tokens/Doc	Avg. Tokens/Question
FinQA	Finance	2,789	8,281	950.4	39.2
ConvFinQA	Finance	1,806	3,458	890.9	30.9
TAT-DQA	Finance	2,723	11,349	915.3	31.7
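
As a quick consistency check, the per-subset counts in the table can be summed and compared with the totals quoted in the introduction (7,318 reports and 23,088 triples). This is a minimal sketch: the numbers are copied from the table, and the variable names are illustrative rather than part of the repository code.

```python
# Per-subset counts copied from the table above: (documents, QA pairs).
subsets = {
    "FinQA": (2789, 8281),
    "ConvFinQA": (1806, 3458),
    "TAT-DQA": (2723, 11349),
}

# Sum each column across the listed subsets.
total_docs = sum(docs for docs, _ in subsets.values())
total_qa = sum(qa for _, qa in subsets.values())

print(f"documents: {total_docs:,}")  # documents: 7,318
print(f"QA pairs:  {total_qa:,}")    # QA pairs:  23,088
```

Both sums match the headline figures, so the listed subsets account for the full benchmark size.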

More details about the benchmark are available in our paper, on the website, in the code, and on the dataset page on Hugging Face. You can also email us at t2ragbench@gmail.com.

Folders and files

annotations
conf
src
.gitignore
.pre-commit-config.yaml
Makefile
README.md
pyproject.toml
start_vllm_server_as_process.py
uv.lock
