VascoSch92/bench-lab

Bench Lab (WIP)


This project is at a very early stage, and there is still a lot to clean up and fix.

The goal is to develop a unified framework for evaluating LLMs, agents, and RAG systems across well-known and custom benchmarks, while providing users with statistical tools to understand and improve their systems.
