Bench Lab (WIP)

This project is at a very early stage, and there is still a lot to clean up and fix.

The goal is to develop a unified framework for evaluating LLMs, agents, and RAG systems across well-known and custom benchmarks, while providing users with statistical tools to understand and improve their systems.

Repository contents (9 commits):

.idea
benchlab
tests
.gitignore
.python-version
README.md
main.py
pyproject.toml
uv.lock

VascoSch92/bench-lab
