LLM Leaderboard Explorer

An interactive dashboard, built with Gradio, for exploring and visualizing merged data from LLM leaderboards. Check out our deployed Hugging Face Space: Link

📊 Overview

This application provides an interactive interface to view, filter, and compare Large Language Models (LLMs) based on aggregated data from prominent leaderboard sources:

  • LiveBench: Features performance metrics like Global Average, Reasoning, Coding, Mathematics, Data Analysis, Language, and Instruction Following scores.
  • LMSYS Chatbot Arena: Includes community-based Elo ratings (Arena Score), rankings, and voting data.

The dashboard allows users to easily navigate and compare models across various metrics and categories.

✨ Features

  • Interactive Data Tables: View LLM data organized into tabs:
    • Performance Metrics: Core benchmark scores from LiveBench.
    • Model Details: Information like Organization, License, Knowledge Cutoff, and links.
    • Community Stats: Data from the Chatbot Arena Leaderboard (Ranks, Score, Votes).
    • Model Mapping: Shows the unified model name alongside original names from LiveBench and Arena.
  • Filtering: Dynamically filter the displayed models (see the sketch after this list) by:
    • Search term (searches Model Name and Organization).
    • Organization.
    • Minimum Global Average score.
  • Detailed Model Card: Click on any row in the data tables to view a comprehensive card summarizing all metrics for that specific model.
  • Visualizations Tab:
    • Bar Chart: Compare the top 15 models based on a user-selected metric (e.g., Global Average, Arena Score, Coding Average).
    • Radar Chart: Select multiple models (up to 5) to compare their performance profile across key metrics (Reasoning, Coding, Math, Data Analysis, Language, IF Average, and scaled Arena Score).
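
For reference, the filtering described above might look roughly like the sketch below. This is a minimal illustration, not the project's actual code: the column names (Model Name, Organization, Global Average) and the helper name filter_models are assumptions, and the real logic lives in src/data_processing.py.

    import pandas as pd

    def filter_models(df: pd.DataFrame,
                      search: str = "",
                      organization: str = "All",
                      min_global_avg: float = 0.0) -> pd.DataFrame:
        """Hypothetical filter; column names are assumed, not confirmed."""
        out = df
        if search:
            # Case-insensitive substring match over model name and organization.
            mask = (out["Model Name"].str.contains(search, case=False, na=False)
                    | out["Organization"].str.contains(search, case=False, na=False))
            out = out[mask]
        if organization != "All":
            out = out[out["Organization"] == organization]
        # Keep only models at or above the requested Global Average floor.
        return out[out["Global Average"].fillna(0) >= min_global_avg]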

💾 Data

The application uses a pre-merged CSV file (data/merged_leaderboards.csv) containing data aggregated from the sources mentioned above.
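
A quick way to sanity-check the merged file is to load it with pandas. A minimal sketch; the printed column names will reflect whatever the merge actually produced:

    import pandas as pd

    df = pd.read_csv("data/merged_leaderboards.csv")
    print(df.shape)             # (number of models, number of merged columns)
    print(df.columns.tolist())  # LiveBench metrics plus Arena stats
    print(df.head(3))           # first few merged rows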

🚀 Getting Started

Prerequisites

  • Python 3.9+
  • pip (Python package installer)

Installation

  1. Clone the repository (optional):

    git clone https://github.com/git-disl/GTLLMZoo.git
    cd GTLLMZoo

    If you already have the files locally, just navigate to the project directory in your terminal.

  2. Install Dependencies: The repository includes a requirements.txt file with the following content:

    gradio==4.9.0
    pandas
    plotly
    numpy
    

    Then, install the requirements:

    pip install -r requirements.txt

Running the Application

To run the application locally:

python app.py

The application will typically be available at http://127.0.0.1:7860 in your web browser.
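
For orientation, app.py presumably loads the merged data, builds the Gradio UI, and launches the server. Below is a minimal sketch under that assumption; load_data and build_ui are hypothetical names standing in for whatever src/data_processing.py and src/ui.py actually export:

    from src.data_processing import load_data  # hypothetical loader name
    from src.ui import build_ui                # hypothetical UI builder name

    df = load_data("data/merged_leaderboards.csv")
    demo = build_ui(df)  # expected to return a gradio.Blocks app

    if __name__ == "__main__":
        # 7860 is Gradio's default port; pass share=True for a public link.
        demo.launch(server_port=7860)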

📁 Project Structure

GTLLMZoo
├─ app.py                  # Main Gradio application entry point
├─ requirements.txt        # Python dependencies
├─ data
│  └─ merged_leaderboards.csv # Merged leaderboard data
└─ src
   ├─ data_processing.py  # Data loading and filtering logic
   └─ ui.py               # Gradio UI definition and logic

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request if you have improvements or bug fixes.

📄 License

MIT License
