V0.12.1 by kovtcharov · Pull Request #106 · amd/gaia

kovtcharov · 2025-10-22T06:30:42Z

GAIA v0.12.1 Release Notes

Overview

This patch release focuses on bug fixes and improvements to the evaluation framework, particularly addressing issues with the visualization and reporting tools. All changes improve the reliability and usability of the gaia eval, gaia visualize, and gaia report commands.

What's Changed

Bug Fixes

🔧 Fix Evaluation Visualizer Model Count and Path Issues (#823)

Fixed multiple critical issues in the gaia visualize and gaia report commands:

- Incorrect Model Count in Consolidated Report: Fixed the model count calculation in the webapp so it shows the correct number of models (it was showing only 4 instead of 8)
  - Now calculates unique models directly from metadata.evaluation_files instead of filtered/grouped data
- Windows Path Separator Bug: Fixed a cross-platform compatibility issue in the isMainEvaluationEntry() function
  - Now handles both Unix (/) and Windows (\) path separators correctly
- Incorrect Default Directory Paths: Updated default paths to match the actual evaluation output locations
  - Changed from workspace/evaluation to workspace/output/evaluations
  - Changed from workspace/experiments to workspace/output/experiments
- Outdated Report Filename: Updated the default report filename from LLM_RAG_Evaluation_Report.md to LLM_Evaluation_Report.md
  - Better reflects support for multiple evaluation types (RAG, summarization, etc.)
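
The model count fix can be illustrated with a minimal Python sketch. The actual change is in the webapp's JavaScript (app.js), so this is not the project's code. The function name count_unique_models and the entry shape (a dict with a "model" key) are assumptions made for illustration. The point is that the count is taken over every entry in metadata.evaluation_files, before any filtering or grouping.

```python
def count_unique_models(metadata):
    """Count distinct models across all evaluation files.

    Assumption: each entry in metadata["evaluation_files"] is a dict with a
    "model" key. The real schema used by the webapp may differ.
    """
    files = metadata.get("evaluation_files", [])
    # A set keeps one copy of each model name, however many files mention it.
    models = {entry["model"] for entry in files if entry.get("model")}
    return len(models)


# Illustrative data only: two models, one of them evaluated twice.
metadata = {
    "evaluation_files": [
        {"file": "a.json", "model": "model-a"},
        {"file": "b.json", "model": "model-a"},
        {"file": "c.json", "model": "model-b"},
    ]
}
print(count_unique_models(metadata))  # 2
```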

Files Changed: src/gaia/cli.py, src/gaia/eval/eval.py, src/gaia/eval/webapp/public/app.js
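
For the Windows path separator fix, one common way to treat both separators alike is to split on either character. The sketch below is in Python and only shows the general technique. It does not reproduce the logic of isMainEvaluationEntry(), which lives in app.js, and the helper names path_parts and file_name are hypothetical.

```python
import re


def path_parts(path):
    # Split on either "/" or "\" and drop empty segments, so that Unix and
    # Windows style paths produce the same list of components.
    return [part for part in re.split(r"[\\/]", path) if part]


def file_name(path):
    # Last component of the path, or an empty string for an empty path.
    parts = path_parts(path)
    return parts[-1] if parts else ""


print(path_parts("output/evaluations/run1.json"))
print(path_parts("output\\evaluations\\run1.json"))
```

Both calls print the same three components, which is the property the fix needs: any check based on path segments then behaves identically on both platforms.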

Improvements

📊 Standardize Evaluation Workflow Default Directories (#820)

Implemented consistent default parameters across all evaluation commands with a unified directory structure:

./output/
├── test_data/          # gaia generate
├── groundtruth/        # gaia groundtruth
├── experiments/        # gaia batch-experiment
└── evaluations/        # gaia eval

Key Changes:

- Added centralized directory constants in cli.py
- Added GAIA_WORKSPACE environment variable support for flexible workspace management
- Updated all command defaults to use the new structure
- Updated documentation in docs/eval.md and docs/cli.md
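
A rough sketch of how centralized directory constants and a workspace override might fit together is shown here. The constant names, the resolve_dir helper, and the rule that GAIA_WORKSPACE is the root under which output/ is created are all assumptions for illustration. The actual definitions in cli.py may differ.

```python
import os
from pathlib import Path

# Subdirectory names come from the directory tree above; the constant names
# themselves are hypothetical.
OUTPUT_DIR = "output"
TEST_DATA_DIR = "test_data"
GROUNDTRUTH_DIR = "groundtruth"
EXPERIMENTS_DIR = "experiments"
EVALUATIONS_DIR = "evaluations"


def resolve_dir(subdir, env=None):
    """Return <workspace>/output/<subdir>.

    Assumption: GAIA_WORKSPACE, when set, names the workspace root;
    otherwise the current directory is used.
    """
    env = os.environ if env is None else env
    root = Path(env.get("GAIA_WORKSPACE", "."))
    return root / OUTPUT_DIR / subdir


print(resolve_dir(EVALUATIONS_DIR, env={}))
print(resolve_dir(EXPERIMENTS_DIR, env={"GAIA_WORKSPACE": "/tmp/project-a"}))
```

Passing env explicitly keeps the helper easy to test without touching the real process environment.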

Benefits:

- Consistency: All evaluation artifacts organized in one location
- Maintainability: Centralized constants eliminate duplication
- Flexibility: Workspace environment variable for managing multiple projects
- Cleanup: Single directory to clean or ignore

Files Changed: Multiple files including CLI, evaluation modules, webapp components, and documentation

🏷️ Improve Reporting for Cloud Model Identifiers (#834)

Enhanced model counting logic in the Evaluation Visualizer to support additional cloud model identifiers:

- Added support for 'gpt-4' and 'gemini' model identifiers
- Improved accuracy of model classification in reports
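
As an illustration of identifier-based classification, the sketch below flags a model as a cloud model when its name contains one of the known identifiers. Only 'gpt-4' and 'gemini' are taken from these notes; identifiers supported before this release are not listed here and are therefore omitted. The case-insensitive substring match and the names CLOUD_MODEL_MARKERS and is_cloud_model are assumptions, and the real logic is JavaScript in app.js.

```python
# Only the two identifiers named in this release; earlier ones are not shown.
CLOUD_MODEL_MARKERS = ("gpt-4", "gemini")


def is_cloud_model(model_name):
    # Assumption: a case-insensitive substring match is enough to classify.
    name = model_name.lower()
    return any(marker in name for marker in CLOUD_MODEL_MARKERS)


# Illustrative model names only.
for candidate in ("gpt-4o", "Gemini-1.5-pro", "llama-3-8b"):
    print(candidate, is_cloud_model(candidate))
```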

Files Changed: src/gaia/eval/webapp/public/app.js

Contributors

Kalin Ovtcharov (@kalin-ovtcharov)

Upgrade Notes

If you have existing evaluation workflows, note the following directory changes:

- ./evaluation → ./output/evaluations
- ./experiments → ./output/experiments

You can set the GAIA_WORKSPACE environment variable to use a custom workspace location if needed.
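
If you want to carry existing results over to the new layout, a small script along these lines could do it. This is a sketch only: the release does not describe a migration tool, and the MOVES table and migrate function are hypothetical. It moves each old directory only when it exists and the new location is still free, so running it twice is harmless.

```python
import shutil
from pathlib import Path

# Old location -> new location, relative to the workspace root.
MOVES = {
    "evaluation": "output/evaluations",
    "experiments": "output/experiments",
}


def migrate(root):
    """Move old result directories under root to the new layout.

    Returns the list of (old, new) pairs that were actually moved.
    """
    root = Path(root)
    moved = []
    for old, new in MOVES.items():
        src, dst = root / old, root / new
        # Skip missing sources and never overwrite an existing destination.
        if src.is_dir() and not dst.exists():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dst))
            moved.append((old, new))
    return moved
```

Calling migrate(".") from the workspace root would apply the two moves listed above. Back up your results first if they matter.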

Full Changelog: v0.12.0...v0.12.1


kovtcharov added 3 commits October 21, 2025 23:01

v0.12.1

334423e

remove files not needed

4815d4c

version bump

c9d03fd

kovtcharov requested review from itomek and vgodsoe October 22, 2025 06:30

kovtcharov self-assigned this Oct 22, 2025

kovtcharov added the release label Oct 22, 2025

github-advanced-security AI found potential problems Oct 22, 2025


kovtcharov enabled auto-merge (squash) October 22, 2025 06:37

kovtcharov disabled auto-merge October 22, 2025 06:38

kovtcharov merged commit 84f0fd2 into main Oct 22, 2025
19 of 23 checks passed

kovtcharov deleted the v0.12.1 branch October 22, 2025 06:39

kovtcharov merged 3 commits into main from v0.12.1


2 participants
