Add Gemma4 26B-A4B Automations Evaluation #275

Merged
allenporter merged 3 commits into allenporter:main from NickM-27:investigate-gemma4-auto on May 2, 2026

@NickM-27
Contributor

@NickM-27 NickM-27 commented May 2, 2026

Update to add Gemma4 26B-A4B automation performance

@codecov-commenter

codecov-commenter commented May 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.88%. Comparing base (6b08d13) to head (3a13886).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #275   +/-   ##
=======================================
  Coverage   56.88%   56.88%           
=======================================
  Files          51       51           
  Lines        2099     2099           
=======================================
  Hits         1194     1194           
  Misses        905      905           


@NickM-27
Contributor Author

NickM-27 commented May 2, 2026

I guess the problem is that it shows the total as 0. Maybe I did something wrong when running the eval.

@allenporter
Owner

Can you share the command you are using to compute the metrics? I believe the command is different for these tests, but I'm not sure whether this is still accurate: https://github.com/allenporter/home-assistant-datasets/tree/main/datasets/automations#evaluate

@NickM-27
Contributor Author

NickM-27 commented May 2, 2026

I ran OUTPUT_DIR="reports/automations/2026.2.3" && pytest home_assistant_datasets/tool/assist/eval --model_output_dir=${OUTPUT_DIR}

I tried the commands from the automations README, but I got an error saying that "automation" is not a valid key and that it should be "leaderboard" or something else.

@allenporter
Owner

How about running script/eval_metrics_automations.sh?

@NickM-27 NickM-27 changed the title from "Debugging Gemma4 automations" to "Add Gemma4 Automations Evaluation" on May 2, 2026
@NickM-27
Contributor Author

NickM-27 commented May 2, 2026

That did it

@NickM-27 NickM-27 changed the title from "Add Gemma4 Automations Evaluation" to "Add Gemma4 26B-A4B Automations Evaluation" on May 2, 2026
@allenporter allenporter merged commit ee1f095 into allenporter:main May 2, 2026
2 checks passed
@allenporter
Owner

Very strong results, thank you.

@NickM-27 NickM-27 deleted the investigate-gemma4-auto branch May 2, 2026 17:23
@NickM-27
Contributor Author

NickM-27 commented May 3, 2026

@allenporter by the way, this was with reasoning set to off. I didn't think to check whether there was a way to designate that.

@allenporter
Owner

OK, good callout. You could update models.yaml with that detail, just in the text description, and it can then show up on the display: https://github.com/allenporter/home-assistant-datasets/tree/main/reports#gemma4-26b-a4b
