Add --metrics CLI flag to filter which metrics run #3

Merged: emac-E merged 1 commit into emac-E:main from Lifto:feat/metrics-cli-filter on Apr 6, 2026

@Lifto Lifto commented Apr 6, 2026

Summary

  • Adds --metrics CLI argument to lightspeed-eval that filters each turn's turn_metrics to only the specified metrics
  • Example: --metrics custom:answer_correctness runs only correctness, skipping all RAGAS metrics
  • Works like --tags and --conv-ids — filters after loading, before validation
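The filtering described in this summary could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `filter_turn_metrics` helper and the plain-dict data shape are hypothetical, and the choice to leave turns without a `turn_metrics` list untouched is an assumption.

```python
from typing import Optional


def filter_turn_metrics(
    conversations: list[dict], metrics: Optional[list[str]]
) -> list[dict]:
    """Keep only the requested metric identifiers in each turn's turn_metrics.

    Hypothetical helper: the real change lives in core/system/validator.py
    and may differ in structure and naming.
    """
    if not metrics:
        # No --metrics given: leave every turn as configured.
        return conversations
    wanted = set(metrics)
    for conv in conversations:
        for turn in conv.get("turns", []):
            configured = turn.get("turn_metrics")
            if configured is None:
                # Assumption: turns without an explicit list are not modified.
                continue
            # Preserve the configured order, drop everything not requested.
            turn["turn_metrics"] = [m for m in configured if m in wanted]
    return conversations


data = [
    {
        "turns": [
            {"turn_metrics": ["custom:answer_correctness", "ragas:faithfulness"]}
        ]
    }
]
filtered = filter_turn_metrics(data, ["custom:answer_correctness"])
print(filtered[0]["turns"][0]["turn_metrics"])
```

Filtering against the configured list, rather than replacing it, means the flag can only narrow what the YAML already enables; it never adds a metric that was not configured for a turn.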

Motivation

Running the full metric suite (5 metrics per question) takes ~37 minutes. For comparison runs that only need answer correctness, passing --metrics cuts the run to ~6 minutes, with no YAML config edits required.

Changes

  • runner/evaluation.py: Add --metrics arg to argparse, pass to load_evaluation_data
  • core/system/validator.py: Add metrics parameter, filter turn_metrics lists after scope filtering
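The argparse side of the change might look something like the sketch below. It is an assumption-laden outline rather than the PR's actual diff: `build_parser` is a hypothetical name, `nargs="+"` is inferred from the space-separated form shown under Usage, and the other existing arguments and the call into `load_evaluation_data` are omitted.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing only the arguments relevant to this PR."""
    parser = argparse.ArgumentParser(prog="lightspeed-eval")
    parser.add_argument(
        "--eval-data", required=True, help="Path to the evaluation data YAML"
    )
    parser.add_argument(
        "--metrics",
        nargs="+",  # assumption: one or more space-separated identifiers
        default=None,  # None means "run all configured metrics"
        help="Only run these metric identifiers",
    )
    return parser


args = build_parser().parse_args(
    [
        "--eval-data",
        "config/CLA_tests.yaml",
        "--metrics",
        "custom:answer_correctness",
        "ragas:faithfulness",
    ]
)
print(args.metrics)
```

Defaulting to `None` keeps the no-flag case distinguishable from an explicit selection, so existing invocations behave as before.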

Usage

# Run only answer correctness (skip RAGAS metrics)
lightspeed-eval --eval-data config/CLA_tests.yaml --metrics custom:answer_correctness

# Run two specific metrics
lightspeed-eval --eval-data config/CLA_tests.yaml --metrics custom:answer_correctness ragas:faithfulness

# No flag = run all metrics (existing behavior, unchanged)
lightspeed-eval --eval-data config/CLA_tests.yaml

@emac-E emac-E (Owner) left a comment

This is a good idea, thanks!

Allows running a subset of configured metrics without editing YAML configs.
Example: --metrics custom:answer_correctness to skip RAGAS metrics.
@Lifto Lifto force-pushed the feat/metrics-cli-filter branch from b3da4ad to f05d54d Compare April 6, 2026 17:50
@emac-E emac-E merged commit 7aceea5 into emac-E:main Apr 6, 2026
5 of 15 checks passed
emac-E pushed a commit that referenced this pull request Apr 10, 2026
delete old scripts/evaluation, add README
emac-E added a commit that referenced this pull request Apr 10, 2026
Add --metrics CLI flag to filter which metrics run
