
Rated Ranking Evaluator Tools (RRE)

Overview

  • Dataset Generator
  • Embedding Model Evaluator
  • Approximate Search Evaluator

Dataset Generator

This tool provides a flexible command-line interface to generate relevance datasets for search evaluation. It can retrieve documents from a search engine, generate synthetic queries, and score the relevance of document-query pairs using LLMs.
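The real configuration schema is documented in the Dataset Generator README; the fragment below is purely illustrative, and every key in it is a hypothetical placeholder rather than DAGE's actual schema:

```yaml
# Illustrative only -- these keys are hypothetical placeholders;
# consult the Dataset Generator README for the real schema.
search_engine:
  type: solr                      # solr / opensearch / elasticsearch / vespa
  url: http://localhost:8983/solr
queries:
  mode: generate                  # generate synthetic queries, or load existing ones
relevance:
  judge: llm                      # score (document, query) pairs with an LLM
output:
  path: datasets/ratings.json     # where the generated dataset is written
```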

Embedding Model Evaluator

This tool tests a HuggingFace embedding model to ensure that it works as expected with exact vector search.
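The evaluator's actual checks live in its README; as a purely illustrative sketch of what "exact vector search" means here, the brute-force cosine-similarity search below uses toy vectors standing in for real model embeddings (the function and data are assumptions, not this tool's API):

```python
import numpy as np

def exact_search(query_vec: np.ndarray, doc_vecs: np.ndarray, top_k: int = 3):
    """Brute-force (exact) nearest-neighbour search by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                       # cosine similarity of every doc vs the query
    order = np.argsort(scores)[::-1][:top_k]
    return [int(i) for i in order], scores[order]

# Toy 4-dim "embeddings" standing in for real HuggingFace model output.
docs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0]])
query = np.array([1.0, 0.05, 0.0, 0.0])
ids, scores = exact_search(query, docs, top_k=2)
print(ids)  # -> [0, 1]: the two documents closest to the query
```

Because the search is exhaustive, its ranking is exact, which makes it a reliable ground truth when checking that an embedding model behaves as expected.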

Approximate Search Evaluator

This tool deploys RRE and extracts metrics to evaluate your search engine collection given a template.

Quickstart: tools installation

  • uv: a fast Python package installer and resolver. To install uv, follow the instructions here
  • Python 3.10: the version is fixed and widely used in the project; see the .python-version file

First, create a virtual environment with uv based on the pyproject.toml file. To do so, execute:

# place yourself in the rre-tools folder
cd rre-tools

# install dependencies (for users)
uv sync

# install development dependencies as well (e.g., mypy and ruff)
uv sync --group dev

Running Dataset Generator (DAGE)

Before running the command below, you need a running search engine instance (Solr/OpenSearch/Elasticsearch/Vespa). This can be done even with the test collections in the docker-services folder.

For a detailed description of how to fill in your configuration file (e.g., Config), see the Dataset Generator README.

Execute the main script via CLI, pointing to your DAGE configuration file:

uv run dataset-generator --config <path-to-DAGE-config-yaml>

To learn more about all the possible CLI parameters, execute:

uv run dataset-generator --help

Running Embedding Model Evaluator

For a detailed description of how to fill in the configuration file (e.g., Config), see the README.

Execute the main script via CLI, pointing to your configuration file:

uv run embedding-model-evaluator --config <path-to-config-yaml>

Running tests

1. Unit Tests

Execute the pytest command as follows:

uv run pytest

When run, the dataset-generator script will:

  1. Fetch documents from the specified search engine.
  2. Generate or load queries.
  3. Score the relevance for each (document, query) pair.
  4. Save the output to the destination (specified in the config file).
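The four steps above can be sketched as follows; every function here is a hypothetical placeholder that only illustrates the flow, not DAGE's actual API:

```python
# Hypothetical sketch of the dataset-generation pipeline; none of these
# helpers exist in rre-tools -- they only illustrate the four steps.
def fetch_documents(engine_url: str) -> list[str]:
    # Step 1: fetch documents from the configured search engine.
    return ["doc about apples", "doc about oranges"]

def generate_queries(docs: list[str]) -> list[str]:
    # Step 2: generate (or load) one synthetic query per document.
    return [f"query for: {d.split()[-1]}" for d in docs]

def score_relevance(doc: str, query: str) -> int:
    # Step 3: score each (document, query) pair; an LLM judge in the real tool.
    return 3 if doc.split()[-1] in query else 0

def run_pipeline(engine_url: str) -> list[dict]:
    docs = fetch_documents(engine_url)
    queries = generate_queries(docs)
    # Step 4: collect the judged pairs; the real tool writes them to the
    # destination specified in the config file.
    return [{"doc": d, "query": q, "score": score_relevance(d, q)}
            for d in docs for q in queries]

dataset = run_pipeline("http://localhost:8983/solr")
print(len(dataset))  # -> 4: 2 docs x 2 queries, each pair judged
```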

Code Quality Tools

Configuration Files

  • ruff.toml: Configures Ruff's linting rules and settings
  • mypy.ini: Configures Mypy's type checking settings

Type checker with mypy

To run mypy type checks inside the environment, use:

uv run mypy .

Code linter with ruff

To run the ruff linter inside the environment, use:

uv run ruff check