QQEval - Question Quality Evaluation Tool

This repository contains tools for evaluating question quality in human-AI interactions, specifically for the CogSci2025 conference presentation.

Introduction

This project evaluates the quality of questions in conversations based on predefined rubrics. It uses the Anthropic API to assess the effectiveness of these questions according to different contexts and goals, with a focus on cognitive science research.

Project Structure

eval_anth.py: Main evaluation script that processes sample conversations and applies the rubric
Rubric_GQ.json: Evaluation criteria for good follow-up questions
system_prompt.txt: System prompt template for Claude
_src/: Directory containing sample conversation data
_output/: Directory where evaluation results are stored
.env: Configuration for API keys (not included in repository)

Requirements

To run this evaluation tool, install the required dependencies:

pip install -r requirements.txt

Setup

Clone this repository
Install dependencies using the command above
Create a .env file in the root directory with your Anthropic API key:
```
ANTHROPIC_API_KEY=your_api_key_here
```
Note: The .env file is included in .gitignore and will not be uploaded to the repository for security reasons.

Usage

Run the evaluation script with:

python eval_anth.py

The script will:

Load sample conversations from the specified input file
Apply the evaluation rubric with the configured variables
Generate evaluations using Claude
Save results to the _output directory with a timestamp

Configuration

You can modify the following variables in eval_anth.py:

RUBRIC_VARIABLES: Customize context variables like "answerer" and "goal"
MODEL_NAME: Change the Claude model version
MODEL_TEMPERATURE: Adjust the randomness of Claude's responses
MAX_TOKENS: Set the maximum token length for responses

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

QQEval - Question Quality Evaluation Tool

Introduction

Project Structure

Requirements

Setup

Usage

Configuration

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
_output		_output
_src		_src
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
Rubric_GQ.json		Rubric_GQ.json
eval_anth.py		eval_anth.py
requirements.txt		requirements.txt
system_prompt.txt		system_prompt.txt

Folders and files

Latest commit

History

Repository files navigation

QQEval - Question Quality Evaluation Tool

Introduction

Project Structure

Requirements

Setup

Usage

Configuration

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages