Add correctness_based_on_ground_truth criteria #1623

Merged
merged 2 commits into from
Feb 23, 2025
@@ -0,0 +1,27 @@
{
    "__type__": "criteria_with_options",
    "name": "correctness_based_on_ground_truth",
    "description": "Does the response correctly convey the same factual information as the ground truth?",
    "options": [
        {
            "__type__": "criteria_option",
            "name": "correct",
            "description": "The response conveys the same factual meaning as the ground truth. Minor rewording, synonyms, or grammatical differences are acceptable. The response is relevant to the question and does not introduce unrelated or misleading information."
        },
        {
            "__type__": "criteria_option",
            "name": "partially_correct",
            "description": "The response contains some correct information but is incomplete or lacks essential details. It may also contain minor inaccuracies or extraneous information that slightly misrepresents the ground truth."
        },
        {
            "__type__": "criteria_option",
            "name": "incorrect",
            "description": "The response does not align with the ground truth. It either presents incorrect, unrelated, or misleading information, or omits key details that change the intended meaning."
        }
    ],
    "option_map": {
        "correct": 1.0,
        "partially_correct": 0.5,
        "incorrect": 0.0
    }
}
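
As a rough illustration of how a catalog entry like the one above might be consumed, the sketch below parses an abridged copy of the JSON (descriptions omitted) and maps a judge's selected option to its numeric score. The `score_for` helper and the consistency check are assumptions made for illustration; they are not part of unitxt.

```python
import json

# Abridged copy of the catalog entry above; descriptions are left out for brevity.
CRITERIA_JSON = """
{
    "__type__": "criteria_with_options",
    "name": "correctness_based_on_ground_truth",
    "options": [
        {"__type__": "criteria_option", "name": "correct"},
        {"__type__": "criteria_option", "name": "partially_correct"},
        {"__type__": "criteria_option", "name": "incorrect"}
    ],
    "option_map": {"correct": 1.0, "partially_correct": 0.5, "incorrect": 0.0}
}
"""

criteria = json.loads(CRITERIA_JSON)


def score_for(criteria: dict, selected_option: str) -> float:
    """Hypothetical helper: return the score mapped to the option a judge selected."""
    option_names = {opt["name"] for opt in criteria["options"]}
    if selected_option not in option_names:
        raise ValueError(f"unknown option: {selected_option!r}")
    return criteria["option_map"][selected_option]


# Sanity check: every declared option has a score, and no score lacks an option.
assert {opt["name"] for opt in criteria["options"]} == set(criteria["option_map"])

print(score_for(criteria, "partially_correct"))  # 0.5
```

Keeping the option names and the `option_map` keys in sync matters because a verdict with no mapped score could not be turned into a number.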
24 changes: 24 additions & 0 deletions src/unitxt/llm_as_judge_constants.py
@@ -934,6 +934,30 @@ class DirectCriteriaCatalogEnum(Enum):
        },
    )

    CORRECTNESS_BASED_ON_GROUND_TRUTH = CriteriaWithOptions(
        name="correctness_based_on_ground_truth",
        description="Does the response correctly convey the same factual information as the ground truth?",
        options=[
            CriteriaOption(
                name="correct",
                description="The response conveys the same factual meaning as the ground truth. Minor rewording, synonyms, or grammatical differences are acceptable. The response is relevant to the question and does not introduce unrelated or misleading information.",
            ),
            CriteriaOption(
                name="partially_correct",
                description="The response contains some correct information but is incomplete or lacks essential details. It may also contain minor inaccuracies or extraneous information that slightly misrepresents the ground truth.",
            ),
            CriteriaOption(
                name="incorrect",
                description="The response does not align with the ground truth. It either presents incorrect, unrelated, or misleading information, or omits key details that change the intended meaning.",
            ),
        ],
        option_map={
            "correct": 1.0,
            "partially_correct": 0.5,
            "incorrect": 0.0,
        },
    )


DIRECT_CRITERIA = [c.value for c in DirectCriteriaCatalogEnum]
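
The `DIRECT_CRITERIA` line collects the value of every enum member, so the new criteria is picked up without any further registration. A minimal sketch of that pattern follows, using a reduced stand-in dataclass rather than the real unitxt classes; `StandInCriteria`, `StandInCatalogEnum`, and `criteria_by_name` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict


@dataclass
class StandInCriteria:
    """Hypothetical stand-in for CriteriaWithOptions, reduced to two fields."""

    name: str
    option_map: Dict[str, float]


class StandInCatalogEnum(Enum):
    # Mirrors the member added in the diff, without descriptions or options.
    CORRECTNESS_BASED_ON_GROUND_TRUTH = StandInCriteria(
        name="correctness_based_on_ground_truth",
        option_map={"correct": 1.0, "partially_correct": 0.5, "incorrect": 0.0},
    )


# Same comprehension as in the diff: one list entry per enum member.
STAND_IN_DIRECT_CRITERIA = [c.value for c in StandInCatalogEnum]

# Assumed convenience index for looking a criteria up by its name.
criteria_by_name = {c.name: c for c in STAND_IN_DIRECT_CRITERIA}

print(criteria_by_name["correctness_based_on_ground_truth"].option_map)
```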
