multiclass llm eval #1529

mike0sv · 2025-03-26T16:24:50Z

PR Description: Implement Multiclass Classification Descriptors

This PR introduces a new module for multiclass classification prompt templates and evaluations in the evidently library. It includes two evaluation classes: RelevanceLLMEval and BrandSafetyLLMEval, which leverage the newly created MulticlassClassificationPromptTemplate to classify inputs based on specified criteria.

Key Features:

- Multiclass Classification Prompt Template: A structured way to create templates for classification tasks, allowing configuration of category criteria, uncertainty handling, and output formatting.
- Evaluation Classes:
  - RelevanceLLMEval evaluates how well answers address questions.
  - BrandSafetyLLMEval assesses text compliance with brand safety guidelines.
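
To illustrate the three configuration points named above (category criteria, uncertainty handling, output formatting), here is a minimal, self-contained sketch. It is not the code from this PR: `MulticlassPromptSketch`, its fields, the prompt wording, and the category descriptions are all hypothetical and only show how such a template might assemble a prompt.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class MulticlassPromptSketch:
    """Hypothetical stand-in, NOT evidently's MulticlassClassificationPromptTemplate."""

    criteria: str                      # overall instruction for the judge
    category_criteria: Dict[str, str]  # category name -> when it applies
    uncertainty: str = "unknown"       # label to return when the judge cannot decide
    include_reasoning: bool = True     # whether to ask for an explanation

    def render(self, text: str) -> str:
        # One bullet per category, in insertion order.
        categories = "\n".join(
            f"- {name}: {description}"
            for name, description in self.category_criteria.items()
        )
        # Output formatting: always a category, optionally the reasoning.
        output_fields = ['"category"']
        if self.include_reasoning:
            output_fields.append('"reasoning"')
        return (
            f"{self.criteria}\n\n"
            f"Classify the text into exactly one category:\n{categories}\n\n"
            f"If you cannot decide, answer {self.uncertainty!r}.\n"
            f"Return a JSON object with the keys: {', '.join(output_fields)}.\n\n"
            f"Text:\n{text}"
        )


# Category names come from the PR description; the descriptions are made up.
template = MulticlassPromptSketch(
    criteria="Judge how well the answer addresses the question.",
    category_criteria={
        "Irrelevant": "the answer does not address the question",
        "Partially Relevant": "the answer addresses only part of the question",
        "Relevant": "the answer fully addresses the question",
    },
)
prompt = template.render("Q: What is the capital of France?\nA: Paris.")
print(prompt)
```

Keeping the categories in a mapping means adding or renaming a class only touches the configuration, not the prompt-building logic.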

Examples:

Relevance Evaluation:
```
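# Assumes RelevanceLLMEval is already imported (the import path is not given in this description).
relevance_eval = RelevanceLLMEval()
# Fetch the multiclass prompt template and print its prompt blocks.
template = relevance_eval.get_template()
print(template.get_blocks())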
```
This classifies how well an answer responds to a given question into categories such as "Irrelevant," "Partially Relevant," and "Relevant."

Brand Safety Evaluation:
```
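# Assumes BrandSafetyLLMEval is already imported (the import path is not given in this description).
brand_safety_eval = BrandSafetyLLMEval()
# Fetch the multiclass prompt template and print its prompt blocks.
template = brand_safety_eval.get_template()
print(template.get_blocks())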
```
This assesses text compliance with brand safety and tone using categories like "Fully Compliant," "Partially Compliant," and "Incompliant."
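
For illustration only, the sketch below shows one way a judge's reply could be mapped back onto a fixed category list, with a fallback label when the reply is unusable. The reply format (a JSON object with a `category` key), the `parse_category` helper, and the `"unknown"` fallback are assumptions; this description does not say how the PR parses model output.

```python
import json
from typing import Sequence

# Category names as listed in the PR description.
RELEVANCE_CATEGORIES = ("Irrelevant", "Partially Relevant", "Relevant")


def parse_category(
    raw_response: str,
    categories: Sequence[str],
    uncertainty: str = "unknown",
) -> str:
    """Hypothetical helper: map a JSON reply onto one of the allowed categories."""
    try:
        payload = json.loads(raw_response)
    except json.JSONDecodeError:
        # The model did not return valid JSON.
        return uncertainty
    if not isinstance(payload, dict):
        # Valid JSON, but not the expected object shape.
        return uncertainty
    label = str(payload.get("category", "")).strip().lower()
    # Case-insensitive match against the allowed names.
    for category in categories:
        if category.lower() == label:
            return category
    # Unrecognized or missing label.
    return uncertainty


print(parse_category('{"category": "partially relevant"}', RELEVANCE_CATEGORIES))
print(parse_category("not json at all", RELEVANCE_CATEGORIES))
```

Returning an explicit fallback instead of raising keeps a batch evaluation running when a single reply is malformed.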

This addition lets the evidently library handle evaluation scenarios that need more than two outcome categories.

mike0sv and others added 8 commits March 26, 2025 16:24

multiclass llm eval

f292ce8

fix tests

ab4273b

chatbot safety

4c3c376

move to llm_judge

6664412

move to llm_judge

ec65557

move template to reatures

890881c

exampke

f94a8b2

example of Multicalss template

e082f9c

emeli-dral merged commit 86e710f into main Mar 28, 2025
25 checks passed

emeli-dral deleted the feature/multiclass-llm branch March 28, 2025 16:36
