multiclass llm eval #1529

Merged (8 commits) on Mar 28, 2025

@mike0sv (Collaborator) commented Mar 26, 2025

PR Description: Implement Multiclass Classification Descriptors

This PR adds a new module for multiclass classification prompt templates and evaluations to the evidently library. It includes two evaluation classes, RelevanceLLMEval and BrandSafetyLLMEval, which use the new MulticlassClassificationPromptTemplate to assign each input to one of several categories based on specified criteria.

Key Features:

  • Multiclass Classification Prompt Template: A structured way to create templates for classification tasks, allowing configuration of category criteria, uncertainty handling, and output formatting.
  • Evaluation Classes:
    • RelevanceLLMEval evaluates how well answers address questions.
    • BrandSafetyLLMEval assesses text compliance with brand safety guidelines.
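
As a rough illustration of what such a template configures (category criteria, uncertainty handling, output formatting), here is a minimal standalone sketch. It does not use evidently's MulticlassClassificationPromptTemplate, whose actual fields and methods are not shown in this description. The class `SketchMulticlassTemplate`, its field names, the `render` method, and the wording of each category rule are all assumptions made up for illustration; only the category names come from the examples in this PR.

```python
from dataclasses import dataclass


@dataclass
class SketchMulticlassTemplate:
    """Hypothetical stand-in, NOT evidently's MulticlassClassificationPromptTemplate."""

    criteria: str                      # overall task description
    category_criteria: dict[str, str]  # category name -> when to choose it
    uncertainty: str = "unknown"       # label to return when no category fits
    include_reasoning: bool = False    # also ask for a short justification

    def render(self, text: str) -> str:
        """Assemble the prompt: task, category list, fallback rule, output format, input."""
        lines = [self.criteria, "", "Categories:"]
        for name, rule in self.category_criteria.items():
            lines.append(f"- {name}: {rule}")
        lines.append(
            f"If none of the categories clearly applies, answer {self.uncertainty!r}."
        )
        fields = '"category"' + (', "reasoning"' if self.include_reasoning else "")
        lines.append(f"Reply with a JSON object containing: {fields}.")
        lines += ["", "Text:", text]
        return "\n".join(lines)


# Category rules below are illustrative wording, not taken from the PR.
template = SketchMulticlassTemplate(
    criteria="Rate how well the answer addresses the question.",
    category_criteria={
        "Irrelevant": "the answer does not address the question",
        "Partially Relevant": "the answer addresses only part of the question",
        "Relevant": "the answer fully addresses the question",
    },
)
prompt = template.render("Q: What is the capital of France? A: Paris.")
print(prompt)
```

Keeping the category rules in a mapping means adding or renaming a category changes only the configuration, not the prompt-building logic.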

Examples:

  1. Relevance Evaluation:

    # Create the evaluator with default settings and inspect its prompt blocks
    relevance_eval = RelevanceLLMEval()
    template = relevance_eval.get_template()
    print(template.get_blocks())

    The evaluation rates how well an answer responds to a given question, using categories such as "Irrelevant," "Partially Relevant," and "Relevant."

  2. Brand Safety Evaluation:

    # Create the evaluator with default settings and inspect its prompt blocks
    brand_safety_eval = BrandSafetyLLMEval()
    template = brand_safety_eval.get_template()
    print(template.get_blocks())

    The evaluation rates how well a text complies with brand safety and tone guidelines, using categories such as "Fully Compliant," "Partially Compliant," and "Incompliant."
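
To show how a multiclass result might be mapped back onto a fixed set of categories, here is a small standalone sketch. It is not code from this PR: the helper `parse_category`, the JSON reply shape with a `"category"` key, and the `"unknown"` fallback label are assumptions chosen for illustration. Only the category names are taken from the brand safety example above.

```python
import json

# Category names as listed in the brand safety example.
BRAND_SAFETY_CATEGORIES = ["Fully Compliant", "Partially Compliant", "Incompliant"]


def parse_category(reply: str, categories: list[str], uncertainty: str = "unknown") -> str:
    """Map a raw model reply to one of the allowed categories (hypothetical helper).

    Anything that is not valid JSON, is not an object, or names a category
    outside the allowed list falls back to the uncertainty label.
    """
    try:
        payload = json.loads(reply)
    except json.JSONDecodeError:
        return uncertainty
    if not isinstance(payload, dict):
        return uncertainty
    raw = str(payload.get("category", "")).strip().lower()
    # Case-insensitive lookup that returns the canonical spelling.
    by_lower = {name.lower(): name for name in categories}
    return by_lower.get(raw, uncertainty)


print(parse_category('{"category": "partially compliant"}', BRAND_SAFETY_CATEGORIES))
# prints: Partially Compliant
```

Falling back to a dedicated label instead of raising keeps one malformed reply from failing a whole batch, at the cost of having to monitor how often the fallback occurs.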

This addition lets the evidently library run LLM-based evaluations that assign an input to one of several categories, covering scenarios such as answer relevance and brand safety.

@emeli-dral emeli-dral merged commit 86e710f into main Mar 28, 2025
25 checks passed
@emeli-dral emeli-dral deleted the feature/multiclass-llm branch March 28, 2025 16:36