Agent Evaluation with LLM as Judge #31

@vedem1192

Description

Expand on agent evaluation with a notebook showing how to evaluate an agent response using LLM as Judge (LLMaJ).

Once we have covered the foundations of what LLMaJ is and how it works, we should build on them with a second notebook on evaluating agent responses with an Ensemble of Judges for more accurate judging.

Potential points to cover

  • What is LLM as Judge?
  • Why use it for agent response evaluation instead of other metrics?
  • Pros and cons of ensemble of judges
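
To make the ensemble idea concrete, here is a rough sketch of what the second notebook might demonstrate. It is only an illustration under stated assumptions: the 1 to 5 scoring scale, the `Judge` signature, the three stand-in judges, and the `ensemble_score` helper are all hypothetical and not taken from any existing code in this repo. In a real notebook each judge would wrap a call to a different LLM with a grading prompt; simple heuristics are used here so the example runs offline.

```python
from statistics import median
from typing import Callable, Dict, List, Sequence, Union

# Hypothetical judge signature: (task, agent_response) -> score from 1 (poor) to 5 (excellent).
Judge = Callable[[str, str], int]


def _normalize(word: str) -> str:
    """Lowercase a word and strip trailing/leading punctuation."""
    return word.lower().strip("?.,!")


def length_judge(task: str, response: str) -> int:
    # Stand-in judge: prefers responses that are at least a short sentence.
    return 4 if len(response.split()) >= 5 else 2


def keyword_judge(task: str, response: str) -> int:
    # Stand-in judge: rewards responses that reuse the task's longer words.
    task_words = {_normalize(w) for w in task.split() if len(_normalize(w)) > 3}
    hits = sum(1 for w in response.split() if _normalize(w) in task_words)
    if hits >= 2:
        return 5
    return 3 if hits == 1 else 1


def neutral_judge(task: str, response: str) -> int:
    # Stand-in judge: always returns the midpoint of the scale.
    return 3


def ensemble_score(
    judges: Sequence[Judge], task: str, response: str
) -> Dict[str, Union[List[int], float]]:
    """Collect one score per judge and aggregate them.

    The median is used as the ensemble verdict because it is less sensitive
    to a single outlier judge than the mean; the spread (max - min) is a
    crude signal of how much the judges disagree.
    """
    scores = [judge(task, response) for judge in judges]
    return {
        "scores": scores,
        "median": median(scores),
        "spread": max(scores) - min(scores),
    }


result = ensemble_score(
    [length_judge, keyword_judge, neutral_judge],
    task="What is the capital of France?",
    response="The capital of France is Paris.",
)
print(result)
```

The median is just one possible aggregation; majority voting over pass/fail verdicts or weighting judges by their agreement with human labels could be compared in the "pros and cons" section. A large spread could also be used to flag responses for human review.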

Metadata

Assignees

No one assigned

    Labels

    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
