feat: Add FaithfulnessEvaluator #27

Merged
jjbuck merged 1 commit into strands-agents:main from jjbuck:feature/faithfulness_evaluator
Nov 6, 2025

Conversation

jjbuck (Collaborator) commented Nov 4, 2025

Description

  1. Add FaithfulnessEvaluator to assess whether agent responses conflict with conversation history
  2. Implement 5-level scoring system (Not At All, Not Generally, Neutral/Mixed, Generally Yes, Completely Yes)
  3. Add faithfulness prompt templates (v0) with evaluation guidelines
  4. Include example usage and comprehensive unit tests
  5. Update naming of low-level data types (turn -> trace, conversation -> session) to align better with OTEL conventions.
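
For readers who have not opened the diff, the five-level scale in item 2 can be pictured roughly as follows. This is only a sketch: the label strings come from this description, but the enum name `FaithfulnessLevel`, the helper `score_for_label`, and the evenly spaced numeric values are illustrative assumptions, not the evaluator's actual code or scoring.

```python
from enum import Enum


class FaithfulnessLevel(Enum):
    """The five labels named in this PR.

    The numeric scores are ASSUMED (evenly spaced on [0, 1]) for illustration;
    the real evaluator may map labels to scores differently.
    """

    NOT_AT_ALL = ("Not At All", 0.0)
    NOT_GENERALLY = ("Not Generally", 0.25)
    NEUTRAL_MIXED = ("Neutral/Mixed", 0.5)
    GENERALLY_YES = ("Generally Yes", 0.75)
    COMPLETELY_YES = ("Completely Yes", 1.0)

    def __init__(self, label: str, score: float) -> None:
        # Unpack the tuple value into named attributes on each member.
        self.label = label
        self.score = score


def score_for_label(label: str) -> float:
    """Hypothetical helper: map a judge-produced label to its assumed score."""
    normalized = label.strip().lower()
    for level in FaithfulnessLevel:
        if level.label.lower() == normalized:
            return level.score
    raise ValueError(f"Unknown faithfulness label: {label!r}")
```

Under these assumptions, `score_for_label("Generally Yes")` would return `0.75`, and an unrecognized label raises `ValueError` rather than silently defaulting to a score.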

Related Issues

  1. feat: add GoalSuccessRateEvaluator for conversation goal tracking #22 (goal success rate)
  2. Implement trace-based evaluation with helpfulness evaluator. #16 (helpfulness)

Documentation PR

N/A

Type of Change

  1. New feature ("Faithfulness Evaluator")
  2. Refactor: update naming of low-level data types (turn -> trace, conversation -> session) to align better with OTEL conventions.

Testing

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

jjbuck requested a review from poshinchen November 4, 2025 17:22
jjbuck force-pushed the feature/faithfulness_evaluator branch from 91d6b18 to db136bf November 5, 2025 22:12

jjbuck (Collaborator, Author) commented Nov 5, 2025

@poshinchen just rebased on the tip of main (which now includes #25).

- Add FaithfulnessEvaluator to assess whether agent responses conflict with conversation history
- Implement 5-level scoring system (Not At All, Not Generally, Neutral/Mixed, Generally Yes, Completely Yes)
- Add faithfulness prompt templates (v0) with evaluation guidelines
- Include example usage and comprehensive unit tests
- Update low-level data types (turn -> trace, conversation -> session)

jjbuck force-pushed the feature/faithfulness_evaluator branch from db136bf to d358a12 November 6, 2025 19:00
jjbuck requested a review from poshinchen November 6, 2025 19:01

jjbuck (Collaborator, Author) commented Nov 6, 2025

@poshinchen addressed your comments above in the latest edits.

jjbuck merged commit 3cac0f0 into strands-agents:main Nov 6, 2025
12 checks passed
smeetd159 mentioned this pull request Nov 18, 2025 (7 tasks)
smeetd159 mentioned this pull request Dec 2, 2025 (7 tasks)