feat: added support for accuracy metrics on yes/no scenarios #13

Draft
MiguelAFH wants to merge 1 commit into main from accuracy

Conversation

@MiguelAFH
Collaborator

This is still a draft PR; do not merge yet.

This PR adds support for computing precision, recall, and F1 score for the yes/no scenarios in MedHELM.
Public benchmarks:

  • RaceBias

Gated benchmarks:

  • EHRSHOT
  • N2C2

Private benchmarks (results not included yet):

  • ADHD-Behavior
  • ADHD-MedEffects
  • MedConfInfo
  • PrivacyDetection
  • BMT-Status
  • HospiceReferral
  • ClinicReferral
  • CDI-QA
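
For readers unfamiliar with these metrics, here is a minimal sketch of how precision, recall, and F1 could be computed for a yes/no scenario. It is not the code in this PR: the function names `normalize_answer` and `yes_no_metrics` are hypothetical, and it assumes that "yes" is the positive class and that any prediction that does not normalize to "yes" counts as a negative.

```python
from typing import Dict, List


def normalize_answer(text: str) -> str:
    """Strip surrounding whitespace and common trailing punctuation, then lowercase."""
    return text.strip().strip(".!,").lower()


def yes_no_metrics(golds: List[str], preds: List[str], positive: str = "yes") -> Dict[str, float]:
    """Compute precision, recall, and F1 for binary yes/no answers.

    Assumption: anything that does not normalize to `positive` is treated
    as the negative class, including malformed model outputs.
    """
    if len(golds) != len(preds):
        raise ValueError("golds and preds must have the same length")

    tp = fp = fn = 0
    for gold, pred in zip(golds, preds):
        gold_pos = normalize_answer(gold) == positive
        pred_pos = normalize_answer(pred) == positive
        if gold_pos and pred_pos:
            tp += 1  # true positive
        elif pred_pos:
            fp += 1  # predicted yes, gold is no
        elif gold_pos:
            fn += 1  # predicted no, gold is yes

    # Guard against division by zero when a class never appears.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    golds = ["yes", "no", "yes", "yes", "no"]
    preds = ["Yes.", "yes", "no", "yes", "no"]
    print(yes_no_metrics(golds, preds))
```

Returning 0.0 when a denominator is zero is one common convention; a real implementation might prefer to report the metric as undefined instead.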

@MiguelAFH MiguelAFH requested a review from suhana13 August 13, 2025 03:14
Collaborator

@suhana13 suhana13 left a comment

Looks good to me!

2 participants