FEAT: CBT-Bench Dataset #888

sravan0446 · 2025-04-20T19:03:32Z

CBT-Bench Dataset Integration for PyRIT

This pull request introduces support for the CBT-Bench dataset in PyRIT. The CBT-Bench dataset is a benchmark designed to evaluate the alignment and therapeutic safety of Large Language Models (LLMs) in the context of Cognitive Behavioral Therapy (CBT). By integrating this dataset, PyRIT can analyze psychotherapy-related prompts and identify potential risks, including cognitive distortions, harmful self-beliefs, and other vulnerabilities.

Key Changes

Added a new function fetch_cbt_bench_dataset to fetch and process the dataset.
Mapped dataset fields (situation, core_belief_fine_grained) to PyRIT's SeedPrompt model.
Included default configuration (core_fine_seed) with support for other configurations.
Updated documentation and comments to explain the purpose and usage of the dataset.
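
As a rough illustration of the field mapping listed above, the sketch below converts dataset rows into prompt objects. `SeedPromptStub` is a hypothetical stand-in for PyRIT's `SeedPrompt` that carries only the fields quoted in this pull request, `rows_to_seed_prompts` is an illustrative helper name, and the example row is made up rather than taken from the dataset.

```python
from dataclasses import dataclass, field


@dataclass
class SeedPromptStub:
    """Hypothetical stand-in for PyRIT's SeedPrompt (only fields quoted in this PR)."""

    value: str
    data_type: str
    name: str
    dataset_name: str
    harm_categories: list[str] = field(default_factory=list)


def rows_to_seed_prompts(rows: list[dict]) -> list[SeedPromptStub]:
    """Map CBT-Bench rows to prompt objects using the mapping described above."""
    return [
        SeedPromptStub(
            value=row["situation"],  # 'situation' is the main prompt text
            data_type="text",
            name=f"CBT-Bench-{row['id']}",
            dataset_name="CBT-Bench",
            harm_categories=row.get("core_belief_fine_grained", []),
        )
        for row in rows
    ]


# Made-up row for illustration only; not real dataset content.
example_rows = [
    {
        "id": "1",
        "situation": "Example situation text.",
        "core_belief_fine_grained": ["I am needy"],
    },
]
prompts = rows_to_seed_prompts(example_rows)
```

In the actual change the rows would come from the Hugging Face dataset and the objects would be real `SeedPrompt` instances; the stub only keeps the sketch runnable without PyRIT installed.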

Related Issue

Closes: #865
Tagging @romanlutz for review.

Tests and Documentation

Tests: Unit tests have not been added. They can be included in a subsequent commit within the same pull request if requested by the reviewer.
Documentation: Inline documentation has been updated. The README has not been updated yet but will be addressed if deemed necessary.
JupyText: Not applicable, as this contribution focuses on dataset integration rather than API or documentation changes.

References

CBT-Bench Dataset on Hugging Face

Citation

@article{zhang2024cbt,
  title={CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy},
  author={Zhang, Mian and Yang, Xianjun and Zhang, Xinlu and Labrum, Travis and Chiu, Jamie C and Eack, Shaun M and Fang, Fei and Wang, William Yang and Chen, Zhiyu Zoey},
  journal={arXiv preprint arXiv:2410.13218},
  year={2024}
}

sravan0446 · 2025-04-20T19:06:41Z

@microsoft-github-policy-service agree

romanlutz

Please also look at api.rst (I think that's the name) because it should list this dataset along with the others from the datasets module. There may be some missing there, so if you want you can compare and fill in the gaps. Thank you!

romanlutz · 2025-04-20T22:27:27Z

pyrit/datasets/cbt_bench_dataset.py

+    Fetch CBT-Bench examples for a specific configuration and create a SeedPromptDataset.
+
+    Args:
+        config_name (str): The configuration name to load (default is "core_fine_seed").


This should tell us about the other options, and the type should be Literal[...] with the individual options listed.
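
One way to read this suggestion is sketched below, assuming a module-level `Literal` alias plus a runtime check. Only `"core_fine_seed"` comes from this pull request; the other two option names are hypothetical placeholders that would need to be replaced with the dataset's real configuration names, and `CBTBenchConfig` and `validate_config_name` are illustrative names.

```python
from typing import Literal, get_args

# "core_fine_seed" is the default named in this PR. The other two entries are
# hypothetical placeholders, NOT real CBT-Bench configuration names.
CBTBenchConfig = Literal["core_fine_seed", "placeholder_config_a", "placeholder_config_b"]


def validate_config_name(config_name: CBTBenchConfig = "core_fine_seed") -> str:
    """Return config_name if it is one of the listed options, else raise ValueError.

    Type checkers enforce the Literal statically; this check covers callers
    that pass an arbitrary string at runtime.
    """
    allowed = get_args(CBTBenchConfig)
    if config_name not in allowed:
        raise ValueError(f"Unknown config_name {config_name!r}; expected one of {allowed}")
    return config_name
```

Deriving the allowed values from the alias with `get_args` keeps the type hint, the error message, and the docstring options from drifting apart.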

romanlutz · 2025-04-20T22:29:26Z

pyrit/datasets/cbt_bench_dataset.py

+        SeedPrompt(
+            value=item["situation"],  # Use 'situation' as the main prompt text
+            data_type="text",
+            name=f"CBT-Bench-{item['id']}",


Would these names overlap between the datasets (based on the config value)? If so, we need to distinguish them.
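
If the `id` values do repeat across configurations (not confirmed in this thread), one possible fix is to include the configuration in the prompt name. The helper name `build_prompt_name` and the exact name format are assumptions for illustration.

```python
def build_prompt_name(config_name: str, item_id: str) -> str:
    """Build a prompt name that stays unique even if ids repeat across configurations."""
    return f"CBT-Bench-{config_name}-{item_id}"


# The same id under two configurations yields two distinct names.
# "placeholder_config_a" is a hypothetical configuration name.
name_default = build_prompt_name("core_fine_seed", "7")
name_other = build_prompt_name("placeholder_config_a", "7")
```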

romanlutz · 2025-04-20T22:30:25Z

pyrit/datasets/cbt_bench_dataset.py

+            data_type="text",
+            name=f"CBT-Bench-{item['id']}",
+            dataset_name="CBT-Bench",
+            harm_categories=item.get("core_belief_fine_grained", []),


What are the values for this? I just want to make sure they make sense as harm categories.

@romanlutz you can see them here FYI

But they look like this:

[ "I am powerless, weak, vulnerable", "I am needy", "I am out of control" ]

I am not certain they exactly align with our harm categories.

These ones certainly don't.

I would say these broadly fall under psycho-social harms. @jbolor21 was that what you thought when you added the item?

Yes! I was thinking broadly under psycho-social harms!

The labels they have are very specific to CBT and not quite aligned with our categories, so lumping them under psycho-social harms would probably make the most sense!
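
A minimal sketch of that outcome, assuming a single broad harm category replaces the per-item labels. The exact category string is an assumption, as is the choice to keep the fine-grained CBT labels in a separate `metadata` dictionary so they are not lost; `harm_fields` is an illustrative helper name.

```python
# Assumed label; the exact category string is not fixed in this thread.
PSYCHOSOCIAL_HARM = "psycho-social harms"


def harm_fields(row: dict) -> dict:
    """Return a broad harm category plus the original CBT labels as metadata."""
    fine_grained = list(row.get("core_belief_fine_grained", []))
    return {
        "harm_categories": [PSYCHOSOCIAL_HARM],
        # Keeping the CBT-specific labels here is an optional design choice.
        "metadata": {"core_belief_fine_grained": fine_grained},
    }


fields = harm_fields(
    {"core_belief_fine_grained": ["I am powerless, weak, vulnerable", "I am needy"]}
)
```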

Co-authored-by: Roman Lutz <[email protected]>

romanlutz · 2025-04-25T10:17:04Z

pyrit/datasets/cbt_bench_dataset.py

+
+    seed_prompts = [
+        SeedPrompt(
+            value=item["situation"],  # Use 'situation' as the main prompt text


After looking at the dataset, I would say the situation + thoughts together should be the prompt. Let's ask @jbolor21 who suggested adding the dataset to chime in 🙂

Yes, I think adding the thoughts to the situation is important, since these two are what is extracted from the original text!
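
A sketch of the combined prompt text discussed here, assuming the dataset rows expose a `thoughts` field next to `situation` (the field name is inferred from this thread, not verified) and that a blank line is an acceptable separator. `build_prompt_text` is an illustrative helper name.

```python
def build_prompt_text(row: dict) -> str:
    """Join 'situation' and 'thoughts' into one prompt; fall back to 'situation' alone."""
    situation = row["situation"].strip()
    # 'thoughts' is an assumed field name; treat a missing or empty value as absent.
    thoughts = (row.get("thoughts") or "").strip()
    if not thoughts:
        return situation
    return f"{situation}\n\n{thoughts}"


combined = build_prompt_text({"situation": "Example situation. ", "thoughts": " Example thoughts."})
situation_only = build_prompt_text({"situation": "Example situation."})
```

The result would replace `item["situation"]` as the `value` passed to `SeedPrompt`.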

romanlutz · 2025-05-02T07:08:00Z

@sravan0446 are you still interested in addressing the issues? 🙂

sravan0446 mentioned this pull request Apr 20, 2025

FEAT: CBT-Bench Dataset #865

Open

romanlutz reviewed Apr 20, 2025

sravan0446 and others added 2 commits April 22, 2025 14:10

Added support for CBT-Bench dataset

8fb8a0a

Update pyrit/datasets/cbt_bench_dataset.py

3b76096

Co-authored-by: Roman Lutz <[email protected]>

sravan0446 force-pushed the main branch from 2dea4ba to 3b76096 on April 22, 2025 08:40

romanlutz reviewed Apr 25, 2025

romanlutz self-assigned this May 30, 2025

bashirpartovi Apr 25, 2025

jbolor21 Apr 25, 2025
