diff --git a/docs/assets/jailbreak/20240207-jailbreaking-thinking_out_loud.png b/docs/assets/jailbreak/20240207-jailbreaking-thinking_out_loud.png
new file mode 100644
index 00000000000..f6b6478bec6
Binary files /dev/null and b/docs/assets/jailbreak/20240207-jailbreaking-thinking_out_loud.png differ
diff --git a/docs/prompt_hacking/jailbreaking.md b/docs/prompt_hacking/jailbreaking.md
index 84004080d1a..59ec9df1335 100644
--- a/docs/prompt_hacking/jailbreaking.md
+++ b/docs/prompt_hacking/jailbreaking.md
@@ -39,6 +39,16 @@ import actor from '@site/docs/assets/jailbreak/chatgpt_actor.jpg';
 This example by [@m1guelpf](https://twitter.com/m1guelpf/status/1598203861294252033) demonstrates an acting scenario between two people discussing a robbery, causing ChatGPT to assume the role of the character(@miguel2022jailbreak). As an actor, it is implied that plausible harm does not exist. Therefore, ChatGPT appears to assume it is safe to give follow provided user input about how to break into a house.
+#### Thinking Aloud
+
+import actor from '@site/docs/assets/jailbreak/20240207-jailbreaking-thinking_out_loud.png';
+