xunguangwang/Awesome-Jailbreak-Guardrails
Awesome Jailbreak Guardrails for Large Models

Introduction

This repository is a list of research papers, articles, and resources related to jailbreak guardrails for Large Models (i.e., large language models (LLMs), multimodal large language models (MLLMs), and AI agents). Jailbreak guardrails are techniques and strategies designed to detect and filter unauthorized or harmful behavior in AI systems, ensuring they operate safely and ethically.
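To make the "detect and filter" idea concrete, here is a minimal sketch of an input guardrail. All names (`BLOCK_PATTERNS`, `is_flagged`, `guarded_reply`) are hypothetical illustrations, not from any paper in this list; real guardrails surveyed below typically use trained safety classifiers rather than keyword rules.

```python
import re

# Hypothetical blocklist of jailbreak-style patterns. This is only an
# illustration of the detect-and-filter interface; production guardrails
# use learned classifiers, not regexes.
BLOCK_PATTERNS = [
    r"ignore (all|your) (previous|prior) instructions",
    r"\bDAN\b",  # "Do Anything Now" persona prompts
]

def is_flagged(prompt: str) -> bool:
    """Detection step: return True if the prompt matches any pattern."""
    return any(re.search(p, prompt, re.IGNORECASE) for p in BLOCK_PATTERNS)

def guarded_reply(prompt: str, model_fn) -> str:
    """Filter step: refuse flagged prompts, otherwise call the model."""
    if is_flagged(prompt):
        return "Request declined by guardrail."
    return model_fn(prompt)
```

The same wrapper shape extends to output-side guardrails (screening the model's response before returning it) and to multimodal or agent settings, where the checked input is an image, a tool call, or an action trace instead of text.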

Survey Papers

LLMs' Jailbreak Guardrails

MLLMs' Jailbreak Guardrails

Agents' Jailbreak Guardrails

Benchmarks/Datasets

Acknowledgement
