LLM-Safeguard

Official repository for our ICML 2024 paper "On Prompt-Driven Safeguarding for Large Language Models"

If you find this repository useful or our work is related to your research, please cite our paper:

@inproceedings{
  llm-safeguard,
  title={On Prompt-Driven Safeguarding for Large Language Models},
  author={Chujie Zheng and Fan Yin and Hao Zhou and Fandong Meng and Jie Zhou and Kai-Wei Chang and Minlie Huang and Nanyun Peng},
  booktitle={The Forty-First International Conference on Machine Learning},
  year={2024}
}

If you find the chat templates used in this project useful, please also cite them:

@misc{zheng-2024-chat-templates,
  author = {Zheng, Chujie},
  title = {Chat Templates for HuggingFace Large Language Models},
  year = {2024},
  howpublished = {\url{https://github.com/chujiezheng/chat_templates}}
}

Code and Data

See code for the code used to reproduce our experimental results.

We also release the experimental data and results in a separate repository: https://github.com/chujiezheng/LLM-Safeguard_data
