ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks

🔥 Official implementation of "ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks"

Overview

This repository contains the implementation of "ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks," a novel defense mechanism that prevents prompt leakage by replacing the original prompt with a proxy.

Getting Started

Follow the steps below to set up the required Python environment and dependencies:

Create a Conda Environment from environment.yml
```
conda env create -f environment.yml
```
Activate the Environment
```
conda activate proxy_prompt
```

Configure API Keys

Create a .env file in the project root and add your tokens:

# Hugging Face Token (required)
HF_TOKEN=your_huggingface_token_here

# OpenAI Configuration (optional, for certain features)
OPENAI_ENDPOINT=your_endpoint_here
OPENAI_API_KEY=your_api_key_here
DEPLOYMENT_NAME=your_deployment_name

Usage

The project provides example scripts in the scripts/ folder to run the complete ProxyPrompt pipeline:

scripts/custom_prompt/ - Protect your own custom prompts with ProxyPrompt defense
scripts/paper_prompt/ - Reproduce the results from the paper

Each script contains a complete pipeline from data collection to evaluation.

Citation

If you find our work useful, please star this repo and cite:

@article{zhuang2025proxyprompt,
    title={ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks},
    author={Zhuang, Zhixiong and Nicolae, Maria-Irina and Wang, Hui-Po and Fritz, Mario},
    journal={arXiv preprint arXiv:2505.11459},
    year={2025}
}

Data source

Attack data in attacks/5-gram-175-attacks.json sourced from:

Zhang et al. "Effective prompt extraction from language models." COLM 2024.
Hui et al. "Pleak: Prompt leaking attacks against large language model applications." CCS 2024.
Wang et al. "Raccoon: Prompt extraction benchmark of llm-integrated applications." ACL 2024.
Liang et al. "Why are my prompts leaked? Unraveling prompt extraction threats in customized large language models." arXiv:2408.02416, 2024.

See the paper for details on system prompt sources.

License

This project is open-sourced under the AGPL-3.0 license. See the LICENSE file for details.

For a list of other open source components included in this project, see the file 3rd-party-licenses.txt.

Purpose of the project

This software is a research prototype, solely developed for and published as part of the publication cited above.

Contact

Please feel free to open an issue or contact personally if you have questions, need help, or need explanations. Don't hesitate to write an email to the following email address: [email protected]

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
attacks		attacks
data		data
scripts		scripts
src		src
.gitignore		.gitignore
3rd-party-licenses.txt		3rd-party-licenses.txt
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml
teaser.png		teaser.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks

Table of Contents

Overview

Getting Started

Usage

Citation

Data source

License

Purpose of the project

Contact

About

Uh oh!

Languages

License

boschresearch/proxyprompt

Folders and files

Latest commit

History

Repository files navigation

ProxyPrompt: Securing System Prompts against Prompt Extraction Attacks

Table of Contents

Overview

Getting Started

Usage

Citation

Data source

License

Purpose of the project

Contact

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Languages