🛡️ PromptShield AI 2.0

Real-time BERT-powered Prompt Injection Detection Firewall
Stop malicious prompt manipulation attempts before they hit your AI.

📌 About the Project

PromptShield AI is a real-time prompt injection detection system designed for LLM-driven banking, fintech, and chatbot applications.

It uses a model custom-trained on thousands of realistic, adversarial, and gray-zone prompts to detect:

Polite but malicious prompt injections
Multi-step logical attack prompts
Conversational "friendly-looking" bypass attempts
Obvious threats like OTP bypass, admin escalation, etc.

✅ Built using BERT embeddings
✅ Tuned with real-world thresholding logic
✅ Deployed live on Hugging Face Spaces

🚀 Live Demo

👉 Try it here: https://huggingface.co/spaces/rishit03/promptshield

⚙️ Features

✅ Real-time prediction (Safe / Injected)
✅ Confidence-based thresholding
✅ BERT embeddings via DistilBERT
✅ Detection of noisy, adversarial, and polite-but-malicious prompts
✅ Streamlit UI for public demo
✅ Explainability: matched keyword triggers are shown
✅ Hugging Face Spaces compatible
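
The thresholding and keyword-trigger features listed above might fit together roughly as follows. This is a minimal sketch, not the project's actual code: the function names, the 0.5 default threshold, and the keyword list are illustrative assumptions, and in the real app the probability would come from the trained classifier.

```python
import re

# Hypothetical trigger phrases, for illustration only; the project's
# real keyword list is not shown in this README.
TRIGGER_KEYWORDS = ["ignore previous instructions", "bypass otp", "admin access"]


def label_prompt(prob_injected: float, threshold: float = 0.5) -> str:
    """Map the classifier's injection probability to Safe / Injected."""
    if not 0.0 <= prob_injected <= 1.0:
        raise ValueError("prob_injected must be in [0, 1]")
    return "Injected" if prob_injected >= threshold else "Safe"


def find_triggers(prompt: str, keywords=TRIGGER_KEYWORDS) -> list[str]:
    """Return the keywords found in the prompt (case-insensitive)."""
    # Lowercase and collapse whitespace so spacing tricks do not hide a phrase.
    normalized = re.sub(r"\s+", " ", prompt.lower())
    return [kw for kw in keywords if kw in normalized]
```

Raising the threshold (for example `label_prompt(p, threshold=0.7)`) trades missed injections for fewer false alarms on gray-zone prompts, which is why it is exposed as a parameter here rather than hard-coded.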

📂 Folder Structure

promptshield/
├── app.py                   # Streamlit app entrypoint
├── models/
│   └── model.pkl            # Trained model
├── src/
│   ├── bert_features.py     # BERT encoder
│   ├── trainer.py           # Model trainer
│   └── __init__.py
├── data/
│   └── dataset.csv          # Training data
├── requirements.txt
└── README.md
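
For orientation, an encoder module like src/bert_features.py could look roughly like the sketch below, which mean-pools DistilBERT token embeddings into one vector per prompt. It assumes the Hugging Face `transformers` and `torch` packages and the public `distilbert-base-uncased` checkpoint; the function name and the pooling strategy are guesses, since this README does not show the actual file.

```python
def embed_prompts(prompts, model_name="distilbert-base-uncased"):
    """Return one mean-pooled embedding per prompt as a 2-D torch tensor.

    The checkpoint name is an assumption; the project may use another one.
    """
    # Imported lazily so this module can be imported without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    batch = tokenizer(
        list(prompts), padding=True, truncation=True, return_tensors="pt"
    )
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, tokens, hidden)

    # Average over real tokens only; padding positions have mask 0.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
```

The resulting vectors can then be passed to any classical classifier, which is consistent with the single pickled model file in models/.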

🧠 How to Run Locally

git clone https://github.com/rishit03/promptshield.git
cd promptshield

pip install -r requirements.txt

streamlit run app.py

🧪 Training

python -m src.trainer
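
The trainer module itself is not reproduced in this README. As one illustration of a thresholding step that a trainer like this could include, the sketch below picks the decision threshold that maximizes F1 on held-out scores. Selecting by F1 and the function name are assumptions, not the project's documented method.

```python
def best_f1_threshold(probs, labels):
    """Return (threshold, f1) maximizing F1 for 'prob >= threshold => injected'.

    probs  -- predicted injection probabilities on held-out prompts
    labels -- ground truth, 1 = injected, 0 = safe
    """
    if not probs or len(probs) != len(labels):
        raise ValueError("probs and labels must be non-empty and equal length")

    best_t, best_f1 = 0.5, -1.0
    # Only the observed scores need testing: F1 can change only at those values.
    for t in sorted(set(probs)):
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if f1 > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

For a banking chatbot, a missed injection is usually costlier than a false alarm, so a recall-weighted criterion may be a better fit than plain F1; the loop structure stays the same.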
