LLMTrace uses an ensemble of ML models alongside regex-based pattern matching for prompt injection detection. Each model can be independently enabled or disabled in the configuration.
Model: protectai/deberta-v3-base-prompt-injection-v2
Purpose: Primary ML detector for prompt injection classification
Architecture: DeBERTa v3 base fine-tuned on prompt injection datasets
License: Apache 2.0
Input: Raw text (role prefixes stripped before analysis)
Output: Binary classification (injection / benign) with confidence score
Memory: ~500MB
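As a rough illustration of how a binary label and confidence score could be combined with a threshold such as `ml_threshold`, consider the Python sketch below. The function name `is_injection` is hypothetical, and the inclusive comparison (`>=`) is an assumption; the source does not say whether a score exactly equal to the threshold counts as a detection.

```python
def is_injection(label: str, confidence: float, threshold: float = 0.8) -> bool:
    """Flag input only when the classifier reports 'injection' with enough confidence.

    Assumption: a confidence equal to the threshold is treated as a detection.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be within 0.0-1.0")
    return label == "injection" and confidence >= threshold
```

Raising the threshold trades missed detections for fewer false positives; lowering it does the opposite.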

    security_analysis:
      ml_enabled: true
      ml_model: "protectai/deberta-v3-base-prompt-injection-v2"
      ml_threshold: 0.8  # confidence threshold (0.0-1.0)
      ml_cache_dir: "/root/.cache/huggingface/hub"
      ml_preload: true  # download model at startup
      ml_download_timeout_seconds: 600

Model: leolee99/InjecGuard
Purpose: Injection detection with reduced over-defence on benign security-related text
Specialization: Better at distinguishing security research terminology from actual attacks
License: See model card
Memory: ~500MB
Token limit: max_position_embeddings from the model's config.json (default 512). Inputs longer than this are truncated before inference.
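The truncation rule can be sketched in Python as follows. This is a simplified illustration, not the project's code: the helper names are hypothetical, and it assumes that truncation keeps the leading tokens and ignores any special tokens the real tokenizer may add.

```python
import json

DEFAULT_MAX_POSITIONS = 512  # fallback when config.json omits the field


def max_tokens_from_config(config_json: str) -> int:
    """Read max_position_embeddings from the text of a model's config.json."""
    config = json.loads(config_json)
    return int(config.get("max_position_embeddings", DEFAULT_MAX_POSITIONS))


def truncate(token_ids: list[int], limit: int) -> list[int]:
    """Keep at most `limit` leading tokens (assumed head truncation)."""
    return token_ids[:limit]
```

One practical consequence: content that appears beyond the token limit is never seen by the model, so very long inputs are only partially analyzed by this detector.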

    security_analysis:
      injecguard_enabled: true
      injecguard_model: "leolee99/InjecGuard"
      injecguard_threshold: 0.85

Model: leolee99/PIGuard
Purpose: Prompt injection guard with a focus on indirect injection detection
Specialization: Indirect prompt injection embedded in data contexts
License: See model card
Memory: ~500MB
Token limit: Inherits InjecGuard's truncation (PIGuard delegates to the InjecGuard inference pipeline).

    security_analysis:
      piguard_enabled: true
      piguard_model: "leolee99/PIGuard"
      piguard_threshold: 0.85

Purpose: Detects jailbreak attempts (DAN-style, role-play bypasses, etc.)
Implementation: Shares the DeBERTa inference pipeline with jailbreak-specific classification
Memory: Shares model weights with DeBERTa

    security_analysis:
      jailbreak_enabled: true
      jailbreak_threshold: 0.7

Models are downloaded from Hugging Face Hub on first use. The download location is controlled by `ml_cache_dir`:

    security_analysis:
      ml_cache_dir: "/root/.cache/huggingface/hub"
      ml_preload: true
      ml_download_timeout_seconds: 600

- `ml_preload: true` (recommended): downloads all enabled models at proxy startup. The proxy blocks until models are ready.
- `ml_preload: false`: downloads models on first request. The first analysis will be slower.
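The difference between the two `ml_preload` modes can be illustrated with a small Python sketch. `ModelRegistry` and its `loader` callback are hypothetical stand-ins for the real download-and-load step; the sketch only shows when the cost is paid, not how LLMTrace implements it.

```python
from typing import Callable, Dict, Iterable


class ModelRegistry:
    """Hypothetical registry; `loader` stands in for the download-and-load step."""

    def __init__(self, names: Iterable[str], loader: Callable[[str], object], preload: bool):
        self._loader = loader
        self._models: Dict[str, object] = {}
        if preload:
            # Eager mode: pay the full cost up front, before serving requests.
            for name in names:
                self._models[name] = loader(name)

    def get(self, name: str) -> object:
        # Lazy mode: the first caller for each model pays the load cost.
        if name not in self._models:
            self._models[name] = self._loader(name)
        return self._models[name]
```

With `preload=True`, construction blocks until every model is loaded; with `preload=False`, construction returns immediately and the delay moves to the first `get` call.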
For environments without internet access:
- Download models on a machine with access:

      pip install huggingface_hub
      huggingface-cli download protectai/deberta-v3-base-prompt-injection-v2
      huggingface-cli download leolee99/InjecGuard
      huggingface-cli download leolee99/PIGuard

- Copy the cache directory to the target machine.
- Set `ml_cache_dir` to the copied path and `ml_preload: true`.
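Before starting the proxy on the offline machine, it may help to check that the copied directory actually contains the models. The Python sketch below assumes the standard Hugging Face Hub cache layout, where each model lives under `models--<org>--<name>/snapshots/<revision>/`. The function name `missing_models` is hypothetical, and whether LLMTrace reads exactly this layout is an assumption.

```python
from pathlib import Path

MODELS = [
    "protectai/deberta-v3-base-prompt-injection-v2",
    "leolee99/InjecGuard",
    "leolee99/PIGuard",
]


def missing_models(cache_dir: str, repo_ids=MODELS) -> list[str]:
    """Return the repo IDs that have no downloaded snapshot under cache_dir."""
    root = Path(cache_dir)
    missing = []
    for repo_id in repo_ids:
        # Assumed hub cache layout: models--<org>--<name>/snapshots/<revision>/
        snapshots = root / ("models--" + repo_id.replace("/", "--")) / "snapshots"
        if not snapshots.is_dir() or not any(snapshots.iterdir()):
            missing.append(repo_id)
    return missing
```

An empty result from `missing_models("/path/to/copied/cache")` suggests every model has at least one snapshot present; it does not verify that the files themselves are complete.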
With all models enabled, expect approximately 1.5-1.75 GB of RAM for model weights. The regex-only path uses negligible additional memory.
| Model | Approximate Size |
|---|---|
| DeBERTa v3 + jailbreak | ~500 MB |
| InjecGuard | ~500 MB |
| PIGuard | ~500 MB |
| Regex patterns | < 1 MB |
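The table can be turned into a quick estimate for a given configuration. This Python sketch uses the approximate sizes listed above; the function name is hypothetical, and it assumes that enabling the jailbreak detector alone still loads the shared DeBERTa weights, which the source implies but does not state outright.

```python
MODEL_MB = {"deberta": 500, "injecguard": 500, "piguard": 500}  # approximate sizes


def estimate_model_memory_mb(config: dict) -> int:
    """Sum approximate weight sizes for the models a config enables."""
    total = 0
    # Assumption: the jailbreak detector shares DeBERTa weights, so either flag loads them once.
    if config.get("ml_enabled") or config.get("jailbreak_enabled"):
        total += MODEL_MB["deberta"]
    if config.get("injecguard_enabled"):
        total += MODEL_MB["injecguard"]
    if config.get("piguard_enabled"):
        total += MODEL_MB["piguard"]
    return total
```

The result covers model weights only; runtime buffers and the proxy itself add to the total, which is consistent with the 1.5-1.75 GB range quoted above.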
Each model is independently toggled. A minimal config with only regex + DeBERTa:

    security_analysis:
      ml_enabled: true
      ml_threshold: 0.8
      jailbreak_enabled: true
      jailbreak_threshold: 0.7
      injecguard_enabled: false
      piguard_enabled: false

To run regex-only (no ML overhead):

    security_analysis:
      ml_enabled: false
      jailbreak_enabled: false
      injecguard_enabled: false
      piguard_enabled: false

ML models require the `ml` feature flag at compile time:

    cargo build --release --features ml

Without `--features ml`, the proxy runs regex-only detection regardless of config settings.
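The interaction between the compile-time feature and the config toggles can be summarized in a short Python sketch. The function `active_detectors` and the detector labels it returns are hypothetical, and the sketch assumes regex detection is always active.

```python
ML_FLAGS = ("ml_enabled", "jailbreak_enabled", "injecguard_enabled", "piguard_enabled")


def active_detectors(config: dict, built_with_ml: bool) -> list[str]:
    """List the detectors that would run for a given build and config."""
    detectors = ["regex"]  # assumption: regex matching is always on
    if built_with_ml:
        # Config toggles only take effect when the binary includes ML support.
        detectors += [flag.removesuffix("_enabled") for flag in ML_FLAGS if config.get(flag)]
    return detectors
```

For example, a config with every toggle set to `true` still yields only the regex detector when `built_with_ml` is `False`, which mirrors the behavior described above.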