Skip to content

Add LLM prompt injection detection rule (system prompt leakage)#1

Open
kamalkalwa wants to merge 1 commit intoEscape-Technologies:mainfrom
kamalkalwa:add/llm-prompt-injection-rule
Open

Add LLM prompt injection detection rule (system prompt leakage)#1
kamalkalwa wants to merge 1 commit intoEscape-Technologies:mainfrom
kamalkalwa:add/llm-prompt-injection-rule

Conversation

@kamalkalwa
Copy link
Copy Markdown

Adds a new vulnerability rule for detecting LLM system prompt leakage via prompt injection.

Location: vulnerabilities/prompt_injection/system_prompt_leakage.yaml

What it does:

  • Mutates string fields in API requests with prompt injection payloads (instruction override, debug mode tricks, conversation delimiter abuse)
  • Detects system prompt leakage in responses by matching against raw LLM template tokens (<<SYS>>, [INST], <|system|>), instruction disclosure phrases, and debug mode acknowledgments

Compliance mapping: OWASP LLM Top 10 - LLM01 (Prompt Injection), CWE-77

Why: With more APIs wrapping LLM backends, prompt injection is becoming one of the most common API security issues. Didn't see an existing rule covering this.

Adds a prompt injection rule under vulnerabilities/prompt_injection/
that tests string fields for system prompt extraction attacks.

The rule sends injection payloads via request.object mutations and
detects leakage by matching response text against patterns like raw
template tokens (<<SYS>>, [INST], <|system|>), instruction disclosure
phrases, and debug mode acknowledgments.

Maps to OWASP LLM Top 10 - LLM01 (Prompt Injection) and CWE-77.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant