The NeMo Request Guard only extracts and validates the LAST user message from the conversation, completely ignoring: - System/developer messages (can inject arbitrary instructions) - Tool response messages (can contain embedded prompt injection) - Earlier user messages in conversation history
Attack Vectors:
- System message injection: Malicious instructions in "role": "system" bypass guardrails entirely
- Tool response poisoning: Prompt injection embedded in "role": "tool" content forwarded without validation
Impact:
- Complete guardrail bypass via non-user message roles
- Multi-turn conversation exploits through unvalidated history
References: OWASP LLM Top 10 – LLM01 (Prompt Injection), LLM08 (Excessive Agency), CWE-20
The NeMo Request Guard only extracts and validates the LAST user message from the conversation, completely ignoring: - System/developer messages (can inject arbitrary instructions) - Tool response messages (can contain embedded prompt injection) - Earlier user messages in conversation history
Attack Vectors:
Impact:
References: OWASP LLM Top 10 – LLM01 (Prompt Injection), LLM08 (Excessive Agency), CWE-20