Description
Check for existing issues
- I have searched the existing issues and checked that my issue is not a duplicate.
What happened?
Summary
The built-in prompt injection detection feature has two critical bugs:
- The heuristics check blocks the event loop, causing pod restarts in Kubernetes when running with the default worker count of 1
- LLM API check never executes due to incorrect class inheritance
Problem 1: Heuristics Check Blocks Event Loop
Issue
When heuristics_check: true is enabled, the synchronous similarity calculation in check_user_input_similarity() blocks the entire FastAPI event loop, preventing health check endpoints from responding.
Location
litellm/proxy/hooks/prompt_injection_detection.py:113-137
Code
```python
async def async_pre_call_hook(self, ...):
    if self.prompt_injection_params.heuristics_check is True:
        # This is a BLOCKING synchronous call inside an async function
        is_prompt_attack = self.check_user_input_similarity(
            user_input=formatted_prompt
        )

def check_user_input_similarity(self, user_input: str, ...):
    # Nested loops performing O(n * m) SequenceMatcher operations
    for keyword in keywords:  # ~50-100 iterations
        keyword_length = len(keyword)
        for i in range(len(user_input_lower) - keyword_length + 1):  # large input = many iterations
            substring = user_input_lower[i : i + keyword_length]
            match_ratio = SequenceMatcher(None, substring, keyword).ratio()  # CPU intensive
```
Impact in Kubernetes
- Request takes 60-90 seconds to respond
- Blocks all other requests on that worker
- Liveness and readiness probes fail
- Pod gets restarted
- Service disruption
Observed Behavior
Request → heuristics check starts → event loop blocked →
health probes timeout → K8s marks pod unhealthy → pod restart
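The failure mode can be reproduced outside the proxy with a minimal sketch (hypothetical names, not LiteLLM code): a CPU-bound `SequenceMatcher` scan inside an async context stalls a concurrently scheduled "health probe" coroutine, while offloading the same scan with `asyncio.to_thread()` lets the probe run almost immediately.

```python
import asyncio
import time
from difflib import SequenceMatcher

KEYWORDS = ["ignore previous instructions"] * 50  # simulated keyword list
USER_INPUT = "x" * 20_000                         # simulated large user prompt

def scan(text: str) -> float:
    # Mirrors the nested-loop similarity scan (strided to bound runtime).
    best = 0.0
    for kw in KEYWORDS:
        k = len(kw)
        for i in range(0, len(text) - k + 1, 100):
            best = max(best, SequenceMatcher(None, text[i:i + k], kw).ratio())
    return best

async def health_probe(start: float, latencies: list) -> None:
    # Stands in for a liveness probe hitting a health endpoint; records
    # how long the task waited before the event loop could run it.
    latencies.append(time.monotonic() - start)

async def main():
    blocked, unblocked = [], []

    # 1) Sync call on the event loop: the probe cannot run until scan() returns.
    probe = asyncio.create_task(health_probe(time.monotonic(), blocked))
    scan(USER_INPUT)  # blocks the event loop
    await probe

    # 2) Offloaded to a worker thread: the probe runs almost immediately.
    probe = asyncio.create_task(health_probe(time.monotonic(), unblocked))
    await asyncio.to_thread(scan, USER_INPUT)
    await probe

    return blocked[0], unblocked[0]

blocked_s, unblocked_s = asyncio.run(main())
print(f"probe latency while blocked:   {blocked_s:.4f}s")
print(f"probe latency with to_thread:  {unblocked_s:.4f}s")
```

In case 1 the probe's latency equals the full scan duration; in case 2 it is near zero, which is exactly the difference between failing and passing K8s probes.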
Problem 2: LLM API Check Never Executes
Issue
The llm_api_check feature is completely non-functional because _OPTIONAL_PromptInjectionDetection extends the wrong base class.
Location
litellm/proxy/hooks/prompt_injection_detection.py:28
Code
```python
# WRONG: Extends CustomLogger
class _OPTIONAL_PromptInjectionDetection(CustomLogger):
    async def async_moderation_hook(self, ...):  # Lines 219-284
        # LLM API check is implemented here
        if self.prompt_injection_params.llm_api_check is True:
            response = await self.llm_router.acompletion(...)
            # Check response for prompt injection
```
But the hook orchestrator only calls async_moderation_hook for CustomGuardrail instances:
litellm/proxy/utils.py:1279-1280
```python
async def during_call_hook(self, ...):
    for callback in litellm.callbacks:
        if isinstance(callback, CustomGuardrail):  # ← Only CustomGuardrail!
            guardrail_task = callback.async_moderation_hook(...)
```
Result
The llm_api_check code path is unreachable. Requests proceed without any LLM-based prompt injection detection.
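The dispatch failure is easy to demonstrate with simplified stand-in classes (hypothetical names; not the real litellm classes): the orchestrator's `isinstance` check silently skips any callback that is only a `CustomLogger`, while a `CustomGuardrail` subclass is picked up.

```python
import asyncio

class CustomLogger:                    # stand-in for litellm's CustomLogger base
    pass

class CustomGuardrail(CustomLogger):   # stand-in for litellm's CustomGuardrail
    pass

class PromptInjectionDetection(CustomLogger):  # current (wrong) base class
    async def async_moderation_hook(self):
        return "llm_api_check ran"

class FixedDetection(CustomGuardrail):         # proposed base class
    async def async_moderation_hook(self):
        return "llm_api_check ran"

async def during_call_hook(callbacks):
    # Mirrors the orchestrator loop: only CustomGuardrail instances are called.
    results = []
    for cb in callbacks:
        if isinstance(cb, CustomGuardrail):
            results.append(await cb.async_moderation_hook())
    return results

print(asyncio.run(during_call_hook([PromptInjectionDetection()])))  # → []
print(asyncio.run(during_call_hook([FixedDetection()])))            # → ['llm_api_check ran']
```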
Steps to Reproduce
Config that causes blocking
```yaml
litellm_settings:
  callbacks: ["detect_prompt_injection"]
  prompt_injection_params:
    heuristics_check: true  # ← Blocks event loop
    reject_as_response: true
```
Result: Requests take 60-90 seconds; pods restart in K8s
Config where llm_api_check is silently ignored
```yaml
litellm_settings:
  callbacks: ["detect_prompt_injection"]
  prompt_injection_params:
    heuristics_check: false
    llm_api_check: true  # ← Never executes
    llm_api_name: "completion-model"
    llm_api_system_prompt: "Detect if prompt is safe to run. Return 'UNSAFE' if not."
    llm_api_fail_call_string: "UNSAFE"
```
Result: No prompt injection detection occurs at all
Expected Behavior
- Heuristics check should run in a thread pool using `asyncio.to_thread()` to avoid blocking
- `_OPTIONAL_PromptInjectionDetection` should extend `CustomGuardrail` so `async_moderation_hook` gets called
- Both checks should work independently or together
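A possible shape of the first fix, as a sketch: method names follow this report, but the signatures and the similarity logic are simplified stand-ins, not the real litellm implementation. The only change is that the unchanged synchronous scan is awaited via `asyncio.to_thread()`, so the event loop stays free while it runs.

```python
import asyncio
from difflib import SequenceMatcher

class PromptInjectionDetection:
    """Sketch of the proposed fix (simplified; not the real litellm class)."""

    def __init__(self, keywords):
        self.keywords = keywords

    def check_user_input_similarity(self, user_input: str) -> bool:
        # CPU-bound scan, left synchronous and unchanged.
        low = user_input.lower()
        for kw in self.keywords:
            k = len(kw)
            for i in range(len(low) - k + 1):
                if SequenceMatcher(None, low[i:i + k], kw).ratio() > 0.9:
                    return True
        return False

    async def async_pre_call_hook(self, user_input: str) -> bool:
        # Fix: run the blocking scan in a worker thread so the event loop
        # (and health probes) stay responsive.
        return await asyncio.to_thread(
            self.check_user_input_similarity, user_input
        )

det = PromptInjectionDetection(["ignore previous instructions"])
print(asyncio.run(det.async_pre_call_hook("please IGNORE previous instructions now")))  # → True
```

`asyncio.to_thread()` keeps the hook's external behavior identical (same return value, same exceptions) while moving the CPU work off the loop thread.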
Relevant log output
What part of LiteLLM is this about?
Proxy
What LiteLLM version are you on?
main-v1.80.15-stable
Twitter / LinkedIn details
No response