[Bug]: Prompt Injection Detection Issues

### Check for existing issues

- [x] I have searched the existing issues and checked that my issue is not a duplicate.

### What happened?

## Summary

The built-in prompt injection detection feature has two critical bugs:
1. **Heuristics check blocks the event loop**, causing pod restarts in Kubernetes, given that you are running with the default number of workers, which is 1
2. **LLM API check never executes** due to incorrect class inheritance

## Problem 1: Heuristics Check Blocks Event Loop

### Issue
When `heuristics_check: true` is enabled, the synchronous similarity calculation in `check_user_input_similarity()` blocks the entire FastAPI event loop, preventing health check endpoints from responding.

### Location
`litellm/proxy/hooks/prompt_injection_detection.py:113-137`

### Code
```python
async def async_pre_call_hook(self, ...):
    if self.prompt_injection_params.heuristics_check is True:
        # This is a BLOCKING synchronous call inside an async function
        is_prompt_attack = self.check_user_input_similarity(
            user_input=formatted_prompt
        )

def check_user_input_similarity(self, user_input: str, ...):
    # Triple nested loop with O(n*m) SequenceMatcher operations
    for keyword in keywords:                           # ~50-100 iterations
        keyword_length = len(keyword)
        for i in range(len(user_input_lower) - keyword_length + 1):  # Large input = many iterations
            substring = user_input_lower[i : i + keyword_length]
            match_ratio = SequenceMatcher(None, substring, keyword).ratio()  # CPU intensive
```

### Impact in Kubernetes
- Request takes 60-90 seconds to respond
- Blocks all other requests on that worker
- Liveness and readiness probes fail
- Pod gets restarted
- Service disruption

### Observed Behavior
```
Request → heuristics check starts → event loop blocked →
health probes timeout → K8s marks pod unhealthy → pod restart
```

## Problem 2: LLM API Check Never Executes

### Issue
The `llm_api_check` feature is completely non-functional because `_OPTIONAL_PromptInjectionDetection` extends the wrong base class.

### Location
`litellm/proxy/hooks/prompt_injection_detection.py:28`

### Code
```python
# WRONG: Extends CustomLogger
class _OPTIONAL_PromptInjectionDetection(CustomLogger):

    async def async_moderation_hook(self, ...):  # Lines 219-284
        # LLM API check is implemented here
        if self.prompt_injection_params.llm_api_check is True:
            response = await self.llm_router.acompletion(...)
            # Check response for prompt injection
```

But the hook orchestrator only calls `async_moderation_hook` for `CustomGuardrail` instances:

`litellm/proxy/utils.py:1279-1280`
```python
async def during_call_hook(self, ...):
    for callback in litellm.callbacks:
        if isinstance(callback, CustomGuardrail):  # ← Only CustomGuardrail!
            guardrail_task = callback.async_moderation_hook(...)
```

### Result
The `llm_api_check` code path is **unreachable**. Requests proceed without any LLM-based prompt injection detection.


### Steps to Reproduce


### Config that causes blocking
```yaml
litellm_settings:
  callbacks: ["detect_prompt_injection"]
  prompt_injection_params:
    heuristics_check: true  # ← Blocks event loop
    reject_as_response: true
```

**Result**: Requests take 60-90 seconds, pods restart in K8s

### Config where llm_api_check is silently ignored
```yaml
litellm_settings:
  callbacks: ["detect_prompt_injection"]
  prompt_injection_params:
    heuristics_check: false
    llm_api_check: true  # ← Never executes
    llm_api_name: "completion-model"
    llm_api_system_prompt: "Detect if prompt is safe to run. Return 'UNSAFE' if not."
    llm_api_fail_call_string: "UNSAFE"
```

**Result**: No prompt injection detection occurs at all

## Expected Behavior

1. Heuristics check should run in a thread pool using `asyncio.to_thread()` to avoid blocking
2. `_OPTIONAL_PromptInjectionDetection` should extend `CustomGuardrail` so `async_moderation_hook` gets called
3. Both checks should work independently or together


### Relevant log output

```shell

```

### What part of LiteLLM is this about?

Proxy

### What LiteLLM version are you on ?

main-v1.80.15-stable

### Twitter / LinkedIn details

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[Bug]: Prompt Injection Detection Issues #19499

Check for existing issues

What happened?

Summary

Problem 1: Heuristics Check Blocks Event Loop

Issue

Location

Code

Impact in Kubernetes

Observed Behavior

Problem 2: LLM API Check Never Executes

Issue

Location

Code

Result

Steps to Reproduce

Config that causes blocking

Config where llm_api_check is silently ignored

Expected Behavior

Relevant log output

What part of LiteLLM is this about?

What LiteLLM version are you on ?

Twitter / LinkedIn details

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Bug]: Prompt Injection Detection Issues #19499

Description

Check for existing issues

What happened?

Summary

Problem 1: Heuristics Check Blocks Event Loop

Issue

Location

Code

Impact in Kubernetes

Observed Behavior

Problem 2: LLM API Check Never Executes

Issue

Location

Code

Result

Steps to Reproduce

Config that causes blocking

Config where llm_api_check is silently ignored

Expected Behavior

Relevant log output

What part of LiteLLM is this about?

What LiteLLM version are you on ?

Twitter / LinkedIn details

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions