fix(presidio): PII token round-trip masking/unmasking #24291

Open

firestaerter3 wants to merge 6 commits into BerriAI:main from firestaerter3:fix/presidio-pii-roundtrip-v2

Conversation

@firestaerter3

Problem

Presidio PII masking works on the input path but original values are never restored in responses. This affects both streaming and non-streaming Anthropic native API calls (/v1/messages).

Part of #22821. Supersedes #23037.
Depends on #24290 (fix/guardrail-request-data-v2).

Root causes fixed

  1. Position bug in anonymize_text — pii_tokens originals were read from new_text[start:end], where start/end come from redacted_text["items"] (anonymized coordinate space). After the first replacement, positions shift, so subsequent slices extract the wrong characters. Fix: use analyze_results coordinates (original text space) with right-to-left sorting.

  2. Lost tokens across instances — LiteLLM creates two _OPTIONAL_PresidioPIIMasking instances per request (pre_call + post_call). self.pii_tokens on the masking instance is invisible to the unmasking instance. Fix: copy pii_tokens to request_data after masking.

  3. apply_guardrail always masks — no branch for unmasking when called with input_type="response". Fix: add early-return unmask path.

  4. Native Anthropic dict responses not handled — async_post_call_success_hook only handled ModelResponse. Anthropic native responses are TypedDicts with type: "message". Fix: add an elif branch.

  5. Native Anthropic SSE bytes not handled — streaming hook only handled ModelResponseStream. Anthropic native streaming arrives as raw bytes. Fix: add carry-buffer SSE streaming unmask that preserves real-time streaming behavior.

  6. OpenAI-converted tools overwrite native format — process_input_messages() converts tools to OpenAI type: "function" format; returning them from apply_guardrail overwrites native Anthropic tools. Fix: strip tools, structured_messages, model, and images before returning.
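The position fix in root cause 1 can be sketched as follows. This is a simplified, hypothetical standalone version (not the actual presidio.py code): originals are sliced from the original text using analyze_results offsets, and replacements run right-to-left so earlier offsets are never invalidated by length changes.

```python
import uuid


def mask_with_tokens(text: str, analyze_results: list) -> tuple:
    """analyze_results: [{"entity_type", "start", "end"}, ...] with
    start/end in ORIGINAL text coordinates (illustrative shape only)."""
    pii_tokens = {}
    masked = text
    # Replace the right-most span first: all remaining spans sit to its
    # left, so their start/end offsets stay valid after each substitution.
    for ar in sorted(analyze_results, key=lambda r: r["start"], reverse=True):
        original = text[ar["start"]:ar["end"]]  # original-text slice, not anonymized
        token = f"<{ar['entity_type']}_{uuid.uuid4().hex[:12]}>"
        masked = masked[:ar["start"]] + token + masked[ar["end"]:]
        pii_tokens[token] = original
    return masked, pii_tokens
```

Slicing from `text` (not the partially rewritten string) is what avoids the coordinate-space mismatch the bug description refers to.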

Changes (all in presidio.py)

  • anonymize_text: fix position bug using right-to-left sorted analyze_results; UUID suffix inside angle brackets; store in self.pii_tokens
  • apply_guardrail: add unmask path for input_type="response"; strip OpenAI-converted keys; copy pii_tokens to request_data
  • async_post_call_success_hook: prefer pii_tokens from request_data; add Anthropic native dict branch
  • _unmask_pii_text: add fallback for stripped angle brackets
  • async_post_call_streaming_iterator_hook: peek-based format detection; carry-buffer SSE streaming for Anthropic native bytes; existing OpenAI path preserved
  • _process_response_for_pii / _process_anthropic_response_for_pii: fallback to self.pii_tokens
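A minimal sketch of the unmask side described in the list above, assuming the exact-match-then-stripped-brackets fallback order; names and structure are illustrative, not the real _unmask_pii_text implementation.

```python
import re


def unmask_pii_text(text: str, pii_tokens: dict) -> str:
    for token, original in pii_tokens.items():
        # Exact match first: the LLM echoed "<PERSON_ab12cd34ef56>" verbatim.
        if token in text:
            text = text.replace(token, original)
            continue
        # Fallback: the LLM stripped the angle brackets ("PERSON_ab12cd34ef56").
        # Boundary anchoring avoids firing inside longer identifiers.
        stripped = token.strip("<>")
        pattern = r"(?<!\w)" + re.escape(stripped) + r"(?!\w)"
        # Lambda replacement so backslashes in PII values are taken literally.
        text = re.sub(pattern, lambda _m, o=original: o, text)
    return text
```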

Test plan

New test file tests/guardrails_tests/test_presidio_pii_roundtrip.py (13 tests):

  • _unmask_pii_text exact match replacement
  • _unmask_pii_text multiple tokens
  • _unmask_pii_text stripped angle brackets (FALLBACK 1)
  • _unmask_pii_text truncated token at end of text (FALLBACK 2)
  • _unmask_pii_text no-match passthrough
  • apply_guardrail unmask path with pii_tokens
  • apply_guardrail unmask no-op when pii_tokens empty
  • apply_guardrail strips OpenAI-converted keys after masking
  • apply_guardrail stores pii_tokens in request_data
  • async_post_call_success_hook unmasks Anthropic native dict
  • Streaming SSE: PII tokens in text_delta events are unmasked
  • Streaming SSE: split token across events reassembled via carry buffer
  • Streaming SSE: passthrough when no pii_tokens

All 13 new tests pass.
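The split-token streaming test can be illustrated with a standalone carry-buffer sketch (an assumption about the general idea, not the real SSE implementation): any trailing "<" without a closing ">" is held back until the next chunk completes it.

```python
def stream_unmask(chunks, pii_tokens):
    carry = ""
    for chunk in chunks:
        buf = carry + chunk
        lt = buf.rfind("<")
        # An unclosed "<..." at the tail may be a split token: carry it over.
        if lt >= 0 and ">" not in buf[lt:]:
            carry, buf = buf[lt:], buf[:lt]
        else:
            carry = ""
        for token, original in pii_tokens.items():
            buf = buf.replace(token, original)
        if buf:  # avoid emitting empty chunks
            yield buf
    if carry:  # flush any leftover partial text at stream end
        yield carry
```

Everything before the held-back tail streams through immediately, which is how the real-time behavior is preserved.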

… handlers

Add request_data parameter to process_output_response() and
process_output_streaming_response() across the guardrail pipeline so
that guardrails can correlate input-phase state (e.g. PII tokens from
masking) with output-phase processing (e.g. unmasking).

Changes:
- BaseTranslation: declare request_data on both abstract methods
- UnifiedLLMGuardrails: forward data dict as request_data at all call sites
- All 13 concrete handlers: accept and merge request_data
- Existing unified guardrail tests updated for new signatures
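A hedged sketch of the interface change, with an illustrative concrete handler; the real BaseTranslation methods take additional parameters, so treat this as shape only.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any, Optional


class BaseTranslation(ABC):
    @abstractmethod
    async def process_output_response(
        self, response: Any, request_data: Optional[dict] = None
    ) -> Any:
        """request_data carries input-phase state (e.g. pii_tokens) forward."""


class DictResponseHandler(BaseTranslation):
    # Hypothetical handler: unmasks tokens stored under metadata.pii_tokens.
    async def process_output_response(self, response, request_data=None):
        tokens = ((request_data or {}).get("metadata") or {}).get("pii_tokens", {})
        for token, original in tokens.items():
            response = response.replace(token, original)
        return response
```

Defaulting `request_data` to None keeps existing call sites source-compatible while letting the unified guardrail forward its data dict.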

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@vercel

vercel bot commented Mar 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: litellm · Deployment: Ready · Actions: Preview, Comment · Updated (UTC): Mar 22, 2026 7:38am



@codspeed-hq
Contributor

codspeed-hq bot commented Mar 21, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing firestaerter3:fix/presidio-pii-roundtrip-v2 (2d5aa9a) with main (c89496f)

Open in CodSpeed

@greptile-apps
Contributor

greptile-apps bot commented Mar 21, 2026

Greptile Summary

This PR fixes a long-standing Presidio PII round-trip masking/unmasking failure for both streaming and non-streaming Anthropic native API calls. It addresses six root causes: an offset bug in anonymize_text that extracted wrong original values, lost tokens between the pre-call and post-call guardrail instances, missing unmask branches for input_type="response", unhandled Anthropic native dict and SSE bytes response types, and OpenAI-converted tool keys overwriting native Anthropic format. A parallel change (fix/guardrail-request-data-v2) threads request_data through all BaseTranslation handler process_output_response signatures so PII tokens survive to the unmasking phase.

Key changes:

  • anonymize_text: tokens now stored in request_data["metadata"]["pii_tokens"] with UUID-suffixed keys; processes anonymized-text coordinates right-to-left to avoid shift errors
  • _unmask_pii_text: adds word-boundary-anchored FALLBACK 1 (stripped angle brackets) and retains FALLBACK 2 (truncated suffix)
  • async_post_call_success_hook: new elif branch handles Anthropic native dict (type: "message") responses
  • async_post_call_streaming_iterator_hook: peek-based format detection; new carry-buffer SSE path for Anthropic native bytes chunks; fixes litellm.output_parse_pii global flag check
  • apply_guardrail: early-return unmask path for input_type="response"; strips OpenAI-converted keys (tools, structured_messages, model, images) after masking
  • All BaseTranslation subclasses updated to accept and forward request_data
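The per-request storage convention above can be sketched like this (hypothetical helper, with null-safe metadata access since `metadata` may arrive as None):

```python
def store_pii_token(request_data: dict, token: str, original: str) -> None:
    # "or {}" handles metadata=None without a TypeError.
    metadata = request_data.get("metadata") or {}
    metadata.setdefault("pii_tokens", {})[token] = original
    request_data["metadata"] = metadata
```

Writing into the per-request dict (rather than an instance attribute) is what keeps concurrent requests from mixing token maps.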

Issues found:

  • import json is duplicated as import json as _json on the very next line — the alias is redundant
  • The previous diagnostic warning for request_data is None in anonymize_text was silently dropped; PII tokens are now masked but never stored (and therefore never unmasked) with no log output when this edge case occurs
  • carry_data is declared Optional[dict] but dereferenced without a None-guard at the carry-flush sites (lines 1401, 1440); harmless today since carry_data/carry_lines are always set together, but fragile under future refactors
  • Several tests assign guardrail.pii_tokens = … which is dead code — the production hook no longer reads from self.pii_tokens

Confidence Score: 2/5

  • This PR should not be merged until the architectural issues raised in previous review threads are addressed alongside the new issues found here.
  • The PR correctly fixes several root causes and includes 13 new mock-only tests. However, multiple substantive concerns were raised in previous review rounds and have not been fully resolved: the entity-type pairing assumption, the per-instance vs per-request token safety in fallback paths, the apply_guardrail unmask gate ignoring the global litellm.output_parse_pii flag, the fallback _token_value storing the anonymized placeholder instead of the original PII, and the streaming path's carry-buffer carry_data/carry_data_idx NoneType fragility. Additionally the silent removal of the request_data is None diagnostic warning is a new regression found in this review. The duplicate import json alias is minor but indicates the code was not carefully cleaned up before submission.
  • litellm/proxy/guardrails/guardrail_hooks/presidio.py requires the most attention — specifically anonymize_text (token storage fallback), _unmask_pii_text (FALLBACK 1 false-positive risk), apply_guardrail (global flag gate), and the SSE carry-buffer carry_data NoneType guards.

Important Files Changed

Filename Overview
litellm/proxy/guardrails/guardrail_hooks/presidio.py Core PII round-trip fix: corrects position bug in anonymize_text, adds UUID-suffixed tokens, adds Anthropic native dict/SSE unmasking paths, and carry-buffer streaming — but silent token loss when request_data is None, duplicate import, and Optional[dict] NoneType concerns remain.
litellm/llms/base_llm/guardrail_translation/base_translation.py Adds request_data: Optional[dict] = None to both process_output_response and process_output_streaming_response abstract signatures; straightforward interface extension, no issues.
litellm/llms/anthropic/chat/guardrail_translation/handler.py Correctly threads request_data through to apply_guardrail by merging via local_request_data = {**(request_data or {}), "response": response}; preserves caller's pii_tokens without shadowing them.
litellm/llms/openai/chat/guardrail_translation/handler.py Same local_request_data merge pattern applied consistently for both non-streaming and streaming paths; clean and correct.
litellm/llms/pass_through/guardrail_translation/handler.py Previous code used the response dict directly as request_data, potentially overwriting caller metadata; fixed by always nesting response under "response" key in the merge.
litellm/proxy/guardrails/guardrail_hooks/unified_guardrail/unified_guardrail.py Passes request_data=data to all three process_output_response / process_output_streaming_response call sites; straightforward plumbing, no issues.
tests/guardrails_tests/test_presidio_pii_roundtrip.py 13 new mock-only tests covering unmask paths, SSE carry-buffer, and stripped-bracket fallback; tests pass but several guardrail.pii_tokens assignments are dead code that could mislead readers.
tests/guardrails_tests/test_guardrail_request_data_passthrough.py New signature-conformance and behavioural tests verify request_data propagation across all handler implementations; uses importlib and AsyncMock with no real network calls.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Proxy
    participant Presidio
    participant LLM
    participant PostHook

    Client->>Proxy: "POST /v1/messages with PII in prompt"
    Proxy->>Presidio: "anonymize_text(text, request_data)"
    Note over Presidio: "UUID-suffix tokens stored in request_data[metadata][pii_tokens]"
    Presidio-->>Proxy: "masked text e.g. PHONE_NUMBER_uuid"
    Proxy->>LLM: "request with masked prompt"
    LLM-->>Proxy: "response echoing masked tokens"

    alt "Non-streaming ModelResponse"
        Proxy->>PostHook: "async_post_call_success_hook"
        PostHook->>PostHook: "_process_response_for_pii mode=unmask"
        PostHook-->>Proxy: "all choices unmasked"
    else "Non-streaming Anthropic native dict"
        Proxy->>PostHook: "async_post_call_success_hook"
        PostHook->>PostHook: "_unmask_pii_text per content block"
        PostHook-->>Proxy: "unmasked dict response"
    else "Streaming Anthropic native SSE bytes"
        Proxy->>PostHook: "async_post_call_streaming_iterator_hook"
        Note over PostHook: "peek first chunk = bytes, carry-buffer SSE unmask"
        loop "SSE text_delta events"
            PostHook->>PostHook: "_unmask_pii_text combined buffer"
            PostHook-->>Client: "flush unmasked text_delta"
        end
    else "Streaming OpenAI format"
        Proxy->>PostHook: "async_post_call_streaming_iterator_hook"
        PostHook->>PostHook: "buffer all chunks then _stream_pii_unmasking"
        PostHook-->>Client: "reassembled unmasked chunks"
    end

    Proxy-->>Client: "response with original PII restored"

Last reviewed commit: "fix(presidio): compr..."

Comment on lines 935 to 953
if isinstance(response, ModelResponse) and not isinstance(
    response.choices[0], StreamingChoices
):  # /chat/completions requests
    await self._process_response_for_pii(
        response=response,
        request_data=data,
        mode="unmask",
    )
elif self._is_anthropic_message_response(response):
    await self._process_anthropic_response_for_pii(
        response=cast(dict, response), request_data=data, mode="unmask"
    )
if isinstance(response.choices[0].message.content, str):
    verbose_proxy_logger.debug(
        f"pii_tokens for unmask: {_pii_tokens}; initial response: {response.choices[0].message.content}"
    )
    response.choices[0].message.content = self._unmask_pii_text(
        response.choices[0].message.content, _pii_tokens
    )
elif (
    isinstance(response, dict)
    and response.get("type") == "message"
    and response.get("role") == "assistant"
):  # Anthropic native /v1/messages response
    for block in response.get("content") or []:
        if isinstance(block, dict) and block.get("type") == "text":
            block["text"] = self._unmask_pii_text(block.get("text", ""), _pii_tokens)
return response

P1 PII unmasking drops multi-choice, list-content, and tool-call handling

The new async_post_call_success_hook for ModelResponse only processes response.choices[0].message.content when it is a plain str. The previous implementation delegated to _process_response_for_pii, which handled:

  • All choices (not just index 0)
  • content as a list of content-blocks (e.g. multi-modal or structured outputs)
  • tool_calls arguments on every choice
  • Legacy function_call arguments

Any response that carries PII in choices[1+], in a list-typed content, or inside tool-call arguments will silently skip unmasking with this change.

_process_response_for_pii already exists and still correctly handles all these cases. Consider restoring its use here:

if isinstance(response, ModelResponse) and not isinstance(
    response.choices[0], StreamingChoices
):
    await self._process_response_for_pii(
        response=response,
        request_data=data,
        mode="unmask",
    )
elif (
    isinstance(response, dict)
    and response.get("type") == "message"
    and response.get("role") == "assistant"
):
    await self._process_anthropic_response_for_pii(
        response=cast(dict, response), request_data=data, mode="unmask"
    )
return response

            _orig_val = text[_s:_e]
            _ar_used.add(_ar_i)
            break
self.pii_tokens[replacement] = _orig_val if _orig_val is not None else new_text[start:end]

P1 self.pii_tokens is unsafe for concurrent requests

anonymize_text now writes PII tokens to self.pii_tokens (the guardrail instance attribute) instead of to the per-request request_data. If the guardrail is registered as a singleton (the standard configuration), every concurrent request writes to the same dict, causing token maps to be mixed across requests: request A could unmask with tokens from request B.

Previously, tokens were stored in request_data["metadata"]["pii_tokens"], making them strictly per-request. The move to self.pii_tokens re-introduces a shared-state race condition.

The fix that copies to request_data in apply_guardrail is correct, but only works for the apply_guardrail path. The fallback or self.pii_tokens in _process_response_for_pii (line 1052-1054) and _process_anthropic_response_for_pii (line 1007-1010) still reads from this shared instance dict:

pii_tokens = (
    (request_data.get("pii_tokens") if request_data else None)
    or self.pii_tokens   # ← shared across all concurrent requests
)

A safer approach would be to pass request_data all the way through check_pii → anonymize_text and store directly in the per-request dict there (as the old code did), or at minimum clear self.pii_tokens only within a request-scoped context.

Comment on lines +972 to +976
# FALLBACK 1: LLM stripped angle brackets (e.g. <PERSON_abc> -> PERSON_abc)
stripped = token.strip("<>")
if stripped and stripped in text:
    text = text.replace(stripped, original_text)
    continue

P2 FALLBACK 1 can produce false-positive replacements

The stripped fallback removes both angle brackets from the stored token key and does a plain str.replace across the entire text:

stripped = token.strip("<>")       # e.g.  "PERSON_abc123"
if stripped and stripped in text:
    text = text.replace(stripped, original_text)

Because the UUID suffix is only 12 hex characters (e.g. PERSON_1a2b3c4d5e6f), there is a non-trivial chance that this identifier appears verbatim in a technical response (log output, JSON keys, documentation about entity types, etc.), causing an unintended replacement. At minimum, the overlap check should require a non-whitespace boundary on both sides of the match, or the fallback should only fire when the stripped form still begins with an uppercase letter sequence followed by an underscore (i.e. a recognisable Presidio entity-type prefix).
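A quick demonstration of the false-positive risk and the suggested boundary anchoring (identifiers are made up):

```python
import re

stripped = "PERSON_1a2b3c4d5e6f"
# Token text embedded inside a longer identifier, as might appear in logs/JSON:
log_line = 'entity_id="PERSON_1a2b3c4d5e6f_archive"'

# A plain str.replace fires inside the longer identifier:
naive = log_line.replace(stripped, "Alice")

# Requiring non-word characters on both sides skips the embedded match:
pattern = r"(?<!\w)" + re.escape(stripped) + r"(?!\w)"
anchored = re.sub(pattern, "Alice", log_line)
```

Here `naive` corrupts the identifier while `anchored` leaves the line untouched, because the trailing underscore is a word character and fails the `(?!\w)` lookahead.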

Comment on lines +533 to +540
for _ar_i, _ar in enumerate(_sorted_ar):
    if _ar_i not in _ar_used and _ar_val(_ar, "entity_type") == item.get("entity_type"):
        _s = _ar_val(_ar, "start")
        _e = _ar_val(_ar, "end")
        if _s is not None and _e is not None:
            _orig_val = text[_s:_e]
            _ar_used.add(_ar_i)
            break

P2 Entity-type matching may mis-pair tokens when multiple entities share a type

The code pairs _sorted_items (anonymized-text positions) with _sorted_ar (original-text positions) by matching entity_type:

if _ar_i not in _ar_used and _ar_val(_ar, "entity_type") == item.get("entity_type"):

Both lists are sorted right-to-left by their respective start offsets, so right-most PHONE_NUMBER in the anonymized text will pair with right-most PHONE_NUMBER in the original analysis. This works when Presidio preserves entity order, but Presidio does not document a guaranteed positional correspondence between the analyzer_results list and the anonymizer items list. If two entities of the same type exist but Presidio returns them in different relative orderings in the two responses, the original values will be stored under the wrong tokens, silently corrupting the unmasking map.

Comment on lines +1482 to +1491
if input_type == "response" and self.output_parse_pii:
    pii_tokens = request_data.get("pii_tokens", {}) if request_data else {}
    if pii_tokens:
        _texts = inputs.get("texts", [])
        inputs["texts"] = [self._unmask_pii_text(t, pii_tokens) for t in _texts]
    else:
        verbose_proxy_logger.debug(
            "apply_guardrail: no pii_tokens in request_data for output unmask path"
        )
    return inputs

P2 apply_guardrail unmask path silently no-ops when output_parse_pii=False

The early-return unmask path is gated on self.output_parse_pii:

if input_type == "response" and self.output_parse_pii:
    ...
    return inputs

If a caller sets apply_to_output=True (masked output mode, not unmask mode) but also happens to pass input_type="response", this gate fails silently and falls through to the masking path below. This isn't a bug in the current code because apply_to_output and output_parse_pii are separate features, but a brief comment here clarifying "unmask path only — apply_to_output responses are handled by async_post_call_*_hook" would prevent future regressions when new contributors read this method.

firestaerter3 added a commit to firestaerter3/litellm that referenced this pull request Mar 21, 2026
- Restore _process_response_for_pii for multi-choice/tool-call/list-content
  unmasking instead of simplified inline code (P1)
- Store pii_tokens in request_data inside anonymize_text for thread safety,
  fixing shared-instance race condition on concurrent requests (P1)
- Add word-boundary anchoring to FALLBACK 1 to prevent false positives (P2)
- Add clarifying comments for entity ordering and unmask path gate (P2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment on lines 1301 to 1304
if not (self.output_parse_pii and pii_tokens):
    async for chunk in response:
        yield chunk
    return

P1 Streaming path ignores global litellm.output_parse_pii flag

The streaming unmask gate at line 1301 only checks self.output_parse_pii, while the non-streaming path at line 937 checks both self.output_parse_pii and litellm.output_parse_pii:

# non-streaming (line 937) — skips ONLY when both are False
if self.output_parse_pii is False and litellm.output_parse_pii is False:
    return response

# streaming (line 1301) — skips when instance flag is False, ignoring the global
if not (self.output_parse_pii and pii_tokens):
    ...
    return

A user who sets litellm.output_parse_pii = True globally (without per-guardrail config) will see PII unmasked in non-streaming responses but not in streaming responses. The fix is to mirror the non-streaming logic here:

Suggested change
if not (self.output_parse_pii and pii_tokens):
    async for chunk in response:
        yield chunk
    return
if not ((self.output_parse_pii or litellm.output_parse_pii) and pii_tokens):
    async for chunk in response:
        yield chunk
    return

Comment on lines +1526 to +1527
if request_data is not None and self.pii_tokens:
    request_data["pii_tokens"] = dict(self.pii_tokens)

P1 apply_guardrail overwrites per-request tokens with the full accumulated instance dict

anonymize_text already writes each token directly to request_data["pii_tokens"] as it processes entities (lines 543–546), so request_data already holds exactly the right tokens for this request.

Line 1527 then replaces that correct per-request dict with dict(self.pii_tokens). In a singleton deployment (standard), self.pii_tokens accumulates tokens from every request ever processed by this instance, so the overwrite injects tokens from other requests into request_data["pii_tokens"]. This means the post-call hook may attempt to un-mask identifiers it never masked, producing incorrect text in the response.

# The incremental write in anonymize_text already produces the right per-request map.
# This line replaces it with the entire singleton history:
request_data["pii_tokens"] = dict(self.pii_tokens)   # ← leaks cross-request tokens

A safer approach is to merge only the tokens that were produced during this call (i.e., trust what anonymize_text already wrote) rather than overwriting with the full accumulated dict:

# Only backfill if anonymize_text didn't already populate request_data
if request_data is not None and self.pii_tokens:
    if not request_data.get("pii_tokens"):
        request_data["pii_tokens"] = dict(self.pii_tokens)

Comment on lines +980 to +986
stripped = token.strip("<>")
if stripped:
    import re
    _fb_pattern = r'(?<!\w)' + re.escape(stripped) + r'(?!\w)'
    if re.search(_fb_pattern, text):
        text = re.sub(_fb_pattern, original_text, text)
        continue

P2 import re placed inside a hot loop

_unmask_pii_text is called on every streaming SSE chunk and every response post-processing. The import re statement is inside the for token loop, so it executes on every loop iteration that reaches FALLBACK 1. Although Python's import machinery caches sys.modules['re'] after the first load, the dictionary lookup still happens on each call. Move the import to module-level (or at minimum to function scope, outside the loop):

Suggested change
stripped = token.strip("<>")
if stripped:
    import re
    _fb_pattern = r'(?<!\w)' + re.escape(stripped) + r'(?!\w)'
    if re.search(_fb_pattern, text):
        text = re.sub(_fb_pattern, original_text, text)
        continue
# FALLBACK 1: LLM stripped angle brackets (e.g. <PERSON_abc> -> PERSON_abc)
# Use word-boundary anchoring to avoid false positives in technical text
stripped = token.strip("<>")
if stripped:
    _fb_pattern = r'(?<!\w)' + re.escape(stripped) + r'(?!\w)'
    if re.search(_fb_pattern, text):
        text = re.sub(_fb_pattern, original_text, text)
        continue

And add import re at the top of the module (it's already used via re.escape/re.search/re.sub, so hoisting the import is straightforward).

Comment on lines +1390 to +1404
lt_pos = unmasked.rfind("<")
if lt_pos < 0 or (len(unmasked) - lt_pos) > MAX_CARRY:
    flush_text = unmasked
    carry_text = ""
    carry_lines = None
else:
    flush_text = unmasked[:lt_pos]
    carry_text = unmasked[lt_pos:]
    carry_lines = lines
    carry_data = data
    carry_data_idx = data_idx

data["delta"]["text"] = flush_text
lines[data_idx] = "data: " + _json.dumps(data, ensure_ascii=False)
yield (("\n".join(lines)) + "\n\n").encode("utf-8")

P2 Carry-buffer may emit empty text_delta SSE events

When unmasked starts with < (e.g. a PII token begins at the very start of the combined buffer), lt_pos == 0 and flush_text = unmasked[:0] == "". An SSE event with "delta": {"type": "text_delta", "text": ""} is then yielded to the client. Some Anthropic SDK clients and proxy consumers treat an empty text as unexpected and may log warnings or behave incorrectly.

A simple guard skips the flush event when flush_text is empty:

if flush_text:  # skip emitting empty text_delta
    data["delta"]["text"] = flush_text
    lines[data_idx] = "data: " + _json.dumps(data, ensure_ascii=False)
    yield (("\n".join(lines)) + "\n\n").encode("utf-8")

When flush_text is empty the whole unmasked string should remain in the carry buffer, so carry_text should be set to unmasked (not just unmasked[0:]) in this edge case — but because lt_pos == 0, unmasked[lt_pos:] == unmasked, so carry_text is already correct.

firestaerter3 added a commit to firestaerter3/litellm that referenced this pull request Mar 21, 2026
- Store pii_tokens in request_data["metadata"]["pii_tokens"] matching
  upstream convention (was top-level "pii_tokens")
- Remove self.pii_tokens writes in anonymize_text — only store in
  request_data for thread safety on concurrent requests (P1)
- Remove apply_guardrail overwrite of request_data["pii_tokens"] from
  self.pii_tokens — tokens already stored by anonymize_text (P1)
- Add litellm.output_parse_pii check to streaming gate, matching
  non-streaming behavior (P1)
- Move import re to module level (P2)
- Update all pii_tokens read paths to use metadata.pii_tokens

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Unmask path only — applies when unified guardrail calls apply_guardrail()
# with input_type="response". The apply_to_output masking path is handled
# separately by async_post_call_*_hook, not by this method.
if input_type == "response" and self.output_parse_pii:

P1 apply_guardrail unmask gate ignores litellm.output_parse_pii

The new unmask early-return at line 1498 only checks self.output_parse_pii:

if input_type == "response" and self.output_parse_pii:

Every other unmasking gate in this file checks both the instance flag and the global flag:

  • async_post_call_success_hook (line 941): skips only when both are False
  • async_post_call_streaming_iterator_hook (line 1304): fires when self.output_parse_pii or litellm.output_parse_pii

When litellm.output_parse_pii = True (global config) but self.output_parse_pii = False (per-instance), the streaming path and the direct hook paths will unmask correctly, but the unified-guardrail apply_guardrail response path won't — the condition is False so it falls through to the masking loop and PII in the response gets anonymised a second time.

Suggested change
if input_type == "response" and self.output_parse_pii:
if input_type == "response" and (self.output_parse_pii or litellm.output_parse_pii):

            _orig_val = text[_s:_e]
            _ar_used.add(_ar_i)
            break
_token_value = _orig_val if _orig_val is not None else new_text[start:end]

P1 Fallback _token_value stores anonymized placeholder, not original PII

When no matching entry is found in _sorted_ar (analyze results), the fallback is:

_token_value = _orig_val if _orig_val is not None else new_text[start:end]

At this point, new_text is the anonymized text (redacted_text["text"]), and start:end are coordinates in that anonymized space. So new_text[start:end] yields the Presidio placeholder — e.g., <PHONE_NUMBER>, not the original value such as 555-1234.

This means pii_tokens["<PHONE_NUMBER_<uuid>>"] is set to "<PHONE_NUMBER>" instead of the real phone number. During unmasking, the LLM's echo of <PHONE_NUMBER_<uuid>> gets replaced with <PHONE_NUMBER> rather than the original PII, causing silent data loss.

The original text (text parameter) contains the correct value. A safer fallback would log a warning and skip storing any token for that item (leaving the placeholder in the LLM response is no worse than substituting a different placeholder):

# No matching analyze_results entry found — cannot recover the original value.
# Skip storing a token rather than storing the wrong placeholder.
if _orig_val is None:
    verbose_proxy_logger.warning(
        "anonymize_text: no analyze_results entry for entity_type=%s; "
        "original value cannot be recovered for this token.",
        item.get("entity_type"),
    )
else:
    if request_data is not None:
        ...
        request_data["metadata"]["pii_tokens"][replacement] = _orig_val

This edge case occurs when Presidio's items list contains more entries of a given entity type than the corresponding analyze_results entries (e.g., version mismatch or partial analyzer response).

firestaerter3 and others added 5 commits March 22, 2026 08:23
- Run black formatter on all modified files (fixes CI lint failure)
- Fix pass-through handler to always nest response under "response" key,
  preventing silent overwrite of request_data keys
- Add OCR + MCP handlers to conformance test
- Add streaming signature conformance test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix 6 bugs preventing PII unmasking from working in Presidio guardrail:

1. Position bug in anonymize_text — original values were read from
   anonymized-text coordinates; now uses analyze_results coordinates
2. Lost tokens across instances — pii_tokens stored in request_data
   for cross-instance access (two instances per request)
3. apply_guardrail always masked — add unmask path when input_type
   is "response" and output_parse_pii is enabled
4. Native Anthropic dict responses not handled — add elif branch for
   AnthropicMessagesResponse TypedDict (type:"message")
5. Native Anthropic SSE bytes not handled — add carry-buffer streaming
   unmask that preserves real-time streaming behavior
6. OpenAI-converted tools overwrite native format — strip tools,
   structured_messages, model, images keys from apply_guardrail return

Depends on: fix/guardrail-request-data-v2 (request_data plumbing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Restore _process_response_for_pii for multi-choice/tool-call/list-content
  unmasking instead of simplified inline code (P1)
- Store pii_tokens in request_data inside anonymize_text for thread safety,
  fixing shared-instance race condition on concurrent requests (P1)
- Add word-boundary anchoring to FALLBACK 1 to prevent false positives (P2)
- Add clarifying comments for entity ordering and unmask path gate (P2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
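The word-boundary anchoring mentioned above (P2) matters because an unanchored substring search can match inside a longer value. A minimal illustration with made-up values (this is not the PR's FALLBACK 1 code):

```python
import re

pii_value = "12345"
token = "<PHONE_NUMBER_ab12cd34>"
text = "ID 612345 and phone 12345"

# Unanchored search also matches inside "612345" -- a false positive.
unanchored = re.sub(re.escape(pii_value), token, text)

# \b anchors restrict the match to the standalone value.
anchored = re.sub(r"\b" + re.escape(pii_value) + r"\b", token, text)
```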
- Store pii_tokens in request_data["metadata"]["pii_tokens"] matching
  upstream convention (was top-level "pii_tokens")
- Remove self.pii_tokens writes in anonymize_text — only store in
  request_data for thread safety on concurrent requests (P1)
- Remove apply_guardrail overwrite of request_data["pii_tokens"] from
  self.pii_tokens — tokens already stored by anonymize_text (P1)
- Add litellm.output_parse_pii check to streaming gate, matching
  non-streaming behavior (P1)
- Move import re to module level (P2)
- Update all pii_tokens read paths to use metadata.pii_tokens

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- F1: Null-safe metadata access — (data.get('metadata') or {}) handles
  metadata=None without TypeError (fixes test_metadata_none_does_not_crash)
- F2: Backslash-safe re.sub in FALLBACK 1 — use lambda replacement to
  prevent backslash interpretation in PII values
- F3: No empty text_delta SSE events — skip yield when flush_text is empty
- F4: Consistent output_parse_pii — apply_guardrail unmask gate now checks
  litellm.output_parse_pii matching all other code paths
- F5: Remove dead self.pii_tokens fallbacks — pii_tokens are per-request
  only, stored in request_data['metadata']['pii_tokens']
- F6: Remove duplicate import json (keep import json as _json)
- F7: Run black formatter (fixes CI lint failure)
- T1: Add test for metadata=None handling
- T2: Add test for backslash in PII values

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
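F2 can be demonstrated in isolation: with a plain replacement string, re.sub parses backslashes in the PII value as escape sequences, whereas a callable replacement inserts the value verbatim. The token and value below are made up for illustration:

```python
import re

token = "<PERSON_1a2b3c4d>"
value = r"C:\Users\jane"  # PII value containing backslashes

text = "Path for <PERSON_1a2b3c4d> is ready"

# Plain replacement string: re.sub tries to parse "\U" in the value
# as an escape sequence and raises re.error ("bad escape").
try:
    re.sub(re.escape(token), value, text)
    raised = False
except re.error:
    raised = True

# Lambda replacement: the value is returned verbatim, no escape parsing.
safe = re.sub(re.escape(token), lambda m: value, text)
```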
Comment on lines 12 to +13
import json
import json as _json

P2 Redundant import json alias

import json on line 11 and import json as _json on line 12 both bind to the same module object. The _json alias is used exclusively in the SSE streaming path (e.g. _json.loads, _json.dumps, _json.JSONDecodeError). The existing json name imported above is sufficient; introducing a second alias only adds noise and could confuse readers.

Suggested change
- import json
- import json as _json
+ import json

Then replace every _json. usage in the streaming code with json..


Comment on lines +557 to +564
if request_data is not None:
    if "metadata" not in request_data:
        request_data["metadata"] = {}
    if "pii_tokens" not in request_data["metadata"]:
        request_data["metadata"]["pii_tokens"] = {}
    request_data["metadata"]["pii_tokens"][
        replacement
    ] = _token_value

P1 Silent PII token loss when request_data is None

The previous code explicitly warned when request_data was None and created an empty dict as a fallback:

# old code
if request_data is None:
    verbose_proxy_logger.warning(
        "Presidio anonymize_text called without request_data — "
        "PII tokens cannot be stored per-request. "
        "This may indicate a missing caller update."
    )
    request_data = {}

The new guard silently skips storing the token without any diagnostic message:

if request_data is not None:
    ...
    request_data["metadata"]["pii_tokens"][replacement] = _token_value

When a caller provides request_data=None, PII will be replaced with UUID-suffixed tokens in the prompt but the reverse mapping is never recorded anywhere. The LLM will see <PHONE_NUMBER_1a2b3c4d-e29> tokens that can never be unmasked, and operators get no log indication of why. The warning should be restored to preserve debuggability:

if request_data is None:
    verbose_proxy_logger.warning(
        "Presidio anonymize_text called without request_data — "
        "PII tokens cannot be stored per-request and will not be unmasked."
    )
else:
    if "metadata" not in request_data:
        request_data["metadata"] = {}
    if "pii_tokens" not in request_data["metadata"]:
        request_data["metadata"]["pii_tokens"] = {}
    request_data["metadata"]["pii_tokens"][replacement] = _token_value

Comment on lines +1397 to +1408
if carry_text and carry_lines is not None:
    unmasked_carry = self._unmask_pii_text(
        carry_text, pii_tokens
    )
    carry_data["delta"]["text"] = unmasked_carry
    carry_lines[carry_data_idx] = "data: " + _json.dumps(
        carry_data, ensure_ascii=False
    )
    yield ("\n".join(carry_lines) + "\n\n").encode("utf-8")
    carry_text = ""
    carry_lines = None
yield (event_str + "\n\n").encode("utf-8")

P1 carry_data dereferenced without None-guard even though declared Optional[dict]

carry_data is declared as Optional[dict] and initialized to None. The condition if carry_text and carry_lines is not None does not check carry_data is not None, yet line 1401 immediately dereferences it:

carry_data["delta"]["text"] = unmasked_carry

In practice carry_data and carry_lines are always set and cleared together, so this won't crash under the current logic. However, any future refactor that sets carry_lines without carry_data (or vice-versa) will produce an unguarded TypeError. The same pattern recurs at line 1440.

A minimal fix is to add and carry_data is not None to both guards:

if carry_text and carry_lines is not None and carry_data is not None:

This also satisfies static-analysis tools that would otherwise flag these dereferences as None.
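For readers unfamiliar with the carry-buffer technique this comment discusses, here is a stripped-down sketch of the idea — plain text chunks instead of SSE framing, names illustrative: hold back a trailing partial token until the next chunk either completes it or proves it was not a token.

```python
def unmask_stream(chunks, pii_tokens):
    """Unmask tokens that may be split across stream chunks."""
    carry = ""
    # Longest token bounds how much trailing text could be a partial token.
    max_token_len = max(len(t) for t in pii_tokens)
    for chunk in chunks:
        text = carry + chunk
        carry = ""
        # A trailing unterminated "<..." may be a token split across
        # chunks; carry it forward instead of emitting it now.
        lt = text.rfind("<")
        if lt != -1 and ">" not in text[lt:] and len(text) - lt <= max_token_len:
            carry, text = text[lt:], text[:lt]
        for token, original in pii_tokens.items():
            text = text.replace(token, original)
        if text:  # mirror F3: skip empty emissions
            yield text
    if carry:  # flush leftover partial text at stream end
        yield carry

chunks = ["Hello <PERSON_1a", "2b3c4d>, welcome"]
out = "".join(unmask_stream(chunks, {"<PERSON_1a2b3c4d>": "Jane"}))
```

In the real SSE path the carry additionally keeps the surrounding event lines (carry_lines/carry_data) so the flushed event can be re-serialized, which is where the None-guard above applies.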

Comment on lines +281 to +284

@pytest.mark.asyncio
async def test_streaming_sse_split_token_carry_buffer(guardrail):
"""Token split across two SSE events is correctly reassembled via carry buffer."""

P2 guardrail.pii_tokens assignment has no effect on the test

test_streaming_anthropic_sse_unmask, test_streaming_sse_split_token_carry_buffer, and test_post_call_success_hook_anthropic_native_dict all set guardrail.pii_tokens = {…} before calling the hook. However, the production code no longer reads from self.pii_tokens — it exclusively reads from request_data["metadata"]["pii_tokens"], which the tests also provide correctly.

The dead assignments create a misleading impression that self.pii_tokens is still the authoritative token store. They should be removed to avoid confusion:

Suggested change
@pytest.mark.asyncio
async def test_streaming_sse_split_token_carry_buffer(guardrail):
    """Token split across two SSE events is correctly reassembled via carry buffer."""
    guardrail.output_parse_pii = True

(Remove the guardrail.pii_tokens = … line; the test already passes via request_data.) This pattern appears at the same relative position in the three streaming/hook tests.
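A minimal sketch of the pattern the tests should rely on — tokens supplied only through request_data, with a hypothetical stand-in for the guardrail's _unmask_pii_text:

```python
def unmask_pii_text(text, pii_tokens):
    # Hypothetical stand-in for _unmask_pii_text:
    # straight token -> original replacement.
    for token, original in pii_tokens.items():
        text = text.replace(token, original)
    return text

# Tokens travel in request_data["metadata"]["pii_tokens"] per the
# upstream convention; the (... or {}) access mirrors the F1 null-safe
# handling of metadata=None.
request_data = {"metadata": {"pii_tokens": {"<EMAIL_deadbeef>": "a@b.com"}}}
pii_tokens = (request_data.get("metadata") or {}).get("pii_tokens", {})
result = unmask_pii_text("Contact <EMAIL_deadbeef> today", pii_tokens)
```

Setting self.pii_tokens on the guardrail instance would have no effect on this flow, which is exactly why the dead assignments are misleading.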
