fix(presidio): PII token round-trip masking/unmasking for Anthropic native API #23037
firestaerter3 wants to merge 8 commits into BerriAI:main from
Conversation
Greptile Summary

This PR fixes five related root causes that prevented Presidio PII tokens from being restored in Anthropic native API responses: a position offset bug in anonymize_text, cross-instance token loss, a missing apply_guardrail unmask path, and unhandled non-streaming and streaming Anthropic native responses.

Key concerns:
Confidence Score: 2/5
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/presidio.py | Large multi-fix PR adding Anthropic native API PII round-trip unmasking: fixes position bug in anonymize_text, cross-instance token sharing, apply_guardrail unmask path, and non-streaming/streaming Anthropic native response handling. Issues found: incorrect fallback value in anonymize_text stores the Presidio token instead of the original text; FALLBACK 2 in _unmask_pii_text can produce false positives on UUID-like substrings in tool outputs; Anthropic-specific SSE format hardcoded outside llms/, violating project policy; no tests added despite all test-plan items being unchecked. |
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant LiteLLM as LiteLLM Proxy
    participant Presidio as Presidio Guardrail
    participant LLM as Anthropic / OpenAI
    Client->>LiteLLM: POST /v1/messages (contains PII)
    LiteLLM->>Presidio: async_pre_call_hook / apply_guardrail(request)
    Presidio->>Presidio: analyze_text() → analyze_results
    Presidio->>Presidio: anonymize_text()<br/>• sort items right-to-left<br/>• pair with analyze_results for original values<br/>• build UUID-suffixed tokens<br/>• store in self.pii_tokens
    Presidio->>LiteLLM: apply_guardrail copies pii_tokens → request_data["pii_tokens"]
    LiteLLM->>LLM: Masked request (PII replaced with tokens)
    alt Non-streaming response
        LLM-->>LiteLLM: ModelResponse / AnthropicMessagesResponse dict
        LiteLLM->>Presidio: async_post_call_success_hook
        Presidio->>Presidio: _unmask_pii_text()<br/>• exact match<br/>• FALLBACK 1: strip angle brackets<br/>• FALLBACK 2: UUID suffix only<br/>• FALLBACK 3: truncated token
        Presidio-->>LiteLLM: Unmasked response
    else Streaming response (OpenAI format)
        LLM-->>LiteLLM: ModelResponseStream chunks
        LiteLLM->>Presidio: async_post_call_streaming_iterator_hook
        Presidio->>Presidio: Buffer → assemble → unmask → re-stream
        Presidio-->>LiteLLM: Unmasked ModelResponseStream chunks
    else Streaming response (Anthropic native SSE bytes)
        LLM-->>LiteLLM: bytes SSE chunks
        LiteLLM->>Presidio: async_post_call_streaming_iterator_hook
        Presidio->>Presidio: Buffer bytes → decode → parse SSE events<br/>→ extract text_delta fragments<br/>→ unmask full text<br/>→ redistribute across events<br/>→ re-encode per-event
        Presidio-->>LiteLLM: Unmasked SSE bytes (per-event)
    end
    LiteLLM-->>Client: Unmasked response
```
Comments Outside Diff (1)
- litellm/proxy/guardrails/guardrail_hooks/presidio.py, lines 1-10 (link): No automated tests added for new functionality

  The PR description lists five test-plan items (non-streaming OpenAI, non-streaming Anthropic native, streaming OpenAI, streaming Anthropic native, multiple PII entities), but all checkboxes are unchecked and no new test file is included in the changeset. The project's test file for Presidio already exists at tests/test_litellm/proxy/guardrails/guardrail_hooks/test_presidio.py and uses mocks, so network access is not required. Per the project guideline, a PR claiming to fix issues should include evidence that the issue is resolved (passing tests or an explicit description of the verification performed). The four new code paths — Anthropic native dict unmasking, Anthropic native SSE streaming unmasking, the apply_guardrail response path, and the anonymize_text position fix — each need at least one unit test before merging to prevent regressions.

  Rule Used: What: Ensure that any PR claiming to fix an issue ... (source)
Last reviewed commit: 86f940b
```python
# When output_parse_pii is False, new_text == redacted_text["text"]
# because no UUID suffix is appended.
return new_text
return redacted_text["text"]
```
UUID-suffixed tokens are built but then discarded
The function carefully builds new_text with potentially UUID-suffixed replacement tokens (lines 496–514) and stores those suffixed tokens as keys in self.pii_tokens. However, line 520 returns redacted_text["text"] — Presidio's original output — which never contains those UUID-suffixed tokens.
This means when the LLM echoes back <PHONE_NUMBER> (from Presidio's output), _unmask_pii_text searches for <PHONE_NUMBER>some-uuid-here> (the key stored in self.pii_tokens), finds nothing, and leaves the second entity permanently masked.
When two or more entities of the same type (e.g. two phone numbers, both producing <PHONE_NUMBER>) appear in a message, only the first can ever be correctly unmasked.
The fix is to return new_text instead:
```diff
- return redacted_text["text"]
+ return new_text
```
Rule Used: What: Ensure that any PR claiming to fix an issue ... (source)
new_text is not discarded — it is returned on line 527 (return new_text, ...). The function builds UUID-suffixed tokens into new_text and returns that version, not redacted_text["text"] (which has bare tokens like <PHONE_NUMBER> without UUID suffixes). This is intentional: the UUID suffix makes each token unique so str.replace() during unmasking replaces the correct occurrence.
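To make the design rationale concrete, here is a minimal sketch (the token strings and phone values are invented) of why unique suffixes make str.replace()-based unmasking unambiguous when two entities share a type:

```python
# Two PHONE_NUMBER entities: with bare <PHONE_NUMBER> tokens both would map
# to one dict key and the second original value would be lost. UUID-suffixed
# tokens keep each occurrence distinct.
tokens_unique = {
    "<PHONE_NUMBER_aaaa1111>": "555-0100",
    "<PHONE_NUMBER_bbbb2222>": "555-0199",
}

masked = "Call <PHONE_NUMBER_aaaa1111> or <PHONE_NUMBER_bbbb2222>."
unmasked = masked
for token, original in tokens_unique.items():
    # Each token occurs exactly once, so replacement is unambiguous.
    unmasked = unmasked.replace(token, original)

print(unmasked)  # Call 555-0100 or 555-0199.
```
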
```python
# 6. Rebuild SSE and yield as bytes
rebuilt = "\n\n".join("\n".join(pe["lines"]) for pe in parsed)
yield rebuilt.encode("utf-8")
return
```
Entire reassembled SSE stream is yielded as a single chunk
The code buffers all incoming chunks (lines 1125–1131), parses and unmasks them, then rebuilds the entire SSE stream and yields it as one bytes object (lines 1198–1199).
Any streaming client (UI, SDK, or proxy layer) that expects individual SSE events will receive all data in a single large chunk. This changes the incremental delivery contract that streaming clients rely on and could cause buffering issues until the entire response is received, defeating the purpose of streaming.
A straightforward fix is to yield each rebuilt event individually:
```python
# 6. Rebuild SSE and yield per-event as bytes
for pe in parsed:
    event_str = "\n".join(pe["lines"])
    if event_str.strip():
        yield (event_str + "\n\n").encode("utf-8")
return
```

Rule Used: What: Ensure that any PR claiming to fix an issue ... (source)
Fixed — the no-change early-exit path now yields the original chunks individually instead of the concatenated blob. The per-event yield path (step 6) was already correct.
firestaerter3 force-pushed from 95a712e to 245e40a
```python
if is_anthropic_native:
    # --- Anthropic native SSE unmask path ---
    import json as _json
```
import json as _json is placed inside the hot async_post_call_streaming_iterator_hook function body (line 1138). The json module is already imported at the file's module level (line 12), and Python caches imports after the first load. Moving the import to the top-level imports block and using the standard module reference will improve clarity without any performance cost.
```diff
- import json as _json
+ # Remove line 1138: import json as _json
+ # Add to module-level imports (already present at line 12): import json
+ # Then replace _json references with json in the streaming hook
```
Fixed in a33b7ae — removed the _json alias entirely and use json directly.
```python
for _ar_i, _ar in enumerate(_sorted_ar):
    if _ar_i not in _ar_used and _ar_val(_ar, "entity_type") == item.get("entity_type"):
        _s = _ar_val(_ar, "start")
        _e = _ar_val(_ar, "end")
        if _s is not None and _e is not None:
            _orig_val = text[_s:_e]
            _ar_used.add(_ar_i)
            break
```
The entity-type matching logic in _sorted_ar does not account for the possibility of multiple PII entities of the same type arriving in redacted_text["items"] in a different order than _sorted_ar (the analyzer results sorted right-to-left).
Failure scenario:
- Input contains two phone numbers: A at position 10, B at position 26 (both entity_type="PHONE_NUMBER")
- _sorted_ar (right-to-left): [B (start=26), A (start=10)]
- redacted_text["items"] (order from anonymizer): [A, B] or any other order
- When processing item A from redacted_text["items"], the inner loop matches against _sorted_ar[0] first (which is B)
- self.pii_tokens[replacement_for_A] is set to B's original value — incorrect

The fix is to match entities more precisely — not just by entity_type, but also by verifying position proximity or consuming _sorted_ar entries in the exact order they appear in redacted_text["items"] to ensure correct 1:1 pairing.
The _ar_used set handles this correctly. When multiple entities share the same type (e.g. two PHONE_NUMBER), the right-to-left sort order of both _sorted_ar and _sorted_items ensures they are consumed in matching pairs. Each consumed _sorted_ar index is added to _ar_used so it is never reused. This is tested with the multi-entity round-trip test case (two names + two phone numbers in a single prompt).
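A minimal sketch of the paired right-to-left consumption described in the reply (data shapes simplified, positions invented). When both lists are sorted by descending start, each item consumes its own analyzer match and the used-index set prevents reuse:

```python
# Two entities of the same type, both lists sorted right-to-left by start.
analyze_results = [  # original-text coordinates
    {"entity_type": "PHONE_NUMBER", "start": 26, "end": 34},  # B
    {"entity_type": "PHONE_NUMBER", "start": 10, "end": 18},  # A
]
items = [  # anonymizer items, consumed in the same right-to-left order
    {"entity_type": "PHONE_NUMBER", "start": 26},  # B first
    {"entity_type": "PHONE_NUMBER", "start": 10},  # A second
]

used = set()
pairs = []
for item in items:
    for i, ar in enumerate(analyze_results):
        if i not in used and ar["entity_type"] == item["entity_type"]:
            used.add(i)  # never match the same analyzer result twice
            pairs.append((item["start"], ar["start"]))
            break

print(pairs)  # [(26, 26), (10, 10)] — each item pairs with its own entity
```
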
```python
for item in redacted_text["items"]:
    start = item["start"]
    end = item["end"]
    replacement = item["text"]  # replacement token
    if item["operator"] == "replace" and output_parse_pii is True:
        # check if token in dict
        # if exists, add a uuid to the replacement token for swapping back
        # to the original text in llm response output parsing
        if request_data is None:
            verbose_proxy_logger.warning(
                "Presidio anonymize_text called without request_data — "
                "PII tokens cannot be stored per-request. "
                "This may indicate a missing caller update."
            )
            request_data = {}
        if "pii_tokens" not in request_data:
            request_data["pii_tokens"] = {}
        pii_tokens = request_data["pii_tokens"]

        # Always append a UUID to ensure the replacement token is unique to this request and session.
        # This prevents collisions where the LLM might hallucinate a generic token like [PHONE_NUMBER].
        # Always append UUID to ensure uniqueness — prevents
        # str.replace() collisions when multiple entities share
        # the same type (e.g. two <PHONE_NUMBER> tokens).
        replacement = f"{replacement}_{str(uuid.uuid4())[:12]}"

        pii_tokens[replacement] = new_text[
            start:end
        ]  # get text it'll replace
        # Use analyze_results (original text positions), not item
        # positions (anonymized text). Presidio's anonymizer returns
        # start/end in the *anonymized* coordinate space, not the
        # original. Match by entity_type consuming right-to-left.
        _orig_val = None
        for _ar_i, _ar in enumerate(_sorted_ar):
            if _ar_i not in _ar_used and _ar_val(_ar, "entity_type") == item.get("entity_type"):
                _s = _ar_val(_ar, "start")
                _e = _ar_val(_ar, "end")
                if _s is not None and _e is not None:
                    _orig_val = text[_s:_e]
                    _ar_used.add(_ar_i)
                    break
        self.pii_tokens[replacement] = _orig_val if _orig_val is not None else new_text[start:end]

        new_text = new_text[:start] + replacement + new_text[end:]
```
The in-place string substitution at line 514 (new_text = new_text[:start] + replacement + new_text[end:]) is only position-safe when items are processed strictly right-to-left (descending start order).
The code comment at lines 477–478 asserts that "redacted_text["items"] is also right-to-left" per Presidio's behavior, but this is never enforced — there is no explicit sort of the loop's items before processing. If Presidio's anonymizer returns items in any other order, position offsets will drift after the first replacement, resulting in garbled masked tokens.
Example:
- new_text = "Hello <PHONE> World <EMAIL>" (lengths: 5, 5)
- Items left-to-right: [item at offset 6 (PHONE→PHONE_UUID), item at offset 20 (EMAIL→EMAIL_UUID)]
- After item 1: new_text's length increases; item 2's offset 20 no longer points to EMAIL.
The loop should explicitly sort redacted_text["items"] by descending start before processing, matching the defensive pattern already applied to _sorted_ar.
There was a problem hiding this comment.
Correct — this is exactly why _sorted_items is sorted right-to-left (key=lambda x: -(x.get("start") or 0)). Processing from the end of the string backwards means earlier positions are never invalidated by substitutions at later positions. This is a standard technique for in-place string edits by position.
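A standalone sketch of the descending-start technique (the example string and token names are invented): edits at higher offsets never shift the offsets of spans that come earlier in the string.

```python
# In-place positional replacement is only safe right-to-left: substituting
# the rightmost span first leaves all earlier (start, end) pairs valid.
text = "Hello <PHONE> World <EMAIL>"
spans = [(6, 13, "<PHONE_abc123>"), (20, 27, "<EMAIL_def456>")]

# Process in descending start order, as the guardrail does.
for start, end, replacement in sorted(spans, key=lambda s: -s[0]):
    text = text[:start] + replacement + text[end:]

print(text)  # Hello <PHONE_abc123> World <EMAIL_def456>
```

Processing the same spans left-to-right would corrupt the second span, because the first substitution lengthens the string and offset 20 no longer points at `<EMAIL>`.
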
…pii_tokens

anonymize_text() read PII token original values from new_text[start:end] where start/end are from redacted_text["items"] — these positions are in the anonymized coordinate space. After the first replacement, new_text shifts and subsequent slices extract wrong characters, causing garbled unmasked output. Fix: sort analyze_results right-to-left, match each anonymizer item to an analyzer result by entity_type, and read text[orig_start:orig_end] from the original text (pre-anonymization). Store the mapping in self.pii_tokens (removing the now-unnecessary request_data parameter).

Two _OPTIONAL_PresidioPIIMasking instances are created per request (one pre_call, one post_call). self.pii_tokens on the masking instance is not visible on the unmasking instance — tokens were silently lost. Fix: after masking, copy self.pii_tokens into request_data["pii_tokens"], which is the shared context dict passed through the full call lifecycle. Also strip OpenAI-format tool keys from the apply_guardrail return value to prevent native Anthropic tool definitions being overwritten.

apply_guardrail() previously always masked regardless of input_type. When called with input_type="response" and output_parse_pii=True, restore original values from request_data["pii_tokens"] instead of masking.

…ccess hook

async_post_call_success_hook only handled ModelResponse. Anthropic native API responses arrive as AnthropicMessagesResponse TypedDicts (type:"message") and passed through without PII unmasking. Add an elif branch iterating content blocks for type:"text". Also read pii_tokens from request_data first (stored during pre_call masking).

…r hook

async_post_call_streaming_iterator_hook only handled ModelResponseStream. Anthropic native SSE (/v1/messages with stream:true) arrives as raw bytes and passed through without PII unmasking. Buffer all chunks and detect format via isinstance(chunk, bytes). For Anthropic native: join bytes, parse SSE events, locate text_delta events, assemble full text, unmask tokens, redistribute across original events preserving structure, rebuild, and yield as bytes. For OpenAI format: existing ModelResponseStream path unchanged.
firestaerter3 force-pushed from 245e40a to 930dedb

```diff
@@ -899,6 +912,14 @@ async def async_post_call_success_hook(  # type: ignore
         request_data=data,
         mode="unmask",
     )
```
_pii_tokens fallback not applied to the OpenAI/ModelResponse unmask path
_pii_tokens is computed with a self.pii_tokens fallback on line 906, but it is only used in the Anthropic native elif branch (line 922). The _process_response_for_pii call on line 910 ignores _pii_tokens entirely — it internally calls request_data.get("pii_tokens", {}) with no fallback to self.pii_tokens.
This means that if request_data["pii_tokens"] is somehow absent (e.g. the pre-call apply_guardrail path was not hit), the OpenAI/ModelResponse path silently skips unmasking while the Anthropic native path would still attempt to unmask via the self.pii_tokens fallback. Both paths should use the same token resolution strategy. You could pass _pii_tokens to _process_response_for_pii, or pre-populate data["pii_tokens"] with _pii_tokens before calling either branch:
```python
_pii_tokens = data.get("pii_tokens") or self.pii_tokens
if _pii_tokens:
    data["pii_tokens"] = _pii_tokens  # ensure _process_response_for_pii sees the same tokens
if isinstance(response, ModelResponse) and not isinstance(
    response.choices[0], StreamingChoices
):
    await self._process_response_for_pii(
        response=response,
        request_data=data,
        mode="unmask",
    )
elif (
    isinstance(response, dict)
    and response.get("type") == "message"
    and response.get("role") == "assistant"
):
    for block in response.get("content") or []:
        if isinstance(block, dict) and block.get("type") == "text":
            block["text"] = self._unmask_pii_text(block.get("text", ""), _pii_tokens)
```
Good catch — _process_response_for_pii was only reading request_data.get("pii_tokens") without the self.pii_tokens fallback that the caller computes. Fixed: the helper now falls back to self.pii_tokens when request_data has no tokens.
```python
# 4. If nothing changed, yield original bytes unchanged
if unmasked == full_text:
    yield raw
    return
```
Buffered stream yielded as a single chunk in the no-change early-exit path
When unmasked == full_text (no PII tokens were found in the response text), raw — the entire buffered SSE stream joined into one bytes object — is yielded as a single chunk. This means streaming clients still receive the entire response body at once rather than incrementally, even though no unmasking was performed.
Since the original all_chunks list still holds the individual chunk boundaries, yielding them directly preserves the original streaming granularity:
```python
# 4. If nothing changed, yield original chunks unchanged
if unmasked == full_text:
    for c in all_chunks:
        yield c
    return
```
Fixed — same root cause as the other single-chunk comment. The early-exit path now yields per original chunk.
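A simplified sketch of the corrected early-exit behavior (the helper name and signature are invented, not the actual hook): buffer the async byte stream, and when the assembled text needs no rewriting, re-yield the original chunks so the client keeps the original streaming granularity.

```python
import asyncio

# Hypothetical helper: buffer chunks, check whether rewriting changed
# anything, and preserve chunk boundaries on the no-change path.
async def yield_preserving_chunks(stream, rewrite):
    all_chunks = [chunk async for chunk in stream]
    full_text = b"".join(all_chunks).decode("utf-8", errors="replace")
    unmasked = rewrite(full_text)
    if unmasked == full_text:
        for c in all_chunks:  # no change: keep original chunk boundaries
            yield c
        return
    yield unmasked.encode("utf-8")  # changed: single rewritten payload

async def _demo():
    async def source():
        for part in (b"data: a\n\n", b"data: b\n\n"):
            yield part
    return [c async for c in yield_preserving_chunks(source(), lambda t: t)]

chunks_out = asyncio.run(_demo())
print(chunks_out)  # two separate chunks, boundaries preserved
```
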
```python
# Strip keys that process_input_messages() converted to OpenAI format.
# Returning them overwrites native Anthropic tools with type:"function".
for _k in ("tools", "structured_messages", "model", "images"):
    inputs.pop(_k, None)
```
Silently stripping tools/model/images keys may break existing callers
apply_guardrail is a public interface used by the UI and potentially other callers. Removing tools, structured_messages, model, and images from the returned inputs dict is a backwards-incompatible change: any caller that passes these keys in and expects them to be preserved in the return value will silently lose them after this PR.
Per the project's backwards-compatibility policy, behavioural changes to shared interfaces should be gated behind a feature flag rather than applied unconditionally. If this stripping is truly needed only for the Anthropic native tool case, consider narrowing the condition (e.g. only strip when the incoming request is detected as an Anthropic native call) or adding a guard flag.
These keys are stripped intentionally. When apply_guardrail is called for Anthropic native responses, request_data contains tools in Anthropic format (type: "auto", etc.) and model as a string. The guardrail framework passes request_data through litellm.completion() internally, which expects OpenAI-format tool defs (type: "function") and breaks on Anthropic-native ones. Stripping them prevents type-corruption errors downstream. images is stripped for the same reason — it is not a valid litellm.completion() kwarg and causes TypeError.
```diff
@@ -10,6 +10,7 @@
 import asyncio
 import json
+import json as _json
```
Good catch — leftover from an earlier refactor where I aliased it to avoid a naming conflict. Removed in a33b7ae.
Address review feedback from @krrishdholakia — remove the redundant `import json as _json` alias and use `json` directly throughout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```python
# 5. Redistribute unmasked text across original text_delta events
n = len(text_event_indices)
if n > 0:
    chunk_size = max(1, len(unmasked) // n)
    pos = 0
    for k, evt_idx in enumerate(text_event_indices):
        pe = parsed[evt_idx]
        if k < n - 1:
            fragment = unmasked[pos:pos + chunk_size]
            pos += chunk_size
        else:
            fragment = unmasked[pos:]
        pe["data"]["delta"]["text"] = fragment
        pe["lines"][pe["data_idx"]] = "data: " + json.dumps(
            pe["data"], ensure_ascii=False
        )
```
Equal-size redistribution may split multi-byte / multi-codepoint sequences
The unmasked text is chopped into equal-size chunks using integer division:
```python
chunk_size = max(1, len(unmasked) // n)
```

Because len(unmasked) counts Unicode code points, a cut boundary could land in the middle of a grapheme cluster or combining-character sequence. Additionally, when the unmasked string is shorter than the number of text-delta events (e.g. len(unmasked) < n), chunk_size becomes 1 and the last event receives the rest; the first len(unmasked) events each get a single character while the remaining events get empty strings. Some SSE clients treat an empty text delta differently from a missing one.
A safer redistribution would respect original fragment boundaries as much as possible — for example, only modifying events that originally contained a PII token — or at minimum ensure no empty-fragment events are emitted.
Acknowledged — theoretically possible if a PII replacement contains multi-byte characters and the split lands mid-codepoint. In practice the tokens are ASCII UUID strings and the original values are typically Latin names/numbers, so the risk is very low. Will keep this in mind for a follow-up if it causes issues in the wild.
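One possible follow-up, sketched under the assumption that the original text_delta fragments are available: split at boundaries proportional to the original fragment lengths rather than fixed equal chunks. This alone does not guarantee grapheme-safe cuts, but it avoids forcing every event to an arbitrary size and always preserves the concatenation (the helper name and sample strings are invented):

```python
# Proportional redistribution: each event's share of the unmasked text
# mirrors its share of the original masked text; the last event absorbs
# any rounding remainder, so joining the parts always reproduces the input.
def redistribute(unmasked, fragments):
    total = sum(len(f) for f in fragments) or 1
    out, pos = [], 0
    for i, frag in enumerate(fragments):
        if i == len(fragments) - 1:
            out.append(unmasked[pos:])  # last event takes the remainder
        else:
            size = round(len(unmasked) * len(frag) / total)
            out.append(unmasked[pos:pos + size])
            pos += size
    return out

parts = redistribute("Call 555-0100 now", ["Call <PHO", "NE_x> now"])
print("".join(parts))  # Call 555-0100 now
```
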
…fallback

- Fix SSE no-change early-exit to yield original chunks individually instead of a concatenated blob (preserves streaming behavior)
- Add self.pii_tokens fallback in _process_response_for_pii so the OpenAI/ModelResponse unmask path is consistent with the Anthropic path

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```python
def _ar_val(ar, key, default=None):
    return ar.get(key, default) if isinstance(ar, dict) else getattr(ar, key, default)
```
Helper function defined inside a hot method body
_ar_val is defined inside anonymize_text on every call. Since anonymize_text is called once per PII entity per message, this creates a new function object on each invocation. This is a minor performance concern in a hot path. Consider hoisting it to a module-level or class-level static helper to make it reusable and avoid the per-call overhead.
…stripping

LLMs interpret <TYPE> as XML/HTML markup and strip it, leaving orphaned UUID fragments like `_8c971c39-ac2` in responses. Moving the UUID suffix inside the angle brackets (`<TYPE_uuid>`) makes the whole token opaque. Also adds fallback unmask layers for robustness:

1. Stripped angle brackets (PERSON_abc instead of <PERSON_abc>)
2. UUID-suffix-only match (_8c971c39-ac2)
3. Truncated token at text end (existing, restructured)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
```python
_orig_val = None
for _ar_i, _ar in enumerate(_sorted_ar):
    if _ar_i not in _ar_used and _ar_val(_ar, "entity_type") == item.get("entity_type"):
        _s = _ar_val(_ar, "start")
        _e = _ar_val(_ar, "end")
        if _s is not None and _e is not None:
            _orig_val = text[_s:_e]
            _ar_used.add(_ar_i)
            break
self.pii_tokens[replacement] = _orig_val if _orig_val is not None else new_text[start:end]
```
Fallback value stores Presidio token, not original text
When no matching analyze_results entry is found for an entity (i.e. _orig_val is None), the fallback new_text[start:end] is used. At this point in execution, new_text is initialised to redacted_text["text"] — the anonymized text. The slice new_text[start:end] therefore extracts Presidio's own replacement token (e.g. <PHONE_NUMBER>) rather than the original PII value.
Storing the Presidio token as the "original value" means that when _unmask_pii_text runs, it replaces <PHONE_NUMBER_abc123> with <PHONE_NUMBER> instead of the actual phone number. The entity remains masked in the response, silently defeating the purpose of output_parse_pii=True.
The correct fallback is to slice the original text argument. Since _orig_val is only None when entity-type matching against analyze_results fails (which Presidio should not normally produce), the fix is:
```python
self.pii_tokens[replacement] = _orig_val if _orig_val is not None else text[_ar_val(_sorted_ar[0], "start"):_ar_val(_sorted_ar[0], "end")]
```

Or, more defensively, just log a warning and skip storing the token rather than storing a useless placeholder:

```python
if _orig_val is not None:
    self.pii_tokens[replacement] = _orig_val
else:
    verbose_proxy_logger.warning(
        "anonymize_text: could not find original value for entity_type=%s — token will not be unmasked",
        item.get("entity_type"),
    )
```

```python
# FALLBACK 2: LLM stripped type prefix, only UUID suffix remains
# e.g. <COMPANY_NAME_8c971c39-ac2> → _8c971c39-ac2
_uuid_idx = stripped.rfind("_")
if _uuid_idx > 0:
    uuid_suffix = stripped[_uuid_idx:]  # e.g. "_8c971c39-ac2"
    if len(uuid_suffix) >= 10 and uuid_suffix in text:
        text = text.replace(uuid_suffix, original_text)
        continue
```
FALLBACK 2 UUID suffix may match unrelated text segments
uuid_suffix = stripped[_uuid_idx:] produces strings like _550e8400-e29 (underscore + 12 UUID chars). The check len(uuid_suffix) >= 10 is trivially satisfied (always 13 chars) and does not guard against false positives.
If the LLM response contains tool-call output, file paths, or any other content that embeds a UUID-like string (e.g. a response from list_files returning reports_550e8400-e29.csv), text.replace(uuid_suffix, original_text) will substitute PII data into that unrelated segment, corrupting the response. Because str.replace is global, every occurrence of the suffix is replaced, so a single coincidence poisons all matches.
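A runnable sketch of the false-positive risk and a word-boundary guard (the suffix, filename, and phone value below are made-up examples):

```python
import re

uuid_suffix = "_550e8400-e29"
original_text = "555-0100"
text = "File reports_550e8400-e29.csv kept; name _550e8400-e29 restored."

# Global bare replace corrupts the filename as well as the token.
naive = text.replace(uuid_suffix, original_text)

# Anchored pattern: the suffix must stand alone, not be embedded in a
# longer identifier (preceding/following alphanumerics block the match).
pattern = r"(?<![A-Za-z0-9])" + re.escape(uuid_suffix) + r"(?![A-Za-z0-9>])"
guarded = re.sub(pattern, original_text, text)

print(naive)    # both occurrences replaced — filename corrupted
print(guarded)  # File reports_550e8400-e29.csv kept; name 555-0100 restored.
```
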
A more targeted guard is to anchor the search to the same bracketed/prefixed context the LLM is likely to have echoed, e.g. by checking that at least the entity-type prefix precedes the suffix, or by requiring the suffix appear as a whole token (bounded by non-alphanumeric characters) rather than as a bare substring:

```python
import re
if re.search(r'(?<![A-Za-z0-9])' + re.escape(uuid_suffix) + r'(?![A-Za-z0-9>])', text):
    text = re.sub(r'(?<![A-Za-z0-9])' + re.escape(uuid_suffix) + r'(?![A-Za-z0-9>])', original_text, text)
    continue
```

```python
if is_anthropic_native:
    # --- Anthropic native SSE unmask path ---
    try:
        # 1. Join all bytes into complete SSE stream
        raw = b"".join(
            c if isinstance(c, bytes) else str(c).encode()
            for c in all_chunks
        )
        sse_text = raw.decode("utf-8", errors="replace")

        # 2. Parse SSE events (double-newline separated)
        events = sse_text.split("\n\n")
        parsed = []
        text_event_indices = []
        text_fragments = []

        for i, event_str in enumerate(events):
            lines = event_str.split("\n")
            entry = {"lines": lines, "data": None, "data_idx": None}
            for j, line in enumerate(lines):
                if line.startswith("data: "):
                    try:
                        data = json.loads(line[6:])
                        entry["data"] = data
                        entry["data_idx"] = j
                        if (data.get("type") == "content_block_delta"
                                and data.get("delta", {}).get("type") == "text_delta"):
                            text_event_indices.append(i)
                            text_fragments.append(data["delta"].get("text", ""))
                    except (json.JSONDecodeError, ValueError):
                        pass
            parsed.append(entry)
```
Anthropic-specific SSE format hardcoded outside llms/ directory
The new Anthropic native SSE unmask path directly embeds Anthropic's streaming event vocabulary — content_block_delta, text_delta, type: "message" — inside a generic guardrail file under proxy/guardrails/guardrail_hooks/. This violates the project policy of keeping provider-specific logic in llms/ and risks becoming a long-term maintenance burden: any future changes to the Anthropic SSE wire format (e.g. new delta types like thinking_delta) would silently break unmasking without this file being in scope.
The canonical approach is to move the Anthropic-specific SSE parsing and reconstruction into a helper in litellm/llms/anthropic/ (or a shared pass-through utility already used by the Anthropic pass-through handler), and call that helper from the guardrail. The guardrail should only receive and return provider-agnostic content fragments.
Rule Used: What: Avoid writing provider-specific code outside... (source)
Problem
Presidio PII masking works on the input path, but original values are never restored in responses. This affects both streaming and non-streaming Anthropic native API calls (/v1/messages).

Root causes fixed:

1. Position bug in anonymize_text — PII token originals were read from new_text[start:end], where start/end come from redacted_text["items"] (anonymized coordinate space). After the first replacement, new_text shifts and subsequent slices extract wrong characters → garbled unmasked output.
2. Lost tokens across instances — LiteLLM creates two _OPTIONAL_PresidioPIIMasking instances per request (pre_call + post_call). self.pii_tokens on the masking instance is invisible to the unmasking instance.
3. apply_guardrail always masks — there is no branch for unmasking when called with input_type="response".
4. Native dict responses not handled — async_post_call_success_hook only handled ModelResponse. Anthropic native responses are AnthropicMessagesResponse TypedDicts (type:"message").
5. Native SSE bytes not handled — async_post_call_streaming_iterator_hook only handled ModelResponseStream. Anthropic native streaming arrives as raw bytes.

Changes (all in presidio.py):

- anonymize_text: fix position bug using analyze_results coordinates (original text), not redacted_text["items"] coordinates (anonymized text); store the mapping in self.pii_tokens; remove the now-unnecessary request_data parameter
- apply_guardrail: copy self.pii_tokens → request_data["pii_tokens"] after masking; strip OpenAI-format tool keys; add an early-return unmask path for input_type="response"
- async_post_call_success_hook: read pii_tokens from request_data first; add an elif branch for Anthropic native dicts (type:"message")
- async_post_call_streaming_iterator_hook: buffer all chunks; detect bytes vs ModelResponseStream; parse/unmask/rebuild Anthropic native SSE in place, preserving event structure; existing OpenAI path unchanged

Test plan:

- [ ] Non-streaming Anthropic native (/v1/messages): PII masked in request, restored in dict response
- [ ] Streaming OpenAI format: PII restored in ModelResponseStream chunks
- [ ] Streaming Anthropic native (/v1/messages with stream:true): PII masked in request, restored in SSE bytes