fix(pii): label detections by their source (ner vs pattern)
Pattern-matcher hits were stored and masked with an "ner:" prefix even
though no NER (Named Entity Recognition) model was involved, because the
redactor hard-coded the prefix for every detector. Thread a Source through
NERConfig (SourceNER / SourcePattern; empty defaults to ner for
back-compat) and build the synthetic id from it via NERConfig.patternID.
Pattern detections now carry pattern:<GROUP> ids and [REDACTED:pattern:<GROUP>]
masks; NER detections stay ner:<GROUP>. The resolver tags each detector with
its source, and the doc strings / swagger / api-instructions examples are
updated to match.
Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
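The id scheme described in the commit message can be sketched as follows. This is a minimal illustration, not the repository's code: the names `Source`, `SourceNER`, `SourcePattern`, `NERConfig`, and `patternID` come from the message above, while the string values of the constants, the struct layout, and the `mask` helper are assumptions made for the example.

```go
package main

import "fmt"

// Source names the kind of detector that produced a hit.
// The constant names follow the commit message; their string
// values are assumed from the "ner:" / "pattern:" id prefixes.
type Source string

const (
	SourceNER     Source = "ner"
	SourcePattern Source = "pattern"
)

// NERConfig is a hypothetical, cut-down stand-in for the real config type.
type NERConfig struct {
	Source Source
}

// patternID builds the synthetic "<source>:<GROUP>" id.
// An empty Source falls back to "ner" for backward compatibility.
func (c NERConfig) patternID(group string) string {
	src := c.Source
	if src == "" {
		src = SourceNER
	}
	return string(src) + ":" + group
}

// mask is a hypothetical helper that renders the redaction placeholder.
func (c NERConfig) mask(group string) string {
	return "[REDACTED:" + c.patternID(group) + "]"
}

func main() {
	pattern := NERConfig{Source: SourcePattern}
	legacy := NERConfig{} // no Source set, so it is treated as ner

	fmt.Println(pattern.mask("ANTHROPIC_KEY"))
	fmt.Println(legacy.patternID("EMAIL"))
}
```

Deriving the mask from the same function that builds the id keeps the stored id and the visible placeholder from drifting apart, which is the failure mode the hard-coded prefix caused.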
core/http/endpoints/localai/api_instructions.go (1 addition, 1 deletion)
@@ -102,7 +102,7 @@ var instructionDefs = []instructionDef{
Name: "pii-filtering",
Description: "Inspect the NER-based PII filter applied to chat requests",
Tags: []string{"pii"},
-
Intro: "PII redaction is NER-based and request-side. A consuming model opts in with `pii: { enabled: true, detectors: [<model>] }` where each detector is a token-classification (token_classify) model. The detection policy lives on the detector model itself in a `pii_detection:` block: `{ min_score, default_action (mask|block|allow), entity_actions: { GROUP: action } }`. Multiple detectors union their hits; overlapping spans resolve to the strongest action (block > mask > allow). PII defaults OFF for non-proxy backends and ON for proxy-* (cloud passthroughs). GET /api/pii/events returns recent redaction events filtered by correlation_id / user_id / pattern_id (events carry `ner:<GROUP>` ids and an 8-char hash prefix, never the matched value; admin or local-user only). The legacy regex pattern tier and its endpoints (/api/pii/patterns, /test, /decide) were removed.",
+
Intro: "PII redaction is NER-based and request-side. A consuming model opts in with `pii: { enabled: true, detectors: [<model>] }` where each detector is a token-classification (token_classify) model. The detection policy lives on the detector model itself in a `pii_detection:` block: `{ min_score, default_action (mask|block|allow), entity_actions: { GROUP: action } }`. Multiple detectors union their hits; overlapping spans resolve to the strongest action (block > mask > allow). PII defaults OFF for non-proxy backends and ON for proxy-* (cloud passthroughs). GET /api/pii/events returns recent redaction events filtered by correlation_id / user_id / pattern_id (events carry `<source>:<GROUP>` ids — e.g. `ner:EMAIL` for the neural detector, `pattern:ANTHROPIC_KEY` for the regex pattern tier — and an 8-char hash prefix, never the matched value; admin or local-user only). The legacy regex pattern tier and its endpoints (/api/pii/patterns, /test, /decide) were removed.",
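The Intro string states that overlapping spans resolve to the strongest action (block > mask > allow). A minimal sketch of that ordering is shown below; the `Action` type, the `rank` table, and the `strongest` function are hypothetical names for illustration, and the fallback to allow when no action is proposed is an assumption rather than documented behaviour.

```go
package main

import "fmt"

// Action is one of the three policy outcomes named in the Intro text.
type Action string

const (
	Allow Action = "allow"
	Mask  Action = "mask"
	Block Action = "block"
)

// rank encodes the ordering block > mask > allow.
var rank = map[Action]int{Allow: 0, Mask: 1, Block: 2}

// strongest returns the highest-ranked action among those proposed
// for one overlapping span. With no input it returns Allow (assumed).
func strongest(actions ...Action) Action {
	best := Allow
	for _, a := range actions {
		if rank[a] > rank[best] {
			best = a
		}
	}
	return best
}

func main() {
	// Two detectors disagree on the same span: mask wins over allow.
	fmt.Println(strongest(Mask, Allow))
	// Any block among the hits wins outright.
	fmt.Println(strongest(Allow, Block, Mask))
}
```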