feat(middleware): Use NER models instead of Regex for PII filtering #10160

Draft
richiejp wants to merge 31 commits into mudler:master from richiejp:feat/pii-ner-tier

richiejp (Collaborator) commented Jun 3, 2026

Description

This builds on top of the request routing fixes PR.

It replaces the regex patterns for PII with an NER model. This allows information such as names, birth dates and addresses to be redacted or blocked, which is impossible with regex. We could potentially still offer regex (or some other non-neural detector) via a backend as well.

We lose PII filtering on responses, however, because running NER on a streamed response is harder than running regex. I think we could add it back if needed.

Notes for Reviewers

  • feat(pii): inbound encoder/NER detection tier
  • build(nix): add C++ gRPC to the dev shell for the llama-cpp backend
  • feat(llama-cpp): TokenClassify RPC for openai-privacy-filter NER
  • feat(config): add token_classify known_usecase for the PII NER tier
  • feat(gallery): add privacy-filter-multilingual token-classify model
  • refactor(pii): NER-centric PII filter; remove the regex tier
  • docs(pii): gallery pii_detection policy + NER-centric docs
  • feat(ui): NER-centric PII editor; drop the regex pattern UI

Signed commits

  • Yes, I signed my commits.

richiejp added 30 commits June 7, 2026 08:53
Conversation trimming runs through the classifier model's chat template
and trims by exact token count, sized to the model's n_batch which is
now scaled to context so long probes can't crash the backend. Missing
chat_message templates are a hard error at router build time. Router-
facing factories (Embedder/Scorer/Reranker/TokenCounter) re-resolve
ModelConfig per call so a model installed post-startup doesn't bind a
stub Backend="" config and silently fall into the loader's auto-
iterate path.

New 'vector_store' backend trace recorded inside localVectorStore on
every Search/Insert — including the backend-load-failure path that
previously vanished into an xlog.Warn — with outcome tagging
(hit/miss/empty_store/backend_load_error/find_error/insert_error/ok).
Companion cleanup drops misleading similarity:0 and input_tokens_count:0
from non-hit and text-mode traces.

Gallery local-store-development aliases to 'local-store' so the master
image satisfies pkg/model.LocalStoreBackend lookups from the embedding
cache.

Misc: llama-cpp TokenizeString reads the correct 'prompt' JSON key
(the original bug); ModelTokenize nil-guard; non-fatal mitm proxy
startup; PII 'route_local' renamed to 'allow' with docs/UI in sync;
model-editor footer no longer eats the edit area on small screens;
several config-editor template/dropdown/section fixes.

Tests: e2e router specs (casual/code-hint + long-conversation trim),
vector_store trace specs, lazy-factory specs, gallery dev-alias
resolution, Playwright trace badge + scroll regression.

Assisted-by: Claude:claude-opus-4-7 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
…dels

Embedding and rerank models pool over the whole input in a single physical batch (n_ubatch). With batch left at the 512 default, the backend rejects longer inputs with "input is too large to process", silently capping a large-context embedder (e.g. 8k/32k) at 512 tokens. Size n_batch to the context for these single-pass usecases, mirroring the existing FLAG_SCORE behaviour; an explicit batch: still wins.

Extracts EffectiveContextSize/EffectiveBatchSize from grpcModelOpts so the effective decode window has one home for other callers to reuse.

Adds an e2e-aio regression test that embeds a >512-token input. The AIO embedding model is switched to nomic-embed-text-v1.5 (2048 context) because the previous granite model was capped at 512 tokens and could not exercise the larger batch.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Signed-off-by: Richard Palethorpe <io@richiejp.com>
Scoring decodes the whole prompt+candidate in a single llama_decode and
reads one logit row per candidate token. The vendored llama.cpp server
caps causal output rows at n_parallel, so the default of 1 aborts with
GGML_ASSERT(n_outputs_max <= cparams.n_outputs_max) on multi-token route
labels. Set options: [parallel:64] on both arch-router quant entries to
lift the cap; kv_unified (the grpc-server default) keeps the full context
per sequence, so this does not split the KV cache.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Layers an optional token-classification (NER) tier on top of the regex
PII redactor for the inbound chat/messages middleware. When a model's
pii.ner.model is set, RequestMiddleware (via the new pii.WithNERResolver
option) resolves a detector over the shared model loader and runs
RedactWithNER (regex + NER merged); without it, redaction stays
regex-only, so existing four-arg call sites and regex-only stubs are
unaffected.

- core/config: PIINERConfig (model, min_score, default_action,
  entity_actions) under PIIConfig; default action is mask (safe-by-
  default for a PII filter). Registry entries + grandfather for the
  map field so the field-registry coverage test stays green.
- core/application: PIINERResolver binds a token-classifier
  (piidetector) over the model loader, lazily — the model loads on
  first Detect; unknown/unconfigured names resolve to nil.
- core/services/routing/pii: the middleware fails CLOSED on the NER
  tier — if a model has pii.ner.model set but the tier cannot run
  (detector errors at request time, or the model can't be resolved to
  a detector), the request is blocked with 503 pii_ner_unavailable and
  a fail-closed audit event, rather than silently downgrading to
  regex-only. The redactor's RedactWithNER stays fail-open (returns a
  best-effort regex result + error); the block policy lives in the
  middleware.
- core/services/routing/piidetector: detector backing the NER tier.
- core/backend: TokenClassify backend call (gRPC TokenClassifyRequest/
  Response) + tests.
- backend/python/transformers: TokenClassify now emits UTF-8 byte
  offsets (proto contract) instead of HF codepoint offsets, and returns
  the exact text slice. Fixes wrong spans on multibyte/multilingual
  input.
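
The codepoint-versus-byte offset distinction in the last item can be illustrated with a minimal Go sketch. The helper name `runeToByteSpan` is hypothetical and is not taken from the PR; it only shows why a span expressed in code points lands on the wrong bytes once the text contains multibyte characters.

```go
package main

import "fmt"

// runeToByteSpan converts a [start, end) span expressed in Unicode code
// points into the equivalent [start, end) span in UTF-8 bytes.
// Hypothetical helper for illustration only.
func runeToByteSpan(text string, runeStart, runeEnd int) (int, int) {
	byteStart, byteEnd := len(text), len(text)
	runeIdx := 0
	// Ranging over a string yields the byte index at which each rune starts.
	for byteIdx := range text {
		if runeIdx == runeStart {
			byteStart = byteIdx
		}
		if runeIdx == runeEnd {
			byteEnd = byteIdx
			break
		}
		runeIdx++
	}
	return byteStart, byteEnd
}

func main() {
	text := "Frau Müller wohnt in Köln"
	// "Müller" covers code points 5..11 but bytes 5..12, because ü is 2 bytes.
	start, end := runeToByteSpan(text, 5, 11)
	fmt.Println(start, end, text[start:end])
}
```

Slicing the Go string with the unconverted code point offsets (`text[5:11]`) would cut the final `r` off the name, which is the kind of wrong span the fix describes.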

llama.cpp privacy-filter arch (Phase 1, carry-patches under
backend/cpp/llama-cpp/patches, applied by prepare.sh; all five apply in
order and compile against the current pin 5dcb71166):
- 0001 TOKEN_CLS pooling substrate (reduced subset of upstream #19725).
- 0002 registers the openai-privacy-filter architecture + gguf-py
  arch/tensor mappings (score -> cls.output).
- 0003 HF->GGUF converter (OpenAIPrivacyFilterModel, a GptOssModel
  subclass); validated end-to-end against OpenMed/privacy-filter-
  multilingual (157-tensor F16 GGUF, metadata verified). Splits the
  expert gate_up as concatenated halves (not gpt-oss's interleaved
  ::2/1::2) and writes per-dim rope_freqs.weight carrying HF's exact
  YaRN inv_freq (truncate=false), since ggml's shared YaRN ramp
  floor/ceils the correction band.
- 0004 model graph + loader wiring (llama_model_openai_privacy_filter):
  gpt-oss MoE body as a bidirectional token classifier — no KV cache,
  uniform symmetric sliding-window band, attention sinks, no LM head;
  ends at the per-token hidden states so the framework's TOKEN_CLS
  pooling applies the cls.output head per token. Uses the interleaved
  (GPT-J) LLAMA_ROPE_TYPE_NORM layout — unlike gpt-oss's NEOX — and
  feeds the per-layer rope_freqs into ggml_rope_ext with ggml's YaRN
  ramp disabled (mscale kept via rope_attn_factor).
- 0005 no-cache all-SWA mask fix (llama-graph.cpp): an encoder whose
  every layer is SWA leaves the full (non-windowed) attention mask
  unallocated; set_input now only fills a mask that actually got a
  buffer, else the model aborts on the first decode.

Status: parity solved. The new arch matches the HF reference token-for-
token against OpenMed/privacy-filter-multilingual at F16 — 12/12 argmax,
full-logit cosine = 1.0, every layer's residual stream cos = 1.0
(relerr ~2e-4 = F16 rounding), including the e-mail BIOES span. Verified
on the real llama-embedding binary with model-default TOKEN_CLS pooling.
Root cause of the earlier attenuation was two independent RoPE bugs (NEOX
vs interleaved/NORM dim-pairing, dominant; plus ggml's YaRN truncate
rounding), both fixed in 0003/0004. The two parity-gated assumptions
(n_swa = 2*sliding_window and the gate_up packing) are confirmed correct.

Plans/integration notes under docs/plans/pii-ner-ggml.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
The devShell shipped `protobuf` (for Go proto generation) but no C++
gRPC, so `make grpc-server` in backend/cpp/llama-cpp could not locate
gRPC via find_package(gRPC) and fell back to a stale, version-skewed
grpc from the store (protobuf 34.1 headers vs a grpc built against
32.1), aborting on a protobuf gencode mismatch.

nixpkgs builds `grpc` against the same `protobuf`, so adding it gives a
self-consistent C++ stack. Docker (backend/Dockerfile.base-grpc-builder)
compiles gRPC v1.65.0 / protoc v27.1 from source; the nixpkgs pair here
(grpc 1.80 / protobuf 34) is newer but wire/ABI-consistent. Verified a
clean cmake build of grpc-server inside `nix develop` with no manual
flag overrides.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Implements the TokenClassify gRPC primitive in the vendored llama.cpp
backend, completing Phase 2 of the PII NER tier. Mirrors Score's
direct-decode strategy (bypassing the slot/task queue under the same
conflict_guard + mutex) because it needs full control over batch output
flags, per-token logit readout, and overlapping-window stitching.

Pipeline: tokenize (o200k) with UTF-8 byte offsets -> windowed
non-causal forward -> per-token n_cls_out logits via
llama_get_embeddings_ith -> fp32 log_softmax -> constrained linear-chain
BIOES Viterbi (the model's transition biases are 0.0, so structural
constraints only) -> span assembly -> whitespace-trimmed byte spans ->
TokenClassifyEntity{entity_group, start, end, score, text}.
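
As a rough illustration of the constrained BIOES Viterbi step, here is a minimal Go sketch. It is not the backend's C++ code: the function names, the tag strings and the example scores are assumptions, and transition scores are taken as zero so that only the structural BIOES constraints shape the path.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// allowed reports whether tag b may follow tag a under BIOES structure.
// An empty a means "start of sequence".
func allowed(a, b string) bool {
	inside := strings.HasPrefix(a, "B-") || strings.HasPrefix(a, "I-")
	if inside {
		// An open span must continue (I-) or close (E-) with the same group.
		return (strings.HasPrefix(b, "I-") || strings.HasPrefix(b, "E-")) && a[2:] == b[2:]
	}
	// Outside a span only O, B- or S- may follow.
	return b == "O" || strings.HasPrefix(b, "B-") || strings.HasPrefix(b, "S-")
}

// canEnd reports whether a sequence may legally end on tag t.
func canEnd(t string) bool {
	return !strings.HasPrefix(t, "B-") && !strings.HasPrefix(t, "I-")
}

// viterbi returns the best legal tag path for per-token log-probabilities
// logp[t][k], where column k corresponds to tags[k].
func viterbi(tags []string, logp [][]float64) []string {
	n, k := len(logp), len(tags)
	if n == 0 {
		return nil
	}
	negInf := math.Inf(-1)
	score := make([][]float64, n)
	back := make([][]int, n)
	for t := range score {
		score[t] = make([]float64, k)
		back[t] = make([]int, k)
	}
	for j := 0; j < k; j++ {
		score[0][j] = negInf
		if allowed("", tags[j]) {
			score[0][j] = logp[0][j]
		}
	}
	for t := 1; t < n; t++ {
		for j := 0; j < k; j++ {
			best, arg := negInf, 0
			for i := 0; i < k; i++ {
				if allowed(tags[i], tags[j]) && score[t-1][i] > best {
					best, arg = score[t-1][i], i
				}
			}
			score[t][j] = best + logp[t][j]
			back[t][j] = arg
		}
	}
	last, best := 0, negInf
	for j := 0; j < k; j++ {
		if canEnd(tags[j]) && score[n-1][j] > best {
			best, last = score[n-1][j], j
		}
	}
	path := make([]string, n)
	for t := n - 1; t >= 0; t-- {
		path[t] = tags[last]
		last = back[t][last]
	}
	return path
}

func main() {
	tags := []string{"O", "B-EMAIL", "I-EMAIL", "E-EMAIL", "S-EMAIL"}
	// Made-up scores: per-token argmax would give O, I-EMAIL, E-EMAIL,
	// which is illegal because I- cannot follow O.
	logp := [][]float64{
		{-0.1, -3, -5, -5, -4},
		{-3, -1.2, -0.5, -4, -4},
		{-3, -5, -4, -0.2, -3},
	}
	fmt.Println(viterbi(tags, logp))
}
```

With these scores the decoder repairs the middle token to `B-EMAIL`, so the span assembly step sees a well-formed B/E pair instead of a dangling I tag.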

Windowing uses a halo of n_layer*sliding_window: a symmetric +/-128 band
per layer compounds across the 8 layers, so a token's logits depend on
+/-1024 neighbours, not +/-128 (short inputs stay a single exact
forward). Requires a TOKEN_CLS-pooling model loaded with embeddings
enabled.
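
The halo arithmetic (8 layers x 128 = 1024 tokens on each side) can be sketched in Go as follows. The stitching scheme shown here, where each window keeps only the logits of its core and feeds a halo of context on both sides, is an assumption about how overlapping windows might be planned; `planWindows`, `capacity` and the `window` type are hypothetical names, not the backend's.

```go
package main

import "fmt"

// window describes one forward pass: the tokens fed to the model and the
// sub-range whose logits are kept when stitching results together.
type window struct {
	CtxStart, CtxEnd   int
	KeepStart, KeepEnd int
}

func minInt(a, b int) int {
	if a < b {
		return a
	}
	return b
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// planWindows splits nTokens into passes of at most capacity tokens, with
// halo tokens of extra context on each side of every kept core range.
func planWindows(nTokens, capacity, halo int) []window {
	if nTokens <= capacity {
		// Short inputs stay a single exact forward.
		return []window{{0, nTokens, 0, nTokens}}
	}
	core := capacity - 2*halo
	if core <= 0 {
		return nil // capacity too small to hold a halo on both sides
	}
	var out []window
	for keep := 0; keep < nTokens; keep += core {
		keepEnd := minInt(keep+core, nTokens)
		out = append(out, window{
			CtxStart:  maxInt(keep-halo, 0),
			CtxEnd:    minInt(keepEnd+halo, nTokens),
			KeepStart: keep,
			KeepEnd:   keepEnd,
		})
	}
	return out
}

func main() {
	halo := 8 * 128 // n_layer * sliding_window
	for _, w := range planWindows(5000, 4096, halo) {
		fmt.Printf("feed [%d,%d) keep [%d,%d)\n", w.CtxStart, w.CtxEnd, w.KeepStart, w.KeepEnd)
	}
}
```

Every kept token then has up to 1024 real neighbours on each side (fewer only at the edges of the input), which matches the receptive field the commit message describes.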

Validated end-to-end against OpenMed/privacy-filter-multilingual at F16:
correct entities across English/German with byte-exact multibyte offsets
(the 2-byte U+00FC in "Müller" is spanned correctly).

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Adds FLAG_TOKEN_CLASSIFY, mirroring FLAG_SCORE's explicit-opt-in pattern
for an internal direct-decode RPC: it's declared via
`known_usecases: [token_classify]`, is authoritative (HasUsecases won't
paint chat/embeddings on top via the heuristic), and has no guessing
heuristic. On llama-cpp Validate() rejects combining it with
chat/completion (TokenClassify bypasses the slot loop and races
generation) but allows embeddings, which TOKEN_CLS pooling requires.

Like FLAG_SCORE this is intentionally not registered in
backend_capabilities.go's UsecaseInfoMap: it has no public REST route
(the PII redactor's NER tier calls TokenClassify directly), so it stays
a known_usecases flag only.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Gallery entry for the OpenMed/privacy-filter-multilingual PII NER model,
converted to GGUF (F16) for the vendored llama.cpp backend. Sets
backend: llama-cpp, embeddings: true, and known_usecases: [token_classify]
so it loads under TOKEN_CLS pooling and is consumed via a model config's
pii.ner.model seam (not a standalone chat/completion model).

The uri points at localai-org/privacy-filter-multilingual-GGUF; the sha256
is the F16 artifact's real hash. The model runs only on a llama.cpp build
carrying the openai-privacy-filter carry-patches in
backend/cpp/llama-cpp/patches/ (the arch is not yet upstream).

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Invert the PII filtering model so detection policy lives on the NER
(token-classification) detector model itself, and consuming models just
reference detectors by name.

Schema (core/config/model_config.go):
  - New top-level `pii_detection:` block on a detector model
    (min_score, default_action, entity_actions) + accessors.
  - Consumer `pii:` is now `{ enabled, detectors: [<model>...] }`.
  - `pii.patterns` / `pii.ner` kept only as untyped deprecated shadows so
    old YAMLs still parse; Validate() warns (does not fail).

Middleware (core/services/routing/pii):
  - Redactor is now a stateless handle; RedactNER(ctx, text, []NERConfig)
    runs every detector, unions hits, and overlap-merges (block>mask>allow).
  - NERDetectorResolver returns (NERConfig, bool); the resolver reads each
    detector model's pii_detection policy (NERConfigFromRaw).
  - RequestMiddleware is NER-only, multi-detector, fail-closed on a
    detector that can't be resolved or errors.
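
A minimal Go sketch of the overlap-merge idea, assuming the block>mask>allow precedence stated above. The `Span` and `Action` types and the choice to take the union of overlapping ranges are illustrative assumptions; the PR's actual merge may differ in detail.

```go
package main

import (
	"fmt"
	"sort"
)

// Action is ordered so that a larger value is the stricter policy.
type Action int

const (
	Allow Action = iota
	Mask
	Block
)

// Span is a detected byte range with the action its detector policy chose.
type Span struct {
	Start, End int
	Group      string
	Action     Action
}

// mergeSpans unions hits from all detectors and collapses overlapping
// ranges, keeping the strictest action (and its group) for the merged range.
func mergeSpans(spans []Span) []Span {
	sorted := append([]Span(nil), spans...)
	sort.Slice(sorted, func(i, j int) bool {
		if sorted[i].Start != sorted[j].Start {
			return sorted[i].Start < sorted[j].Start
		}
		return sorted[i].End > sorted[j].End
	})
	var out []Span
	for _, s := range sorted {
		if n := len(out); n > 0 && s.Start < out[n-1].End {
			last := &out[n-1]
			if s.End > last.End {
				last.End = s.End
			}
			if s.Action > last.Action {
				last.Action, last.Group = s.Action, s.Group
			}
			continue
		}
		out = append(out, s)
	}
	return out
}

func main() {
	hits := []Span{
		{0, 10, "PERSON", Mask},
		{5, 15, "SSN", Block},
		{20, 25, "EMAIL", Allow},
	}
	fmt.Println(mergeSpans(hits))
}
```

Here the PERSON and SSN hits overlap, so they collapse into one range that inherits the stricter block action, while the disjoint EMAIL hit is left alone.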

Regex tier fully removed: patterns.go, config.go (LoadConfig/--pii-config),
the response-side StreamFilter, the /api/pii/{patterns,test,decide,persist}
admin routes, the MCP list/test/set/persist pattern tools, and the dead
--pii-config/--disable-pii AppOptions + runtime_settings overrides. Output/
streaming redaction is dropped for now (NER is request-side only).

Cloud-proxy/MITM now runs NER on the request input (mitm/handler.go gets a
per-host []NERConfig resolved at listener start), fail-closed; the response
is forwarded unmodified.

Capability metadata: pii.detectors (model-multi-select filtered to
token_classify) + pii_detection.* registry entries; config-metadata
autocomplete gains a token_classify case; API instructions rewritten.

UI (React) is a follow-up; this is the Go side.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
- gallery/index.yaml: privacy-filter-multilingual ships a default
  pii_detection policy (validated against the GGUF's real label set —
  mask everything; block PASSWORD/PIN/CVV/CREDITCARD/IBAN/BIC/BANKACCOUNT/
  SSN/{BITCOIN,ETHEREUM,LITECOIN}ADDRESS).
- docs/advanced/model-configuration.md: new "PII filtering" section
  (pii_detection on detector models + pii.detectors on consumers).
- docs/features/middleware.md: rewrote the PII section for the NER-only
  model; dropped the removed regex pattern catalogue / endpoints / MCP
  pattern tools / streaming filter.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
React UI side of the PII redesign.

- New EntityActionListEditor (entity group → mask|block|allow map editor
  for a detector model's pii_detection.entity_actions, with a datalist of
  common categories) and ModelMultiSelect (capability-filtered detector
  picker for a consumer's pii.detectors).
- ConfigFieldRenderer: dispatch `entity-action-list` + `model-multi-select`;
  map `models:token_classify` → FLAG_TOKEN_CLASSIFY; drop `pii-pattern-list`.
- capabilities.js: CAP_TOKEN_CLASSIFY.
- Middleware page: remove the pattern catalogue, the per-pattern action
  editor, and the "Save to disk" persist flow; the per-model table now
  shows the NER detectors each config references. Removed the dead
  pattern-mutation state/handlers.
- modelTemplates: MITM template seeds pii.detectors instead of pii.patterns.
- Deleted PIIPatternListEditor.
- e2e/middleware-page.spec: fixture + tests updated for the detector model;
  removed the PUT /api/pii/patterns test.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
The regex tier and per-consumer NER policy were already removed; the
pii.patterns / pii.ner keys lingered as untyped shadow fields on
PIIConfig only so old YAMLs would parse. But the config-metadata
endpoint reflects over the struct's yaml tags, so those shadows kept
rendering as editable fields in the model-config UI.

Remove them entirely. YAML loading is non-strict (yaml.Unmarshal, no
KnownFields), so a config still carrying these keys has them silently
ignored on load rather than erroring — the keys just stop filtering.
Per the short lifetime of the keys, the warn-on-load migration notice
is dropped along with the fields.

Also drops PIIDeprecatedKeysSet(), the Validate() warning, the now-unused
xlog import, the grandfathered coverage-test entries, and tidies stale
"pattern catalogue" wording on the Middleware page.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Surface token-classification / NER models as a first-class usecase in
the install-models gallery so users can find PII detector models:

- Add the UsecaseTokenClassify ("token_classify") constant and a
  UsecaseInfoMap entry mapping it to the TokenClassify gRPC RPC; add
  MethodTokenClassify. Declared-only (GuessUsecases never paints it),
  matching the FLAG_SCORE precedent — so ordinary llama-cpp models are
  not auto-tagged.
- List token_classify in llama-cpp's PossibleUsecases so the new filter
  stays enabled when llama-cpp is the selected backend (the privacy
  filter runs on the patched llama.cpp TokenClassify path).
- Wire it into usecaseFilters (gallery capability filter) and add a
  "NER" filter chip to the Models page + en locale string.
- e2e: include token_classify in the llama-cpp backend-usecases mock and
  assert the NER chip stays enabled for llama-cpp.
- gallery: give privacy-filter-multilingual the OpenMed org icon, and
  refresh its description to the current pii.detectors / pii_detection
  model (the old text still referenced pii.ner.model + regex patterns).

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…t description

The GGUF is published at huggingface.co/LocalAI-io/privacy-filter-multilingual-GGUF
(not localai-org). Fix the url and the file uri so the gallery install
resolves (verified: 200, 2.82 GB).

Also convert the entry description from Markdown to plain text — the gallery
UI renders descriptions as plain text, so the headings/bold/links/backticks
were showing as literal syntax.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
token_classify (NER) models are single-pass encoders: the privacy-filter
detector decodes each window in one forward and pools per-token outputs.
But EffectiveBatchSize only sized the batch to the context for score,
embeddings and rerank — so a token_classify model loaded at the 512
default.

The model sets embeddings:true yet declares known_usecases:[token_classify],
and that declaration is authoritative: it suppresses the embeddings usecase
guess, so HasUsecases(FLAG_EMBEDDINGS) is false. With n_batch left at 512
(and n_outputs_max defaulting to n_batch), the encoder's exact-pass window
shrinks to 512 and longer inputs trip
GGML_ASSERT(n_outputs_max <= cparams.n_outputs_max).

Add FLAG_TOKEN_CLASSIFY to the single-pass set, plus an embeddings-flag
catch-all so any pooled encoder is sized to the context regardless of how
its usecases resolved. Explicit batch: still wins. Covered by two new
options tests mirroring the shipped gallery config (with and without an
explicit context_size).
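
The sizing rule described in this commit can be sketched in Go roughly as below. The flag constants, the `Config` struct and the function signature are stand-ins chosen for illustration and do not mirror the real `EffectiveBatchSize`; only the 512 default and the precedence (explicit batch, then single-pass usecases or the embeddings flag, then the default) come from the text.

```go
package main

import "fmt"

// Usecase is a bitmask of declared model usecases (illustrative stand-ins
// for FLAG_CHAT, FLAG_EMBEDDINGS, FLAG_RERANK, FLAG_SCORE, FLAG_TOKEN_CLASSIFY).
type Usecase uint

const (
	FlagChat Usecase = 1 << iota
	FlagEmbeddings
	FlagRerank
	FlagScore
	FlagTokenClassify
)

const defaultBatch = 512

// Config holds only the fields this sketch needs.
type Config struct {
	Usecases   Usecase
	Embeddings bool // the embeddings: true option
	Batch      int  // explicit batch:, 0 when unset
}

// effectiveBatchSize returns the batch size for a model given its
// already-resolved context size.
func effectiveBatchSize(c Config, contextSize int) int {
	if c.Batch > 0 {
		return c.Batch // an explicit batch: always wins
	}
	singlePass := FlagEmbeddings | FlagRerank | FlagScore | FlagTokenClassify
	// The embeddings flag is a catch-all for pooled encoders whose declared
	// usecases suppressed the embeddings usecase guess.
	if c.Usecases&singlePass != 0 || c.Embeddings {
		return contextSize
	}
	return defaultBatch
}

func main() {
	detector := Config{Usecases: FlagTokenClassify, Embeddings: true}
	fmt.Println(effectiveBatchSize(detector, 8192))
	fmt.Println(effectiveBatchSize(Config{Usecases: FlagChat}, 8192))
}
```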

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
The GGUF importer appends FLAG_CHAT (plus jinja templating) to any model
that loads without an explicit chat template. For a model the operator
reserved for an internal direct-decode primitive — the router score
classifier or the PII NER token_classify tier — the next
syncKnownUsecasesFromString folds that chat flag into KnownUsecases,
which Validate() then rejects as a known_usecases conflict on llama-cpp.
The config is silently skipped at load, so the model disappears from
/v1/models, the system and middleware pages, and the PII detector picker.

Score models escaped this only because their gallery configs carry a
chatml template, so the importer returns early. Templateless detectors
like privacy-filter-multilingual hit it on every load. Guard the chat
defaults behind reservedNonChatModel() so a declared score/token_classify
model keeps its declaration intact.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
buildPIIStatus walked every model config, so VAD, STT, embedding-only and
image models — and the token_classify detector models themselves — showed
up on the Middleware Filtering page as if PII filtering applied to them. It
can't: request-side PII filtering attaches to a text-accepting endpoint
(chat today, plus the cloud-proxy/MITM path), not to arbitrary models.

Add ModelConfig.PIIFilterApplies(), backed by a single piiCoverableUsecases
source of truth, and skip non-applicable configs in buildPIIStatus. Detector
and score models fall out naturally: HasUsecases short-circuits to false for
any usecase a declared token_classify/score model did not itself declare.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
A consuming model only got NER redaction if it explicitly listed
pii.detectors. cloud-proxy models default PIIIsEnabled()==true but carry no
detectors, so the middleware enabled PII and then scanned with nothing —
cloud-proxy/MITM redaction was a no-op out of the box.

Add two instance-wide settings (RuntimeSettings, persisted and round-tripped
through POST /api/settings):

  - PIIDefaultDetectors: token_classify detector models applied to any
    PII-enabled model that names none of its own.
  - PIIDefaultUsecases: model usecases (e.g. FLAG_CHAT) that get PII on by
    default even without a per-model pii.enabled.

Application.ResolvePIIPolicy is the single decision point layering these over
the per-model config (explicit pii.enabled always wins, true or false). The
chat middleware consumes it via the new pii.WithPolicyResolver option (wired
on the OpenAI and Anthropic routes); the MITM listener resolves through it too
so cloud-proxy hosts inherit the global default detector.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Surface and edit the global PII defaults from the Middleware → Filtering
page. buildPIIStatus now resolves each model through ResolvePIIPolicy, so the
per-model table shows the EFFECTIVE state (global default detector and
default-on usecases applied), labels the source (backend default vs usecase
default vs YAML), and flags detectors inherited from the global default. The
pii status section also returns the current default_detectors,
default_usecases, and the coverable_usecases the selector offers.

The Filtering tab gains a "Default PII policy" editor: a token_classify-
filtered detector multi-select and a checkbox per coverable usecase, saved via
POST /api/settings. config.PIICoverableUsecaseStrings() is the single source of
truth for which usecases are offered (and grows automatically with coverage).

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
… & Ollama

PII filtering attaches to a request shape, not a model, so each text endpoint
needs its own adapter. Add:

  - OpenAICompletion(): scans Prompt / Input / Instruction on *OpenAIRequest
    (string and []any string elements; token-id arrays and other non-strings
    are skipped) — wired on /v1/completions, /v1/embeddings, /v1/edits.
  - OllamaChat / OllamaGenerate / OllamaEmbed: the message content, prompt +
    system, and input/prompt text on /api/chat, /api/generate, /api/embed.
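
The "string and []any string elements" handling can be pictured with a small Go sketch. `redactField` is a hypothetical helper, not the adapter's real code, and the redactor is passed in as a plain function so the example stays self-contained.

```go
package main

import (
	"fmt"
	"strings"
)

// redactField applies redact to a request field that may be a string or a
// []any of mixed elements. Strings are redacted; token-id arrays and other
// non-string values pass through untouched.
func redactField(v any, redact func(string) string) any {
	switch t := v.(type) {
	case string:
		return redact(t)
	case []any:
		out := make([]any, len(t))
		for i, el := range t {
			if s, ok := el.(string); ok {
				out[i] = redact(s)
			} else {
				out[i] = el
			}
		}
		return out
	default:
		return v
	}
}

func main() {
	// Stand-in redactor; the real one would run the configured detectors.
	redact := func(s string) string {
		return strings.ReplaceAll(s, "Alice", "[REDACTED:ner:PERSON]")
	}
	fmt.Println(redactField("Hello Alice", redact))
	fmt.Println(redactField([]any{"Alice again", float64(1234)}, redact))
}
```

JSON numbers decode to `float64` in Go's `encoding/json` when the target is `any`, which is why the token-id element in the example is a `float64`.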

All carry the same WithNERResolver + WithPolicyResolver wiring as chat, so the
per-model and instance-wide default policies apply uniformly. FLAG_COMPLETION,
FLAG_EDIT and FLAG_EMBEDDINGS join piiCoverableUsecases, which automatically
widens both the Middleware list filter and the default-on usecase selector.

Image / TTS / video / rerank and the realtime WebSocket remain documented
follow-ups (different prompt-PII semantics; realtime is not HTTP middleware).

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…overage

The Middleware PII section still described response/streaming-side redaction
(removed with the regex tier) and only the chat endpoint. Update it to match
the current request-side-only NER filter:

  - Correct the lifecycle diagram and prose: filtering is request-side only;
    the response is no longer touched.
  - List the endpoints that now have a PII adapter wired (chat, completions,
    embeddings, edits, and the Ollama chat/generate/embed routes) and the
    ones still unfiltered (image/audio/video/rerank/realtime).
  - New "Instance-wide defaults" section: the default detector(s) and
    default-on usecases set from the Middleware -> Filtering page (POST
    /api/settings), the ResolvePIIPolicy precedence, and the cloud-proxy
    no-op-without-a-detector gap they close.
  - Refresh the Admin-page paragraph: the Default PII policy editor, the
    list now scoped to PII-coverable models, and the effective-state /
    source / (default)-detector columns.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…ering page

A model can resolve PII-enabled (cloud-proxy backend default, or a default-on
usecase) yet end up with no detector — no pii.detectors of its own and no
instance-wide default detector set. The request middleware then passes through
unscanned, but the per-model table showed a plain green "on" badge, reading as
protected when nothing is actually filtered.

Add a "no-op" warning chip next to the enabled badge whenever a row is enabled
with an empty resolved detector list, with a tooltip pointing at the fix (set a
default detector, or add pii.detectors to the model). e2e asserts the chip
appears for the detector-less cloud-proxy model and not for one that lists a
detector.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…ayout

ModelMultiSelect reused SearchableModelSelect for its "add" row, but that input
commits onChange on every keystroke (single-value editors rely on it). The add
row treated each onChange as a final selection, so typing a detector name
appended one bogus entry per character. It also rendered every selected model as
its own full-width input row, wasting vertical space.

  - SearchableModelSelect: add a commitOnly prop. When set, onChange fires only
    on an explicit commit (selecting an item or pressing Enter), and the field
    clears after — never on a partial keystroke. Default false preserves the
    as-you-type behaviour the single-value callers depend on.
  - ModelMultiSelect: render selected models as compact removable chips and add
    via one commit-only picker, removing the keystroke spam and the stacked
    input boxes.

Both ModelMultiSelect callers (pii.detectors in the model editor and the
instance-wide default detector on the Middleware page) get the fix. e2e now
seeds a default detector and asserts the chip + its remove control render.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
… PII defaults

Two follow-ups on the Default PII policy editor:

  - The add picker was styled flex: '1 1 260px', but ModelMultiSelect lays its
    children out in a flex *column*, so that flex-basis set the picker
    wrapper's HEIGHT to 260px. The dropdown anchors to the wrapper bottom
    (top: 100%), so it opened ~225px below the input. Size by width only; the
    list now sits flush under the box (measured 2px).
  - The default-on usecase options used raw <input type="checkbox">. Swap them
    for the app's standard Toggle switch so they match the rest of the UI.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…y default-on

The instance-wide "enable PII by default for these model types" selector added
a second, confusing way to turn filtering on alongside the existing cloud-proxy
default. In practice the useful default is simply: cloud models (which leave the
instance) are on, everything else is per-model opt-in. Remove the usecase
mechanism and keep that.

  - Drop PIIDefaultUsecases (RuntimeSettings + ApplicationConfig + the
    ResolvePIIPolicy usecase loop) and PIICoverableUsecaseStrings. piiCoverableUsecases
    / PIIFilterApplies stay — they still scope the Middleware model list.
  - buildPIIStatus no longer emits default_usecases / coverable_usecases /
    default_for_usecase.
  - Middleware UI: the Default PII policy editor is now just the default
    detector picker; remove the usecase toggles, humanizeUsecase, and the
    "usecase default" row source.
  - Update the ResolvePIIPolicy tests, the e2e mock/assertions, and the docs.

Enablement precedence is now: explicit pii.enabled wins; otherwise the backend
default (cloud-proxy) decides. The instance-wide default detector is unchanged
and still what makes cloud-proxy/MITM redaction work out of the box.
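
The resulting precedence could be expressed in Go along these lines. This is a sketch under assumptions: the struct fields, the `Source` labels and the function name are illustrative and are not the real `Application.ResolvePIIPolicy` signature.

```go
package main

import "fmt"

// ModelPII holds the per-model inputs this sketch needs.
type ModelPII struct {
	Enabled   *bool    // nil when pii.enabled is absent from the YAML
	Detectors []string // pii.detectors
	Backend   string
}

// Policy is the effective result for one model.
type Policy struct {
	Enabled   bool
	Detectors []string
	Source    string
}

// resolvePolicy layers the instance-wide default detectors over the
// per-model config: explicit pii.enabled wins, otherwise the backend
// default (cloud-proxy) decides.
func resolvePolicy(m ModelPII, defaultDetectors []string) Policy {
	var p Policy
	switch {
	case m.Enabled != nil:
		p.Enabled, p.Source = *m.Enabled, "yaml"
	case m.Backend == "cloud-proxy":
		p.Enabled, p.Source = true, "backend default"
	}
	if !p.Enabled {
		return p
	}
	p.Detectors = m.Detectors
	if len(p.Detectors) == 0 {
		// A PII-enabled model that names no detectors inherits the default.
		p.Detectors = defaultDetectors
	}
	return p
}

func main() {
	defaults := []string{"privacy-filter-multilingual"}
	off := false
	fmt.Println(resolvePolicy(ModelPII{Backend: "cloud-proxy"}, defaults))
	fmt.Println(resolvePolicy(ModelPII{Backend: "cloud-proxy", Enabled: &off}, defaults))
	fmt.Println(resolvePolicy(ModelPII{Backend: "llama-cpp"}, defaults))
}
```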

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…ce, debug logs

There was no way to see what the NER detector actually produced, so a
false-positive block (e.g. a phone number scored as SSN) was opaque. Add three
views, all from data the TokenClassify gRPC already returns (no backend rebuild):

  - Backend trace: ModelTokenClassify now records a BackendTraceTokenClassify
    row (gated on tracing) with the input preview, threshold, and every entity's
    group, byte range, confidence and matched text. Wires up the long-standing
    TODO; shows in the Traces UI alongside the request it gated.
  - Confidence in the audit log: carry the detector score through
    rawHit -> Span -> PIIEvent, exposed as `score` on /api/pii/events. Metadata
    only — the event still stores a hash, never the value.
  - Per-detection DEBUG logs in the redactor: one line per raw hit with group,
    range, score, matched text and the policy decision (accepted / dropped
    below min_score / no action for group), so the masking/blocking rationale is
    visible in the backend logs.
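
To make the "hash, never the value" point concrete, here is a hedged Go sketch of an audit event that carries the detector score. The struct layout, the JSON field names other than `score`, and the use of SHA-256 are assumptions for illustration; the PR does not state which hash the event stores.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// PIIEvent is an illustrative audit record: it keeps metadata about a
// detection plus a digest of the matched text, never the text itself.
type PIIEvent struct {
	Group     string  `json:"group"`
	Start     int     `json:"start"`
	End       int     `json:"end"`
	Score     float64 `json:"score"`
	ValueHash string  `json:"value_hash"`
}

// newEvent builds an event from a raw hit, hashing the matched value.
func newEvent(group string, start, end int, score float64, matched string) PIIEvent {
	sum := sha256.Sum256([]byte(matched))
	return PIIEvent{
		Group:     group,
		Start:     start,
		End:       end,
		Score:     score,
		ValueHash: hex.EncodeToString(sum[:]),
	}
}

func main() {
	ev := newEvent("SSN", 12, 23, 0.61, "123-45-6789")
	out, err := json.Marshal(ev)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A low score such as the made-up 0.61 above is the kind of signal that helps explain a false-positive block when reading the audit log.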

Also drop a redundant same-type assertion in ModelTokenClassify (Load already
returns grpc.Backend) and give TokenEntity json tags for clean trace rendering.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…tching

NER is the wrong tool for high-entropy, highly-regular secrets: the privacy
model has no API-key class, so it fragments a key (tail -> BITCOINADDRESS,
prefix -> VRM) and can leave the secret part exposed. Add a regex tier that
plugs in as another detector model alongside NER and reuses the whole pipeline.

  - core/services/routing/piipattern (leaf; stdlib only): a restricted-regex
    grammar validated against the RE2 AST (no '.', no capture groups, capped
    {n,m}; every pattern must carry a >=3-char fixed literal anchor, which
    admits sk-ant-/ghp_/AKIA shapes but rejects open-ended ones like email or
    bare \w+), compiled to RE2 (linear, no backtracking) with leftmost-longest
    so a hit grabs the whole key. Curated built-in catalogue (Anthropic, OpenAI,
    GitHub, AWS, Google, Slack, Stripe, JWT, PEM private key).
  - config: PIIDetection.{Builtins,Patterns} + IsPatternDetector(); Validate
    rejects bad patterns/unknown builtins at load (no model file required).
  - piidetector.NewPattern: in-process pii.NERDetector (Score 1.0), records a
    pattern_pii BackendTrace when tracing is on. PIINERResolver branches to it
    for pattern models; MITM inherits it. Per-pattern action overrides fold
    into entity_actions.
  - meta registry: pii_detection.builtins -> pii-builtins-select (options from
    the catalogue) and pii_detection.patterns -> pii-pattern-list, for the
    model editor.
  - gallery: ready-made secret-filter pattern model (builtins on, default
    block, zero VRAM). Docs: new Pattern detector tier section.
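
A minimal sketch of validating a pattern against the RE2 AST, using Go's standard `regexp/syntax` package. This is not the `piipattern` implementation: the cap value, the error messages, the rejection of `*` and `+`, and the simplified anchor check (a literal of at least 3 characters at the top level of the pattern) are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"regexp/syntax"
)

const maxRepeat = 256 // hypothetical cap on {n,m}

// check walks the parsed pattern and rejects disallowed constructs.
func check(re *syntax.Regexp) error {
	switch re.Op {
	case syntax.OpAnyChar, syntax.OpAnyCharNotNL:
		return errors.New("'.' is not allowed")
	case syntax.OpCapture:
		return errors.New("capture groups are not allowed")
	case syntax.OpStar, syntax.OpPlus:
		// Assumption: unbounded repetition is treated as uncapped.
		return errors.New("unbounded repetition is not allowed")
	case syntax.OpRepeat:
		if re.Max < 0 || re.Max > maxRepeat {
			return errors.New("repetition must be capped")
		}
	}
	for _, sub := range re.Sub {
		if err := check(sub); err != nil {
			return err
		}
	}
	return nil
}

// hasAnchor reports whether the pattern has a fixed literal of at least
// 3 characters at its top level (simplified anchor rule).
func hasAnchor(re *syntax.Regexp) bool {
	if re.Op == syntax.OpLiteral {
		return len(re.Rune) >= 3
	}
	if re.Op == syntax.OpConcat {
		for _, sub := range re.Sub {
			if sub.Op == syntax.OpLiteral && len(sub.Rune) >= 3 {
				return true
			}
		}
	}
	return false
}

// compilePattern validates expr and compiles it with leftmost-longest
// matching so that a hit covers the whole key.
func compilePattern(expr string) (*regexp.Regexp, error) {
	tree, err := syntax.Parse(expr, syntax.Perl)
	if err != nil {
		return nil, err
	}
	if err := check(tree); err != nil {
		return nil, err
	}
	if !hasAnchor(tree) {
		return nil, errors.New("pattern needs a fixed literal of at least 3 characters")
	}
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	re.Longest()
	return re, nil
}

func main() {
	for _, expr := range []string{`ghp_[A-Za-z0-9]{4,8}`, `\w+`, `key-.{10}`} {
		_, err := compilePattern(expr)
		fmt.Printf("%-24s err=%v\n", expr, err)
	}
}
```

Because Go's `regexp` is an RE2-style engine, matching time stays linear in the input regardless of the pattern; the AST walk only narrows which patterns are accepted.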

Pattern hits union with NER hits and flow through the same policy, events
(score 1.0) and DEBUG logs. UI editors + Traces badge follow in a UI commit.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Add the React UI for the pattern detector tier:

- PatternListEditor: add/remove rows of {name, match, action, min_len}
  with a monospace match input and a restricted-grammar hint. Server-side
  Validate is authoritative, so no regex engine ships to the client.
- ConfigFieldRenderer: two branches keyed on field.component —
  pii-builtins-select (checkbox list of built-in secret patterns) and
  pii-pattern-list (PatternListEditor).
- Traces: distinct badge colors for the pattern_pii and token_classify
  backend trace types.
- e2e: built-in checklist + custom pattern-list editor specs.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Replace the default-detector model chooser on the PII/Filtering tab with a
table of every token-classify detector model (NER and pattern). Each row
has a "default detector" toggle (persisted to pii_default_detectors via
settings), an Edit link to the model config, and the table offers an
"Add detector model" action seeded from the secret-filter template.
Detectors named as defaults but not loaded are shown as "not loaded".

The per-model state table's PII column becomes an inline toggle that
PATCHes pii.enabled for that model.

buildPIIStatus now returns detector_models so the UI can render the table.

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
…he editor

Two coupled fixes so detector entity_actions (and other map fields: roles,
engine_args) can be viewed and edited in the model config UI:

- Editor: flattenConfig stopped recursing past registered map leaves, so a
  populated entity_actions block now renders with its rows instead of being
  flattened into invisible dotted scalar paths. The load effect is split into
  a fetch and a derive step so a late metadata arrival re-flattens without a
  second fetch, and useConfigMetadata returns stable empty slices.

- PatchConfig: replace the mergo deep-merge (which unions maps and never
  deletes keys) with a metadata-driven patchMerge that overwrites map-typed
  leaves wholesale. Removing an entity action in the editor and saving now
  drops it from the YAML; emptying the map removes the field entirely.
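
The difference from a union-style deep merge can be sketched in Go as follows. `patchMerge` here is an illustrative reimplementation under assumptions: the real function is metadata-driven, whereas this sketch takes a plain set of dotted paths (`mapLeaves`) naming the map-typed fields to overwrite wholesale.

```go
package main

import "fmt"

// patchMerge merges patch into dst. Nested maps are merged key by key,
// except for paths listed in mapLeaves, which are replaced as a whole
// (and removed entirely when the patch supplies an empty map).
func patchMerge(dst, patch map[string]any, mapLeaves map[string]bool, prefix string) {
	for key, val := range patch {
		path := key
		if prefix != "" {
			path = prefix + "." + key
		}
		sub, isMap := val.(map[string]any)
		if isMap && mapLeaves[path] {
			if len(sub) == 0 {
				delete(dst, key)
			} else {
				dst[key] = sub
			}
			continue
		}
		if isMap {
			child, ok := dst[key].(map[string]any)
			if !ok {
				child = map[string]any{}
				dst[key] = child
			}
			patchMerge(child, sub, mapLeaves, path)
			continue
		}
		dst[key] = val
	}
}

func main() {
	cfg := map[string]any{
		"pii_detection": map[string]any{
			"min_score":      0.5,
			"entity_actions": map[string]any{"SSN": "block", "EMAIL": "mask"},
		},
	}
	leaves := map[string]bool{"pii_detection.entity_actions": true}
	// The editor removed EMAIL; the patch carries the full new map.
	patch := map[string]any{
		"pii_detection": map[string]any{
			"entity_actions": map[string]any{"SSN": "block"},
		},
	}
	patchMerge(cfg, patch, leaves, "")
	fmt.Println(cfg)
}
```

A union-style merge would have kept the EMAIL key, since it only ever adds or overwrites entries; replacing the registered leaf wholesale is what lets a deletion in the editor reach the saved config.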

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
Pattern-matcher hits were stored and masked with an "ner:" prefix even
though no NER (Named Entity Recognition) model was involved, because the
redactor hard-coded the prefix for every detector. Thread a Source through
NERConfig (SourceNER / SourcePattern; empty defaults to ner for
back-compat) and build the synthetic id from it via NERConfig.patternID.

Pattern detections now carry pattern:<GROUP> ids and [REDACTED:pattern:<GROUP>]
masks; NER detections stay ner:<GROUP>. The resolver tags each detector with
its source, and the doc strings / swagger / api-instructions examples are
updated to match.
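
A small Go sketch of how the source-derived ids and masks described above might be built. The names `SourceNER`, `SourcePattern` and `patternID` appear in the commit message, but the bodies below, the free-function form and the string values are assumptions inferred from the `ner:<GROUP>` and `pattern:<GROUP>` formats.

```go
package main

import "fmt"

// Source identifies which kind of detector produced a hit.
type Source string

const (
	SourceNER     Source = "ner"
	SourcePattern Source = "pattern"
)

// patternID builds the synthetic id for a detection. An empty source
// defaults to ner for backward compatibility.
func patternID(src Source, group string) string {
	if src == "" {
		src = SourceNER
	}
	return string(src) + ":" + group
}

// maskFor returns the replacement text for a masked detection.
func maskFor(src Source, group string) string {
	return "[REDACTED:" + patternID(src, group) + "]"
}

func main() {
	fmt.Println(maskFor(SourceNER, "EMAIL"))
	fmt.Println(maskFor(SourcePattern, "GITHUB_TOKEN"))
	fmt.Println(maskFor("", "PERSON"))
}
```

The group name `GITHUB_TOKEN` is made up for the example; the actual built-in pattern names are not listed in this PR text.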

Assisted-by: claude-code:claude-opus-4-8 [Claude Code]
@richiejp richiejp force-pushed the feat/pii-ner-tier branch from 32d5dbc to e917189 Compare June 8, 2026 08:58