Harden privacy-filter remote-code allowlist for 1.5.2#59

Merged
maziyarpanahi merged 14 commits into master from security/model-allowlist on May 27, 2026

@maziyarpanahi (Owner)

Summary

This PR prepares the 1.5.2 security release branch after the merged MLX converter fix in #58.

It hardens the privacy-filter loading path so attacker-controlled Hugging Face repository names that merely contain privacy-filter no longer route through a loader path that can enable trust_remote_code=True.

What Changed

  • Added an explicit first-party remote-code allowlist for the privacy-filter family:
    • openai/privacy-filter
    • OpenMed/privacy-filter-multilingual
    • OpenMed/privacy-filter-nemotron
  • Changed PrivacyFilterTorchPipeline so trust_remote_code defaults to False.
  • Updated create_privacy_filter_pipeline() to opt in to trust_remote_code=True only after resolving the actual fallback model and checking the allowlist.
  • Tightened privacy-filter identifier matching so arbitrary names such as attacker/foo-privacy-filter-bar no longer route through the privacy-filter dispatcher.
  • Added OPENMED_TRUSTED_REMOTE_CODE_MODELS as an operator escape hatch for controlled/private fine-tunes.
  • Added security regression coverage at both unit and HTTP-service levels.
  • Bumped public release/version surfaces to 1.5.2.
  • Updated CHANGELOG.md for the full 1.5.2 release, including the previously merged MLX conversion fix from #58 (fix: add MLX weight remapping for openai_privacy_filter / nemotron architecture).
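
The allowlist check and the operator escape hatch described above might look roughly like the following. This is a minimal sketch, not the code from this PR: the function names are hypothetical, the comma-separated format of OPENMED_TRUSTED_REMOTE_CODE_MODELS is an assumption, and exact, case-sensitive matching is assumed as well.

```python
import os

# First-party repositories named in this PR description.
FIRST_PARTY_REMOTE_CODE_MODELS = frozenset({
    "openai/privacy-filter",
    "OpenMed/privacy-filter-multilingual",
    "OpenMed/privacy-filter-nemotron",
})

ENV_VAR = "OPENMED_TRUSTED_REMOTE_CODE_MODELS"


def operator_trusted_models(environ=None):
    """Parse the operator escape hatch.

    Assumption: the variable holds a comma-separated list of model identifiers.
    """
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR, "")
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def should_trust_remote_code(resolved_model_id, environ=None):
    """Hypothetical helper: True only for allowlisted or operator-trusted IDs.

    The caller is expected to pass the identifier that will actually be
    loaded (after any fallback resolution), not the name the user typed.
    """
    if resolved_model_id in FIRST_PARTY_REMOTE_CODE_MODELS:
        return True
    return resolved_model_id in operator_trusted_models(environ)
```

With a helper of this shape, trust_remote_code stays False by default and is switched on only when the resolved identifier passes the check.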

Root Cause

The privacy-filter dispatcher previously identified privacy-filter models with a substring match, so any model name containing privacy-filter could reach the privacy-filter-specific path. The PyTorch privacy-filter wrapper also defaulted to trust_remote_code=True, which is required for the first-party OpenAI/OpenMed privacy-filter repos but unsafe for arbitrary repositories.
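
To illustrate the difference, here is a small sketch comparing a substring match with an exact identifier match. Both function names are hypothetical, and the exact-match version is only one plausible way to tighten the check; the PR description does not show the actual matching logic.

```python
FIRST_PARTY_IDS = frozenset({
    "openai/privacy-filter",
    "OpenMed/privacy-filter-multilingual",
    "OpenMed/privacy-filter-nemotron",
})


def matches_by_substring(model_name: str) -> bool:
    """Old behaviour as described: any name containing the marker matches."""
    return "privacy-filter" in model_name


def matches_by_exact_id(model_name: str) -> bool:
    """Assumed tightened behaviour: only known first-party IDs match."""
    return model_name in FIRST_PARTY_IDS


if __name__ == "__main__":
    for name in ("openai/privacy-filter", "attacker/foo-privacy-filter-bar"):
        print(name, matches_by_substring(name), matches_by_exact_id(name))
```

The attacker-style name from this PR passes the substring check but fails the exact check, which is the gap being closed.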

This PR separates two concerns:

  • routing: only first-party privacy-filter identifiers and local privacy-filter artifacts are routed as privacy-filter-family requests;
  • remote code execution: only allowlisted repositories or operator-controlled local/env-configured models may opt in to trust_remote_code=True.
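
One way to keep the two concerns apart is to compute them as separate fields of a single decision object. The sketch below is illustrative only: LoadDecision, decide, and the is_local_artifact and operator_trusted parameters are hypothetical names, and how the real code detects local artifacts or combines the two answers is not shown in this PR description.

```python
from dataclasses import dataclass

FIRST_PARTY_IDS = frozenset({
    "openai/privacy-filter",
    "OpenMed/privacy-filter-multilingual",
    "OpenMed/privacy-filter-nemotron",
})


@dataclass(frozen=True)
class LoadDecision:
    route_as_privacy_filter: bool  # routing concern
    may_trust_remote_code: bool    # remote code execution concern


def decide(model_ref, *, is_local_artifact=False, operator_trusted=frozenset()):
    """Answer the routing and trust questions independently (assumed logic)."""
    first_party = model_ref in FIRST_PARTY_IDS
    # Routing: first-party identifiers and local privacy-filter artifacts only.
    route = first_party or is_local_artifact
    # Trust: allowlisted repos, or operator-controlled local/env-configured models.
    trust = first_party or is_local_artifact or model_ref in operator_trusted
    return LoadDecision(route_as_privacy_filter=route, may_trust_remote_code=trust)
```

Because the two fields are derived separately, a name that merely resembles a privacy-filter model gets neither the privacy-filter route nor permission to run remote code.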

Release Notes

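1.5.2 now includes both the privacy-filter remote-code allowlist hardening from this PR and the previously merged MLX conversion fix from #58.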

Validation

  • python -m pytest
    • 1194 passed, 1 skipped, 15 warnings
  • python scripts/release/check_repo_policy.py
    • passed

The warnings are existing deprecation/span-validation warnings and are not introduced by this change.

@maziyarpanahi maziyarpanahi self-assigned this May 27, 2026
@maziyarpanahi maziyarpanahi marked this pull request as ready for review May 27, 2026 09:28
@maziyarpanahi maziyarpanahi merged commit 98724f6 into master May 27, 2026
13 checks passed
@maziyarpanahi maziyarpanahi deleted the security/model-allowlist branch May 27, 2026 10:08
maziyarpanahi added a commit that referenced this pull request May 27, 2026
Harden privacy-filter remote-code allowlist for 1.5.2