fix: add MLX weight remapping for openai_privacy_filter / nemotron architecture #58
Merged
maziyarpanahi merged 2 commits on May 27, 2026
…) architecture

The openai_privacy_filter model family (including privacy-filter-nemotron) uses a different weight namespace than other OpenMed MLX models. Without explicit conversion, all 140 parameters are rejected as "not in model".

Changes:
- Add _convert_opf_weights() to handle the HF → MLX namespace mapping:
  - score.* → unembedding.*
  - model.layers.N.* → block.N.*
  - Separate q/k/v_proj → fused attn.qkv (QKV fusion via concatenation)
  - input_layernorm.weight → attn.norm.scale (RMSNorm rename)
  - mlp.router.* → mlp.gate.*
  - mlp.experts.gate_up_proj → mlp.swiglu.weight (no transpose; HF stores [E,in,out])
  - mlp.experts.down_proj → mlp.out.weight (no transpose)
- Set classifier_bias=True in config when score.bias is present in the HF state dict, so the MLX model allocates the unembedding bias parameter.
- Wire _convert_opf_weights() into convert_weights() for model_type values "openai-privacy-filter", "privacy-filter-nemotron", "nemotron-privacy-filter".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
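The pure renames in the commit message above can be sketched as a small key-mapping function. This is an illustration only, not the code from the PR: `rename_opf_key` and its rename table are hypothetical names, the rules are copied from the list above, and QKV fusion is deliberately left out because it merges three tensors rather than renaming one.

```python
import re

# Hypothetical helper, written from the rename list in the commit message.
# The real _convert_opf_weights() may structure or order these rules differently.
_LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.(.+)$")

# Per-layer suffixes that are renamed outright (no tensor transformation).
_SUFFIX_RENAMES = {
    "input_layernorm.weight": "attn.norm.scale",
    "mlp.experts.gate_up_proj": "mlp.swiglu.weight",
    "mlp.experts.down_proj": "mlp.out.weight",
}


def rename_opf_key(key: str) -> str:
    """Map one HF-style key to its MLX-style name (renames only, no fusion)."""
    if key.startswith("score."):
        return "unembedding." + key[len("score."):]

    match = _LAYER_RE.match(key)
    if match is None:
        # Keys not covered by the listed rules pass through unchanged.
        return key

    index, rest = match.groups()
    if rest in _SUFFIX_RENAMES:
        rest = _SUFFIX_RENAMES[rest]
    elif rest.startswith("mlp.router."):
        rest = "mlp.gate." + rest[len("mlp.router."):]
    return f"block.{index}.{rest}"


if __name__ == "__main__":
    for name in ("score.weight", "model.layers.3.input_layernorm.weight"):
        print(name, "->", rename_opf_key(name))
```

Keeping the renames in a table makes it easy to compare against the list in the commit message when a new parameter name shows up as "not in model".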
Owner
Thanks again for chasing this down. I pushed one follow-up commit onto this PR so it now carries the PR #57 fix plus the OPF/Nemotron remapping in one place. What changed in the follow-up:
The reason this failed for local conversion but not for the existing OpenMed demo is that the demo uses the already-exported
I also ran the full suite locally on the updated PR branch:
maziyarpanahi approved these changes May 25, 2026
maziyarpanahi added a commit that referenced this pull request May 27, 2026
fix: add MLX weight remapping for openai_privacy_filter / nemotron architecture
Problem
The openai_privacy_filter model family (e.g. OpenMed/privacy-filter-nemotron) cannot be loaded via create_mlx_pipeline because its HuggingFace weight namespace does not match the MLX model's namespace. All 140 parameters are rejected with ValueError: Received 140 parameters not in model.

Root cause
The existing remap_key() function handles BERT/RoBERTa/DeBERTa/DistilBERT/ELECTRA architectures but has no case for openai_privacy_filter. The OPF architecture differs from all others:
- Different classifier head name (score.* → unembedding.*)
- Different layer prefix (model.layers.N.* → block.N.*)
- HF has separate q/k/v_proj, MLX has single attn.qkv
- Renamed submodules (input_layernorm → attn.norm, router → gate)
- The RMSNorm parameter is named scale, not weight
- Expert weights are stored in [E, in, out] format (no transpose needed)

Fix
Add _convert_opf_weights() and wire it into convert_weights() for model types openai-privacy-filter, privacy-filter-nemotron, nemotron-privacy-filter. Also set classifier_bias: True in the MLX config when score.bias is present.

Testing
Verified on OpenMed/privacy-filter-nemotron; the model now loads and runs inference correctly:

🤖 Generated with Claude Code
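As a rough illustration of the two non-rename steps described above (QKV fusion by concatenation and the classifier_bias flag), here is a minimal sketch in plain Python. Several details are assumptions rather than facts from the PR: the helper names `fuse_qkv`, `apply_classifier_bias` and `is_opf_model_type` are hypothetical, the exact key spelling `<prefix>.{q,k,v}_proj.weight` → `<prefix>.qkv.weight` is a guess, and the concatenation is assumed to run along the output (row) axis in q, k, v order. Nested lists stand in for arrays so the example runs without MLX installed.

```python
from typing import Dict, Iterable, List

Matrix = List[List[float]]  # stand-in for a weight array of shape [out, in]

# Model types named in the PR description.
OPF_MODEL_TYPES = {
    "openai-privacy-filter",
    "privacy-filter-nemotron",
    "nemotron-privacy-filter",
}


def is_opf_model_type(model_type: str) -> bool:
    """Decide whether the OPF-specific conversion path should be used."""
    return model_type in OPF_MODEL_TYPES


def fuse_qkv(weights: Dict[str, Matrix], prefix: str) -> Dict[str, Matrix]:
    """Return a copy where q/k/v projections under `prefix` become one qkv entry."""
    out = dict(weights)  # shallow copy so the caller's dict is left untouched
    fused: Matrix = []
    for name in ("q_proj", "k_proj", "v_proj"):
        # Assumption: rows are output features, so appending rows is the
        # same as concatenating along the output axis in q, k, v order.
        fused.extend(out.pop(f"{prefix}.{name}.weight"))
    out[f"{prefix}.qkv.weight"] = fused
    return out


def apply_classifier_bias(config: dict, hf_keys: Iterable[str]) -> dict:
    """Set classifier_bias=True when the HF state dict carries score.bias."""
    updated = dict(config)
    if "score.bias" in set(hf_keys):
        updated["classifier_bias"] = True
    return updated
```

With array types the row-appending loop would typically become a single concatenate call along the output axis; the axis and ordering still need to match whatever layout the MLX model's attn.qkv expects.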