maziyarpanahi
diff --git a/.gitignore b/.gitignore
Lines changed: 1 addition & 0 deletions
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 34 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 19 additions & 11 deletions
diff --git a/RELEASE_NOTES_v1.4.0.md b/RELEASE_NOTES_v1.4.0.md
Lines changed: 85 additions & 0 deletions
diff --git a/docs/anonymization.md b/docs/anonymization.md
Lines changed: 13 additions & 8 deletions
diff --git a/docs/examples.md b/docs/examples.md
Lines changed: 2 additions & 2 deletions
@@ -221,6 +221,7 @@ local_config.py
 
 # Personal release/announcement drafts
 /RELEASE_NOTES*.md
+!/RELEASE_NOTES_v1.4.0.md
 /ANNOUNCEMENT*.md
 local_settings.py
 dev_config.json
 
@@ -7,6 +7,40 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+## [1.4.0] - 2026-05-04
+
+### Added
+
+- **OpenMed Multilingual Privacy Filter family**, registered across PyTorch and MLX:
+  - `OpenMed/privacy-filter-multilingual` — PyTorch / Transformers (CPU + CUDA).
+  - `OpenMed/privacy-filter-multilingual-mlx` — MLX full-precision (Apple Silicon).
+  - `OpenMed/privacy-filter-multilingual-mlx-8bit` — MLX 8-bit quantized (Apple Silicon and OpenMedKit demos).
+  These artifacts use the OpenAI Privacy Filter architecture and officially support 16 languages through the OpenMed multilingual PII corpus.
+- **Python MLX routing for multilingual Privacy Filter artifacts**:
+  - `_MLX_MODEL_MAP` entries for the full and 8-bit multilingual MLX repo IDs.
+  - `privacy-filter-multilingual` and `multilingual-privacy-filter` MLX family aliases, both resolving to the existing OpenAI Privacy Filter model class and BIOES decoder.
+  - Family-aware Torch fallback so multilingual MLX model names substitute `OpenMed/privacy-filter-multilingual` on non-MLX hosts instead of the OpenAI baseline.
+- **Multilingual Privacy Filter Studio** in `examples/privacy_filter_multilingual_studio/`, a web demo comparing the OpenAI baseline, OpenAI Nemotron Privacy Filter, and OpenMed Multilingual Privacy Filter with English, French, and Arabic examples.
+- **OpenMed Scan Demo multilingual mode** with `OpenMed/privacy-filter-multilingual-mlx-8bit`, a three-engine picker, EN/FR/AR sample buttons, and new French/Arabic scanned demo documents for screenshot-ready flows.
+- **Release notes** for v1.4.0 in `RELEASE_NOTES_v1.4.0.md`.
+
+### Changed
+
+- Privacy Filter docs and README now describe three Privacy Filter families and label the multilingual model as **OpenMed Multilingual Privacy Filter**.
+- OpenMedKit and demo version surfaces now point at `1.4.0`.
+- The scan demo clears previous annotation windows whenever the language/sample changes, avoiding stale entities from earlier model runs.
+- The multilingual web studio scan animation now performs a single top-to-bottom pass while redacting line by line, matching the stronger visual rhythm of the original Privacy Filter Studio.
+
+### Fixed
+
+- Improved Swift model-download handling so stale cached 401/404 responses cannot masquerade as `openmed-mlx.json` manifests after a public model becomes available.
+- Tightened stale-result invalidation in iOS and web demo flows so slower previous model runs cannot overwrite a newly selected language/sample.
+
+### Tests
+
+- Added Python unit coverage for multilingual MLX backend selection, family-aware Torch fallback, and MLX Privacy Filter family dispatch aliases.
+- Rebuilt the OpenMed Scan Demo after the multilingual 8-bit integration.
+
 ## [1.3.0] - 2026-04-27
 
 ### Added
 
@@ -56,16 +56,16 @@ Apple Silicon acceleration in Python:
 uv pip install -e ".[mlx]"
 ```
 
-Swift apps on macOS and iOS use `OpenMedKit`. In `1.2.0`, that means:
+Swift apps on macOS and iOS use `OpenMedKit`. As of `1.4.0`, that means:
 
-- **MLX** on Apple Silicon macOS and real iPhone/iPad hardware for supported OpenMed PII, OpenAI Privacy Filter, and experimental GLiNER-family artifacts
+- **MLX** on Apple Silicon macOS and real iPhone/iPad hardware for supported OpenMed PII, OpenAI Privacy Filter, OpenAI Nemotron Privacy Filter, OpenMed Multilingual Privacy Filter, and experimental GLiNER-family artifacts
 - **CoreML** when you already have a bundled Apple model package or want the fallback Apple path
 
 Add the Swift package like this:
 
 ```swift
 dependencies: [
-    .package(url: "https://github.com/maziyarpanahi/openmed.git", from: "1.2.0"),
+    .package(url: "https://github.com/maziyarpanahi/openmed.git", from: "1.4.0"),
 ]
 ```
 
@@ -121,7 +121,7 @@ result = processor.process_texts([
 - **Advanced NER Processing**: Confidence filtering, entity grouping, and span alignment
 - **Multiple Output Formats**: Dict, JSON, HTML, CSV for any downstream system
 
-### Production Tools (v1.2.0)
+### Production Tools (v1.4.0)
 
 - **Batch Processing**: Multi-text and multi-file workflows with progress tracking
 - **Configuration Profiles**: `dev`/`prod`/`test`/`fast` presets with flexible overrides
@@ -176,8 +176,8 @@ uvicorn openmed.service.app:app --host 0.0.0.0 --port 8080
 ### Run with Docker
 
 ```bash
-docker build -t openmed:1.2.0 .
-docker run --rm -p 8080:8080 -e OPENMED_PROFILE=prod openmed:1.2.0
+docker build -t openmed:1.4.0 .
+docker run --rm -p 8080:8080 -e OPENMED_PROFILE=prod openmed:1.4.0
 ```
 
 ### Example request
@@ -262,15 +262,18 @@ deidentify(text, method="replace", lang="pt", locale="pt_BR",
 
 ### Privacy Filter Family (Public)
 
-OpenMed ships **two checkpoints** of the OpenAI Privacy Filter architecture — same model code (gpt-oss-style sparse-MoE transformer with local attention, sink tokens, RoPE+YaRN, tiktoken `o200k_base` tokenization), different training data:
+OpenMed ships **three Privacy Filter families** on the OpenAI Privacy Filter architecture — same model code (gpt-oss-style sparse-MoE transformer with local attention, sink tokens, RoPE+YaRN, tiktoken `o200k_base` tokenization), different training data:
 
-| Variant                | Trained on                                                                         | PyTorch (CPU + CUDA)                  | MLX full (Apple Silicon)                          | MLX 8-bit (Apple Silicon)                              |
-| ---------------------- | ---------------------------------------------------------------------------------- | ------------------------------------- | ------------------------------------------------- | ------------------------------------------------------ |
-| OpenAI Privacy Filter  | OpenAI's PII training set                                                          | [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter)              | [`OpenMed/privacy-filter-mlx`](https://huggingface.co/OpenMed/privacy-filter-mlx)                | [`OpenMed/privacy-filter-mlx-8bit`](https://huggingface.co/OpenMed/privacy-filter-mlx-8bit)                |
-| Nemotron-PII fine-tune | [Nemotron PII dataset](https://huggingface.co/datasets/nvidia/Nemotron-PII-v1)     | [`OpenMed/privacy-filter-nemotron`](https://huggingface.co/OpenMed/privacy-filter-nemotron)     | [`OpenMed/privacy-filter-nemotron-mlx`](https://huggingface.co/OpenMed/privacy-filter-nemotron-mlx)       | [`OpenMed/privacy-filter-nemotron-mlx-8bit`](https://huggingface.co/OpenMed/privacy-filter-nemotron-mlx-8bit)       |
+| Variant                              | Trained on                                                                     | PyTorch (CPU + CUDA)                                                     | [MLX full (OpenMedKit + Apple Silicon)](swift/OpenMedKit)                             | [MLX 8-bit (OpenMedKit + Apple Silicon)](swift/OpenMedKit)                                 |
+| ------------------------------------ | ------------------------------------------------------------------------------ | ------------------------------------------------------------------------ | ------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------ |
+| OpenAI Privacy Filter                | OpenAI's PII training set                                                      | [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter)  | [`OpenMed/privacy-filter-mlx`](https://huggingface.co/OpenMed/privacy-filter-mlx)     | [`OpenMed/privacy-filter-mlx-8bit`](https://huggingface.co/OpenMed/privacy-filter-mlx-8bit) |
+| Nemotron-PII fine-tune               | [Nemotron PII dataset](https://huggingface.co/datasets/nvidia/Nemotron-PII-v1) | [`OpenMed/privacy-filter-nemotron`](https://huggingface.co/OpenMed/privacy-filter-nemotron) | [`OpenMed/privacy-filter-nemotron-mlx`](https://huggingface.co/OpenMed/privacy-filter-nemotron-mlx) | [`OpenMed/privacy-filter-nemotron-mlx-8bit`](https://huggingface.co/OpenMed/privacy-filter-nemotron-mlx-8bit) |
+| OpenMed Multilingual Privacy Filter  | OpenMed multilingual PII corpus with official support for 16 languages         | [`OpenMed/privacy-filter-multilingual`](https://huggingface.co/OpenMed/privacy-filter-multilingual) | [`OpenMed/privacy-filter-multilingual-mlx`](https://huggingface.co/OpenMed/privacy-filter-multilingual-mlx) | [`OpenMed/privacy-filter-multilingual-mlx-8bit`](https://huggingface.co/OpenMed/privacy-filter-multilingual-mlx-8bit) |
 
 All model IDs above route through the **same** `extract_pii()` / `deidentify()` API — only the `model_name=` argument changes.
 
+The MLX artifacts above use the OpenMed MLX artifact layout consumed by [OpenMedKit](swift/OpenMedKit) for native macOS, iOS, and iPadOS apps.
+
 #### Install
 
 The PyTorch path runs anywhere (Linux, macOS, Windows; CPU or CUDA):
@@ -321,6 +324,10 @@ extract_pii(text, model_name="OpenMed/privacy-filter-mlx-8bit")
 # Nemotron-PII fine-tune (full / 8-bit MLX artifacts)
 extract_pii(text, model_name="OpenMed/privacy-filter-nemotron-mlx")
 extract_pii(text, model_name="OpenMed/privacy-filter-nemotron-mlx-8bit")
+
+# OpenMed Multilingual Privacy Filter (full / 8-bit MLX artifacts)
+extract_pii(text, model_name="OpenMed/privacy-filter-multilingual-mlx")
+extract_pii(text, model_name="OpenMed/privacy-filter-multilingual-mlx-8bit")
 ```
 
 #### Cross-platform note
@@ -329,6 +336,7 @@ The MLX artifact names work everywhere — on a non-Apple-Silicon host (or anywh
 
 - `OpenMed/privacy-filter-mlx*` ⇒ falls back to `openai/privacy-filter`
 - `OpenMed/privacy-filter-nemotron-mlx*` ⇒ falls back to `OpenMed/privacy-filter-nemotron`
+- `OpenMed/privacy-filter-multilingual-mlx*` ⇒ falls back to `OpenMed/privacy-filter-multilingual`
 
 So your code can ship an MLX model name and run on any host without changes — Apple Silicon users get MLX speed, everyone else gets the same family's PyTorch checkpoint.
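
Reviewer sketch (not part of the patch): the fallback rules listed above can be read as a prefix lookup from an MLX repo ID to the PyTorch checkpoint of the same family. The names `resolve_torch_fallback` and `_FALLBACK_PREFIXES` are hypothetical and only illustrate the idea; the repo IDs are the ones from the list, and the real routing in OpenMed may be implemented differently.

```python
from typing import List, Tuple

# Hypothetical table: MLX repo-ID prefix -> PyTorch checkpoint of the same family.
# Only the repo IDs come from the README; the structure is an assumption.
_FALLBACK_PREFIXES: List[Tuple[str, str]] = [
    ("OpenMed/privacy-filter-multilingual-mlx", "OpenMed/privacy-filter-multilingual"),
    ("OpenMed/privacy-filter-nemotron-mlx", "OpenMed/privacy-filter-nemotron"),
    ("OpenMed/privacy-filter-mlx", "openai/privacy-filter"),
]


def resolve_torch_fallback(model_name: str) -> str:
    """Return the same-family PyTorch checkpoint for an MLX model name.

    Names that match no known MLX prefix are returned unchanged.
    """
    for prefix, torch_name in _FALLBACK_PREFIXES:
        if model_name.startswith(prefix):
            return torch_name
    return model_name


print(resolve_torch_fallback("OpenMed/privacy-filter-multilingual-mlx-8bit"))
```

Matching on the prefix covers both the full-precision and `-8bit` variants with one rule per family, which is how the `*` in the list above reads.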
 
 
@@ -0,0 +1,85 @@
+# OpenMed v1.4.0
+
+OpenMed v1.4.0 is the multilingual Privacy Filter release.
+
+This release brings the **OpenMed Multilingual Privacy Filter** into the main OpenMed ecosystem across Python, MLX, OpenMedKit, the iOS Scan Demo, and the web demo experience. The new family officially supports 16 languages and ships in PyTorch, MLX full-precision, and MLX 8-bit forms.
+
+The headline: developers can now use the same `extract_pii()` / `deidentify()` API for the OpenAI baseline, OpenAI Nemotron Privacy Filter, and OpenMed Multilingual Privacy Filter, while Apple demos can showcase all three model choices without changing application code.
+
+## Highlights
+
+- Added the OpenMed Multilingual Privacy Filter model family:
+  - `OpenMed/privacy-filter-multilingual`
+  - `OpenMed/privacy-filter-multilingual-mlx`
+  - `OpenMed/privacy-filter-multilingual-mlx-8bit`
+- Added Python MLX routing for the multilingual full and 8-bit artifacts.
+- Added family-aware fallback so multilingual MLX names resolve to the multilingual PyTorch checkpoint on non-MLX hosts.
+- Added MLX family aliases for multilingual Privacy Filter artifacts that reuse the existing OpenAI Privacy Filter runtime and BIOES decoder.
+- Updated the OpenMed Scan Demo with the 8-bit multilingual model, a clearer three-model picker, and EN/FR/AR sample buttons.
+- Added French and Arabic scanned demo documents for screenshot-ready multilingual flows.
+- Added a multilingual web studio that compares the OpenAI baseline, OpenAI Nemotron Privacy Filter, and OpenMed Multilingual Privacy Filter.
+- Updated README, anonymization docs, MLX docs, Swift docs, CHANGELOG, and version surfaces for `1.4.0`.
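
Reviewer sketch (not part of the patch): the highlights mention a shared BIOES decoder. For readers unfamiliar with the scheme, BIOES tags mark each token as Begin, Inside, Outside, End, or Single. The function below is a generic, minimal decoder written for illustration; `decode_bioes` is a hypothetical name and this is not the decoder that ships in OpenMed.

```python
from typing import List, Optional, Tuple


def decode_bioes(tags: List[str]) -> List[Tuple[str, int, int]]:
    """Convert per-token BIOES tags into (label, start, end) spans.

    Indices are token positions and `end` is exclusive. Spans that are
    opened but never closed by a matching E- tag are dropped.
    """
    spans: List[Tuple[str, int, int]] = []
    start: Optional[int] = None
    label: Optional[str] = None
    for i, tag in enumerate(tags):
        prefix, _, name = tag.partition("-")
        if prefix == "S":
            # Single-token entity.
            spans.append((name, i, i + 1))
            start, label = None, None
        elif prefix == "B":
            # Open a new multi-token entity.
            start, label = i, name
        elif prefix == "I" and label == name:
            # Still inside the currently open entity.
            continue
        elif prefix == "E" and label == name:
            # Close the open entity.
            spans.append((name, start, i + 1))
            start, label = None, None
        else:
            # "O" or an inconsistent tag: discard any open span.
            start, label = None, None
    return spans


print(decode_bioes(["O", "B-NAME", "E-NAME", "O", "S-EMAIL"]))
```

Because all three families emit the same tag scheme, a single decoder of this kind can serve every checkpoint, which is consistent with the aliases reusing the existing runtime.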
+
+## Privacy Filter Families
+
+OpenMed now documents and routes three Privacy Filter families:
+
+| Variant | PyTorch | MLX full | MLX 8-bit |
+| --- | --- | --- | --- |
+| OpenAI Privacy Filter | `openai/privacy-filter` | `OpenMed/privacy-filter-mlx` | `OpenMed/privacy-filter-mlx-8bit` |
+| OpenAI Nemotron Privacy Filter | `OpenMed/privacy-filter-nemotron` | `OpenMed/privacy-filter-nemotron-mlx` | `OpenMed/privacy-filter-nemotron-mlx-8bit` |
+| OpenMed Multilingual Privacy Filter | `OpenMed/privacy-filter-multilingual` | `OpenMed/privacy-filter-multilingual-mlx` | `OpenMed/privacy-filter-multilingual-mlx-8bit` |
+
+All three families use the OpenAI Privacy Filter architecture. The multilingual family uses OpenMed multilingual PII training data and officially supports 16 languages.
+
+## Python Usage
+
+The public API stays the same:
+
+```python
+from openmed import extract_pii, deidentify
+
+text = "Patient Marie Dubois, nee le 14/03/1982, email marie.dubois@example.fr."
+
+entities = extract_pii(
+    text,
+    model_name="OpenMed/privacy-filter-multilingual-mlx-8bit",
+)
+
+safe = deidentify(
+    text,
+    model_name="OpenMed/privacy-filter-multilingual-mlx-8bit",
+    method="replace",
+    consistent=True,
+    seed=42,
+)
+```
+
+On Apple Silicon with MLX available, the MLX artifact runs through `PrivacyFilterMLXPipeline`. On other hosts, OpenMed substitutes the matching PyTorch checkpoint and emits a one-time warning:
+
+- `OpenMed/privacy-filter-mlx*` -> `openai/privacy-filter`
+- `OpenMed/privacy-filter-nemotron-mlx*` -> `OpenMed/privacy-filter-nemotron`
+- `OpenMed/privacy-filter-multilingual-mlx*` -> `OpenMed/privacy-filter-multilingual`
+
+## Apple And Demo Updates
+
+The iOS Scan Demo now presents three privacy engines cleanly:
+
+- OpenMed PII
+- OpenAI Nemotron Privacy Filter
+- OpenMed Multilingual Privacy Filter
+
+The multilingual path uses `OpenMed/privacy-filter-multilingual-mlx-8bit` so the demo stays aligned with the 8-bit Apple artifact strategy. The sample controls now use compact `EN`, `FR`, and `AR` buttons, and switching language/sample clears previous annotations before the next run starts.
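
Reviewer sketch (not part of the patch): the stale-result behavior described above is commonly implemented with a generation token, where each new selection bumps a counter and a finished run may publish only if its token is still current. The demo itself is Swift and web code; the Python class below, including the name `LatestRunGuard`, is a hypothetical illustration of the pattern rather than the demo's implementation.

```python
import threading
from typing import List


class LatestRunGuard:
    """Accept results only from the most recently started run."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._generation = 0
        self.annotations: List[str] = []

    def begin_run(self) -> int:
        """Start a new run: clear old annotations and return its token."""
        with self._lock:
            self._generation += 1
            self.annotations = []
            return self._generation

    def finish_run(self, token: int, annotations: List[str]) -> bool:
        """Publish results if `token` is still current; otherwise drop them."""
        with self._lock:
            if token != self._generation:
                return False
            self.annotations = list(annotations)
            return True


guard = LatestRunGuard()
slow = guard.begin_run()   # user picks EN
fast = guard.begin_run()   # user switches to FR before EN finishes
guard.finish_run(fast, ["FR entity"])
guard.finish_run(slow, ["EN entity"])  # ignored: a newer run exists
print(guard.annotations)
```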
+
+The multilingual web studio now uses a single top-to-bottom scan pass and redacts line by line during that pass, matching the feel of the original Privacy Filter Studio demo without looping the scan effect.
+
+## Upgrade Notes
+
+- The package version is now `1.4.0`.
+- Swift demo marketing versions are now `1.4.0`.
+- `OpenMed/privacy-filter-multilingual-mlx` and `OpenMed/privacy-filter-multilingual-mlx-8bit` are first-class model names in the MLX routing table.
+- The multilingual MLX artifacts must include a valid `openmed-mlx.json`; stale cached HTTP error bodies are no longer treated as manifests by the scan demo downloader.
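
Reviewer sketch (not part of the patch): the last note says cached HTTP error bodies are no longer mistaken for manifests. One way to express that check is to accept a response as a manifest only when the status is 200 and the body parses as a JSON object. The actual fix lives in the Swift downloader; the Python function below, including the name `parse_manifest`, is a hypothetical illustration and makes no assumption about the fields inside `openmed-mlx.json`.

```python
import json
from typing import Optional


def parse_manifest(status_code: int, body: bytes) -> Optional[dict]:
    """Return the manifest as a dict, or None if the response is not one.

    A 401/404 body is rejected even when it happens to be valid JSON,
    so a cached error response can never be treated as a manifest.
    """
    if status_code != 200:
        return None
    try:
        manifest = json.loads(body)
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    # A manifest is expected to be a JSON object, not a list or scalar.
    return manifest if isinstance(manifest, dict) else None
```

Callers would treat `None` as "no manifest available" and retry the download instead of caching the response.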
+
+## Validation
+
+This release adds targeted unit coverage for multilingual Privacy Filter routing, MLX family alias dispatch, and family-aware fallback behavior. The OpenMed Scan Demo was also rebuilt after the multilingual 8-bit integration.
@@ -127,34 +127,39 @@ register_label_generator("FIRST_NAME", my_first_name)
 
 ## Privacy-filter family
 
-OpenMed ships two privacy-filter checkpoints, both **the same OpenAI
+OpenMed ships three privacy-filter families, all built on **the same OpenAI
 Privacy Filter architecture** (gpt-oss-style sparse-MoE transformer with
 local attention, sink tokens, RoPE+YaRN, tiktoken `o200k_base`), differing
 only in their training data:
 
-| Variant                   | Trained on                  | PyTorch artifact                     | MLX (full)                                  | MLX (8-bit)                                       |
-| ------------------------- | --------------------------- | ------------------------------------ | ------------------------------------------- | ------------------------------------------------- |
-| OpenAI Privacy Filter     | OpenAI's PII training set   | `openai/privacy-filter`              | `OpenMed/privacy-filter-mlx`                | `OpenMed/privacy-filter-mlx-8bit`                 |
-| Nemotron-PII fine-tune    | Nemotron PII dataset        | `OpenMed/privacy-filter-nemotron`    | `OpenMed/privacy-filter-nemotron-mlx`       | `OpenMed/privacy-filter-nemotron-mlx-8bit`        |
+| Variant                              | Trained on                                      | PyTorch artifact                         | MLX (full)                                      | MLX (8-bit)                                           |
+| ------------------------------------ | ----------------------------------------------- | ---------------------------------------- | ----------------------------------------------- | ----------------------------------------------------- |
+| OpenAI Privacy Filter                | OpenAI's PII training set                       | `openai/privacy-filter`                  | `OpenMed/privacy-filter-mlx`                    | `OpenMed/privacy-filter-mlx-8bit`                     |
+| OpenAI Nemotron Privacy Filter       | Nemotron PII dataset                            | `OpenMed/privacy-filter-nemotron`        | `OpenMed/privacy-filter-nemotron-mlx`           | `OpenMed/privacy-filter-nemotron-mlx-8bit`            |
+| OpenMed Multilingual Privacy Filter  | OpenMed multilingual PII corpus, 16 languages   | `OpenMed/privacy-filter-multilingual`    | `OpenMed/privacy-filter-multilingual-mlx`       | `OpenMed/privacy-filter-multilingual-mlx-8bit`        |
 
-Both run through the same `extract_pii()` / `deidentify()` API — only the
+All run through the same `extract_pii()` / `deidentify()` API — only the
 weights differ:
 
 ```python
 extract_pii(text, model_name="OpenMed/privacy-filter-mlx-8bit")
 extract_pii(text, model_name="OpenMed/privacy-filter-nemotron-mlx-8bit")
+extract_pii(text, model_name="OpenMed/privacy-filter-multilingual-mlx-8bit")
 
 deidentify(text, model_name="OpenMed/privacy-filter-nemotron",
            method="replace", consistent=True, seed=42)
+deidentify(text, model_name="OpenMed/privacy-filter-multilingual",
+           method="replace", consistent=True, seed=42)
 ```
 
 **Backend selection.** On Apple Silicon with MLX importable, the MLX
 artifact runs natively via `PrivacyFilterMLXPipeline`. Elsewhere, the
 call substitutes the corresponding PyTorch model via `transformers` and
 emits a one-time `UserWarning` explaining the swap. The fallback is
 **family-aware** — an MLX-only Nemotron request on Linux substitutes
-`OpenMed/privacy-filter-nemotron` (not the unrelated `openai/privacy-filter`),
-so the user gets the same training distribution they asked for.
+`OpenMed/privacy-filter-nemotron`, and an MLX-only multilingual request
+substitutes `OpenMed/privacy-filter-multilingual`, so the user gets the same
+training distribution they asked for.
 
 Either way the output entity dicts have the same shape so the rest of
 the pipeline behaves identically. Smart-merging (regex-based span
 
@@ -24,9 +24,9 @@ Run them with VS Code, Jupyter, or Google Colab—each relies on the same `uv pi
 
 ## Apple Silicon & Swift recipes
 
-OpenMed `1.2.0` adds release-critical Apple entry points:
+OpenMed `1.4.0` includes release-critical Apple entry points:
 
-- [MLX Backend](./mlx-backend.md) for Python on Apple Silicon Macs, including Privacy Filter and experimental GLiNER-family artifacts
+- [MLX Backend](./mlx-backend.md) for Python on Apple Silicon Macs, including Privacy Filter, OpenMed Multilingual Privacy Filter, and experimental GLiNER-family artifacts
 - [OpenMedKit (Swift Package)](./swift-openmedkit.md) for macOS, iOS, and iPadOS apps
 
 Python MLX quick check: