feat: add cardiology domain to NER catalog and zero-shot label maps #551

Merged
maziyarpanahi merged 1 commit into maziyarpanahi:master from pardeep-singh:pardeep/issue317-cardiology-domain
Jun 21, 2026

Conversation

@pardeep-singh pardeep-singh commented Jun 21, 2026

Contributor

Implements #317 (OM-152): adds a cardiology domain to the zero-shot default label map plus cardiology keyword routing, so cardiology notes (ECG findings, ejection fraction, cardiac procedures/devices) stop falling back to generic clinical labels.

Per the discussion on the issue, this follows 1(a) (routing helper, get_model_suggestions behavior unchanged) and 2(a) (display labels consistent with existing domains; no new clinical canonical-label system).

Changes

  • openmed/zero_shot/data/label_maps/defaults.json — new cardiology domain using the existing letters-only display-label style: CardiacFinding, ECGFinding, EjectionFraction, CardiacProcedure, CardiacDevice, Anatomy.
  • openmed/core/model_registry.py
    • Extracted the keyword->category matching out of get_model_suggestions into a _match_categories(text) helper that returns (category, reason) pairs, so cardiology text routes to the Cardiology category independently of whether a Cardiology model is registered.
    • get_model_suggestions() user-facing behavior is unchanged: with no Cardiology model registered, cardiology text still falls back to general medical suggestions rather than surfacing unrelated results.
    • Added _CATEGORY_ENTITY_TYPES["Cardiology"] as forward registry metadata for future Cardiology models (none exists today).
  • tests/unit/ner/test_label_map_consistency.py — tests for available_domains() / get_default_labels("cardiology"), non-empty labels, no duplicates, a label-style-consistency check across all domains, _match_categories routing for the echocardiogram example, and that get_model_suggestions behavior is unchanged.
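The split between routing and suggestions described above can be sketched roughly as follows. This is a minimal illustration, not the code in openmed/core/model_registry.py: the keyword table, the registry contents, the model names, and the function names `match_categories` and `suggest_models` are all hypothetical stand-ins. The only behavior taken from the PR is that the helper returns (category, reason) pairs regardless of registered models, and that suggestions fall back to general ones when the matched category has no model.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword table (assumption); the real keywords are defined
# in openmed/core/model_registry.py and are not shown in this PR text.
_CATEGORY_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "Cardiology": ("echocardiogram", "ejection fraction", "pacemaker"),
    "Oncology": ("tumor", "chemotherapy", "metastasis"),
}

# Hypothetical registry (assumption): only categories with a registered
# model appear here. Cardiology is deliberately absent.
_REGISTERED_MODELS: Dict[str, List[str]] = {
    "Oncology": ["oncology-ner-example"],
}

# Hypothetical general-purpose fallback suggestion (assumption).
_GENERAL_FALLBACK: List[str] = ["general-medical-ner-example"]


def match_categories(text: str) -> List[Tuple[str, str]]:
    """Return (category, reason) pairs for every category whose keywords
    appear in the text, independent of which models are registered."""
    lowered = text.lower()
    matches: List[Tuple[str, str]] = []
    for category, keywords in _CATEGORY_KEYWORDS.items():
        for keyword in keywords:
            if keyword in lowered:
                matches.append((category, f"matched keyword '{keyword}'"))
                break  # one reason per category is enough for routing
    return matches


def suggest_models(text: str) -> List[str]:
    """Suggest models for matched categories; fall back to the general
    suggestions when no matched category has a registered model."""
    suggestions: List[str] = []
    for category, _reason in match_categories(text):
        suggestions.extend(_REGISTERED_MODELS.get(category, []))
    return suggestions or list(_GENERAL_FALLBACK)
```

In this sketch, the echocardiogram example routes to Cardiology through `match_categories`, while `suggest_models` still returns the general fallback because no Cardiology model is registered, which mirrors the unchanged user-facing behavior described in the bullets.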

openmed/core/labels.py is intentionally untouched — per the issue discussion, the strict CANONICAL_LABELS taxonomy there is the PII/PHI policy taxonomy, not a clinical zero-shot taxonomy, and no broad clinical canonical-label system is introduced in this issue.

Acceptance criteria

  • get_default_labels("cardiology") returns a non-empty label set and available_domains() includes cardiology.
  • Cardiology text (e.g. Echocardiogram shows reduced ejection fraction of 35%) routes to the Cardiology category via _match_categories.
  • The cardiology domain reuses the existing display-label style (asserted by a style-consistency test across all domains).
  • _CATEGORY_ENTITY_TYPES["Cardiology"] added as forward metadata; CATEGORIES still has no Cardiology key (no model exists yet).
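The label-map criteria above (non-empty labels, no duplicates, letters-only display style) could be checked along these lines. This is a sketch under assumptions: `DEFAULT_LABELS` is a hypothetical in-memory stand-in for defaults.json, the "generic" domain and its labels are invented for illustration, and `check_label_map` is not a function from the repository. Only the six cardiology labels come from the PR description.

```python
import re
from typing import Dict, List

# Hypothetical stand-in for defaults.json (assumption). The cardiology
# labels are the ones listed in this PR; the "generic" domain is made up.
DEFAULT_LABELS: Dict[str, List[str]] = {
    "generic": ["Disease", "Drug", "Anatomy"],
    "cardiology": [
        "CardiacFinding",
        "ECGFinding",
        "EjectionFraction",
        "CardiacProcedure",
        "CardiacDevice",
        "Anatomy",
    ],
}

# Letters-only display-label style: ASCII letters, no digits, spaces,
# or underscores.
LETTERS_ONLY = re.compile(r"[A-Za-z]+")


def check_label_map(label_map: Dict[str, List[str]]) -> List[str]:
    """Return a list of human-readable problems; empty means consistent."""
    problems: List[str] = []
    for domain, labels in label_map.items():
        if not labels:
            problems.append(f"{domain}: empty label list")
        if len(labels) != len(set(labels)):
            problems.append(f"{domain}: duplicate labels")
        for label in labels:
            if not LETTERS_ONLY.fullmatch(label):
                problems.append(f"{domain}: '{label}' is not letters-only")
    return problems
```

Running the style check over every domain rather than only cardiology matches the intent of the style-consistency criterion: a new domain is compared against the convention the existing domains already follow.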

Testing

  • New cardiology tests pass; full unit suite passes (.venv/bin/python -m pytest tests/ -q -> 1619 passed, 12 skipped).
  • The only 2 failures in the full run are pre-existing, network-gated integration tests (tests/integration/test_sentence_detection_real.py, which download dslim/bert-base-NER); they fail identically on master offline and are unrelated to this change.
  • ruff check and ruff format clean on the changed files.

Out of scope

  • Training a cardiology model (routing/labels only).
  • ECG signal processing (text only).
  • A clinical canonical-label system.

Closes #317

Adds a 'cardiology' domain to the zero-shot default label map and keyword
routing for cardiology text, closing the gap where cardiology notes fell
back to generic clinical labels.

- defaults.json: new 'cardiology' domain reusing the existing display-label
  style (CardiacFinding, ECGFinding, EjectionFraction, CardiacProcedure,
  CardiacDevice, Anatomy).
- model_registry: extract the keyword->category matching into a
  _match_categories(text) helper so cardiology text routes to the Cardiology
  category independently of whether a Cardiology model is registered;
  get_model_suggestions() user-facing behavior is unchanged.
- model_registry: add _CATEGORY_ENTITY_TYPES['Cardiology'] as forward
  metadata for future Cardiology models (no such model exists today).
- tests: cover available_domains/get_default_labels, label style and
  uniqueness, helper routing for the echocardiogram example, and that
  get_model_suggestions behavior is unchanged.

Closes maziyarpanahi#317
@maziyarpanahi

Owner

Hi @pardeep-singh, thanks for the careful implementation and for keeping get_model_suggestions() behavior unchanged. CI is green and the scope matches the decision in #317.

I’ll review this one first since #552 is stacked on it. I’ll focus on the _match_categories helper, label-map consistency tests, and making sure Cardiology stays registry metadata rather than implying a shipped model.

@maziyarpanahi maziyarpanahi self-requested a review June 21, 2026 15:25
@maziyarpanahi

Owner

Thank you @pardeep-singh. I reviewed this against #317 / OM-152 and the follow-up clarification on the issue.

The implementation matches the intended shape: cardiology is added to the zero-shot defaults, _match_categories() exposes Cardiology keyword routing independently of registered models, get_model_suggestions() still falls back to the existing general suggestions because no Cardiology model is registered, and _CATEGORY_ENTITY_TYPES["Cardiology"] is only forward registry metadata.

I did not need to change the branch. Local verification on the PR checkout:

  • PYTHONPATH=/tmp/openmed-pr-551.IO6dJS /Users/maziyar/Developer/openmed/.venv/bin/python -m pytest tests/unit/ner/test_label_map_consistency.py -q -> 54 passed
  • /Users/maziyar/Developer/openmed/.venv/bin/ruff check openmed/core/model_registry.py tests/unit/ner/test_label_map_consistency.py -> passed
  • /Users/maziyar/Developer/openmed/.venv/bin/ruff format --check openmed/core/model_registry.py tests/unit/ner/test_label_map_consistency.py -> passed

Hosted CI is green and the PR is mergeable. I also copied the labels from #317 onto the PR.

@maziyarpanahi maziyarpanahi added the labels feature (New capability), good first issue (Good for newcomers), help wanted (Extra attention is needed), P3 (Strategic), and roadmap-v2 (OpenMed V2 roadmap backlog) Jun 21, 2026
@pardeep-singh

Contributor Author

@maziyarpanahi are we good with this PR? If so, can we please merge it so that I can rebase the other one?

@maziyarpanahi maziyarpanahi left a comment

Owner

Approved. I did the final pass against #317 / OM-152 and the issue clarification.

This is ready to merge: the scope matches the agreed helper-based routing approach, labels are copied from the issue, hosted CI is green, and GitHub reports the branch as clean/mergeable.

Thanks again @pardeep-singh.

@maziyarpanahi maziyarpanahi merged commit 3920e5d into maziyarpanahi:master Jun 21, 2026
12 checks passed
@maziyarpanahi

Owner

Thanks @pardeep-singh, this is merged now.

Labels

feature (New capability), good first issue (Good for newcomers), help wanted (Extra attention is needed), P3 (Strategic), roadmap-v2 (OpenMed V2 roadmap backlog)

Development

Successfully merging this pull request may close these issues.

Add cardiology domain to NER catalog and zero-shot label maps

2 participants