feat: add cardiology domain to NER catalog and zero-shot label maps #551
Adds a 'cardiology' domain to the zero-shot default label map and keyword routing for cardiology text, closing the gap where cardiology notes fell back to generic clinical labels.
- defaults.json: new 'cardiology' domain reusing the existing display-label style (CardiacFinding, ECGFinding, EjectionFraction, CardiacProcedure, CardiacDevice, Anatomy).
- model_registry: extract the keyword->category matching into a _match_categories(text) helper so cardiology text routes to the Cardiology category independently of whether a Cardiology model is registered; get_model_suggestions() user-facing behavior is unchanged.
- model_registry: add _CATEGORY_ENTITY_TYPES['Cardiology'] as forward metadata for future Cardiology models (no such model exists today).
- tests: cover available_domains/get_default_labels, label style and uniqueness, helper routing for the echocardiogram example, and that get_model_suggestions behavior is unchanged.

Closes maziyarpanahi#317
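The helper-based routing described above can be sketched roughly as follows. This is a minimal illustration, not the code in `model_registry.py`: the keyword table, the model names, and the function names `match_categories` and `suggest_models` are all hypothetical stand-ins, since the real keyword list and registry contents are not shown on this page.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword table; the real keywords in model_registry.py are not shown here.
_KEYWORDS: Dict[str, Tuple[str, ...]] = {
    "Cardiology": ("echocardiogram", "ejection fraction", "ecg"),
    "Oncology": ("tumor", "carcinoma"),
}

# Hypothetical registry of models per category. There is deliberately no
# "Cardiology" entry, mirroring the PR's statement that no such model exists yet.
_MODELS: Dict[str, List[str]] = {
    "Oncology": ["oncology-model"],
    "General": ["general-medical-model"],
}


def match_categories(text: str) -> List[Tuple[str, str]]:
    """Return (category, reason) pairs for every category whose keywords appear in text."""
    lowered = text.lower()
    pairs: List[Tuple[str, str]] = []
    for category, keywords in _KEYWORDS.items():
        for keyword in keywords:
            if keyword in lowered:
                pairs.append((category, f"matched keyword '{keyword}'"))
                break  # one reason per category is enough for this sketch
    return pairs


def suggest_models(text: str) -> List[str]:
    """Suggest models for matched categories, falling back to general models."""
    suggestions: List[str] = []
    for category, _reason in match_categories(text):
        # A matched category with no registered model contributes nothing.
        suggestions.extend(_MODELS.get(category, []))
    return suggestions or list(_MODELS["General"])
```

Separating matching from suggestion lets the text route to a category even when that category has no models, while callers of the suggestion function still receive the general fallback.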
cda34c5 to 7aa8d75
|
Hi @pardeep-singh, thanks for the careful implementation and for keeping I’ll review this one first since #552 is stacked on it. I’ll focus on the |
|
Thank you @pardeep-singh. I reviewed this against #317 / OM-152 and the follow-up clarification on the issue. The implementation matches the intended shape: I did not need to change the branch. Local verification on the PR checkout:
Hosted CI is green and the PR is mergeable. I also copied the labels from #317 onto the PR. |
|
@maziyarpanahi are we good with this PR? If so, can we please merge this so that I can rebase the other one? |
maziyarpanahi
left a comment
Approved. I did the final pass against #317 / OM-152 and the issue clarification.
This is ready to merge: the scope matches the agreed helper-based routing approach, labels are copied from the issue, hosted CI is green, and GitHub reports the branch as clean/mergeable.
Thanks again @pardeep-singh.
|
Thanks @pardeep-singh this is merged now |
Implements #317 (OM-152): adds a `cardiology` domain to the zero-shot default label map plus cardiology keyword routing, so cardiology notes (ECG findings, ejection fraction, cardiac procedures/devices) stop falling back to generic clinical labels.

Per the discussion on the issue, this follows 1(a) (routing helper, `get_model_suggestions` behavior unchanged) and 2(a) (display labels consistent with existing domains; no new clinical canonical-label system).

Changes
- `openmed/zero_shot/data/label_maps/defaults.json`: new `cardiology` domain using the existing letters-only display-label style: `CardiacFinding`, `ECGFinding`, `EjectionFraction`, `CardiacProcedure`, `CardiacDevice`, `Anatomy`.
- `openmed/core/model_registry.py`:
  - Extract the keyword->category matching from `get_model_suggestions` into a `_match_categories(text)` helper that returns `(category, reason)` pairs, so cardiology text routes to the `Cardiology` category independently of whether a Cardiology model is registered.
  - `get_model_suggestions()` user-facing behavior is unchanged: with no Cardiology model registered, cardiology text still falls back to general medical suggestions rather than surfacing unrelated results.
  - Add `_CATEGORY_ENTITY_TYPES["Cardiology"]` as forward registry metadata for future Cardiology models (none exists today).
- `tests/unit/ner/test_label_map_consistency.py`: tests for `available_domains()` / `get_default_labels("cardiology")`, non-empty labels, no duplicates, a label-style-consistency check across all domains, `_match_categories` routing for the echocardiogram example, and that `get_model_suggestions` behavior is unchanged.
- `openmed/core/labels.py` is intentionally untouched: per the issue discussion, the strict `CANONICAL_LABELS` taxonomy there is the PII/PHI policy taxonomy, not a clinical zero-shot taxonomy, and no broad clinical canonical-label system is introduced in this issue.

Acceptance criteria
- `get_default_labels("cardiology")` returns a non-empty label set and `available_domains()` includes `cardiology`.
- The echocardiogram example (`Echocardiogram shows reduced ejection fraction of 35%`) routes to the `Cardiology` category via `_match_categories`.
- `_CATEGORY_ENTITY_TYPES["Cardiology"]` added as forward metadata; `CATEGORIES` still has no `Cardiology` key (no model exists yet).

Testing
- `.venv/bin/python -m pytest tests/ -q` -> 1619 passed, 12 skipped.
- Failing tests need network access (`tests/integration/test_sentence_detection_real.py`, which download `dslim/bert-base-NER`); they fail identically on `master` offline and are unrelated to this change.
- `ruff check` and `ruff format` clean on the changed files.

Out of scope
Closes #317
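
The label checks listed under Changes (non-empty labels, no duplicates, letters-only style) could look something like the sketch below. It is an illustration under assumptions: the label map is modelled as a plain dict from domain name to a list of labels, which may not match the actual layout of `defaults.json`, and `check_domain` is a hypothetical helper, not a function in the repository. Only the six cardiology labels come from the PR description.

```python
import re
from typing import Dict, List

# Letters-only display-label style, as described in the PR (e.g. "ECGFinding").
LABEL_STYLE = re.compile(r"[A-Za-z]+")

# Assumed shape: domain name -> list of display labels. The labels themselves
# are the ones listed in the PR description.
DEFAULTS: Dict[str, List[str]] = {
    "cardiology": [
        "CardiacFinding",
        "ECGFinding",
        "EjectionFraction",
        "CardiacProcedure",
        "CardiacDevice",
        "Anatomy",
    ],
}


def check_domain(label_map: Dict[str, List[str]], domain: str) -> List[str]:
    """Return a list of human-readable problems for one domain (empty if none)."""
    labels = label_map.get(domain, [])
    if not labels:
        return [f"{domain}: no labels"]
    problems: List[str] = []
    if len(set(labels)) != len(labels):
        problems.append(f"{domain}: duplicate labels")
    for label in labels:
        # fullmatch requires the whole label to consist of ASCII letters.
        if not LABEL_STYLE.fullmatch(label):
            problems.append(f"{domain}: label '{label}' is not letters-only")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every style violation across all domains in a single test run.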