Feature/extend_map_phenotype #22

@VarenyaJ

Completed PRs/Issue

PR Title: PR #3 – Feature/extend_map_phenotype (Branch: feature/extend_map_phenotype → develop)
PR Type: Enhancement
Status: Completed

Background

The baseline phenotype mapping extracted only raw HPO IDs. This PR extended parsing to capture term labels alongside the IDs and to validate both against the HPO ontology using hpotk.


Scope

Outline

Enhance phenotype mapping to support label + ID parsing and ontology validation.

Included/Required

  • Extended regex to parse both HPO term labels and numeric IDs.
  • Integrated hpotk ontology lookups.
  • Added warnings for obsolete IDs, label mismatches, and terms missing from the ontology.
  • Added GENO ontology zygosity codes to Genotype.
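As an illustration of the label + ID parsing described above, here is a minimal sketch. The cell layouts it accepts ("Seizure (HP:0001250)", "HP:0001250 Seizure", a bare "HP:0001250", or a bare 7-digit number) are assumptions, and the names HPO_ID_RE, ParsedTerm, and parse_hpo_cell are hypothetical; the PR's actual regex may differ.

```python
import re
from typing import NamedTuple, Optional

# HPO IDs are "HP:" followed by seven digits. Accepting a bare seven-digit
# number as well is an assumption made for sheets that omit the prefix.
HPO_ID_RE = re.compile(r"(?<!\d)(?:HP:)?(\d{7})(?!\d)")


class ParsedTerm(NamedTuple):
    term_id: str           # normalized CURIE, e.g. "HP:0001250"
    label: Optional[str]   # None when the cell holds only an ID


def parse_hpo_cell(cell: str) -> Optional[ParsedTerm]:
    """Return the first HPO ID in the cell plus any label text around it."""
    match = HPO_ID_RE.search(cell)
    if match is None:
        return None  # unparsable cell; the caller decides whether to warn
    term_id = f"HP:{match.group(1)}"
    # Whatever is left after removing the ID and wrapper punctuation
    # is treated as the label.
    rest = cell[:match.start()] + " " + cell[match.end():]
    label = rest.strip(" \t()[]-:;,")
    return ParsedTerm(term_id, label or None)
```

Returning None for the label keeps ID-only sheets working, which matches the backwards-compatibility goal for sheets without labels.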

Optional

  • Graceful handling of protobuf fields with try/except.
  • Backwards compatibility for sheets without labels.

Not included

  • No changes to preprocessing/audit pipeline.
  • No CLI options added.

Technical Plan / Implementation Details

  • Updated mapper._map_phenotype to use hpotk.TermId validation.
  • Batch validation via ObsoleteTermIdsValidator, PhenotypicAbnormalityValidator, and AnnotationPropagationValidator.
  • Introduced the Genotype.zygosity_code property, which maps zygosity to a GENO CURIE for GA4GH output.
  • Extended loader to normalize "HPO Term" columns.
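The zygosity_code property from the plan above might look roughly like the sketch below. The Genotype fields shown (gene_symbol, zygosity) and the lookup table name are assumptions for illustration; only the property name comes from this PR. The three GENO CURIEs are the standard codes for heterozygous, homozygous, and hemizygous.

```python
from dataclasses import dataclass
from typing import Optional

# GENO ontology CURIEs for common zygosity values.
_ZYGOSITY_TO_GENO = {
    "heterozygous": "GENO:0000135",
    "homozygous": "GENO:0000136",
    "hemizygous": "GENO:0000134",
}


@dataclass(frozen=True)
class Genotype:
    gene_symbol: str  # hypothetical field, not confirmed by the PR
    zygosity: str     # free-text value as read from the sheet (assumed)

    @property
    def zygosity_code(self) -> Optional[str]:
        """Return the GENO CURIE for this zygosity, or None if unrecognized."""
        return _ZYGOSITY_TO_GENO.get(self.zygosity.strip().lower())
```

Returning None for unrecognized values, rather than raising, leaves it to the caller to decide whether an unknown zygosity is a warning or an error.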

Validation & Testing

  • Tested against Python_headers_phenocopy_transformation.xlsx and Sydney_Python_transformation.xlsx.
  • Verified ontology lookups and warning generation.
  • All existing test suites passed without regressions.

Milestones

  • Extend HPO parsing with labels.
  • Integrate ontology validation.
  • Add GENO zygosity code mapping.
  • Confirm tests pass with new sample files.

Outcome

  • Phenotype mapping is more robust and ontology-aware.
  • CLI now warns about obsolete IDs, label mismatches, and unparsable cells.
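To make the warning categories above concrete, the sketch below checks a parsed term against a plain dictionary that stands in for the ontology. This is a deliberate simplification: the PR uses hpotk for lookups, whereas TOY_ONTOLOGY, TermRecord, and check_term are hypothetical names, and the obsolete entry is made-up data.

```python
from typing import Dict, List, NamedTuple, Optional


class TermRecord(NamedTuple):
    name: str
    is_obsolete: bool


# Toy stand-in for an ontology lookup. HP:0001250 is the real "Seizure"
# term; HP:9999999 is a made-up obsolete entry for illustration only.
TOY_ONTOLOGY: Dict[str, TermRecord] = {
    "HP:0001250": TermRecord("Seizure", False),
    "HP:9999999": TermRecord("Retired example term", True),
}


def check_term(
    term_id: str, label: Optional[str], ontology: Dict[str, TermRecord]
) -> List[str]:
    """Collect warning messages for one term; an empty list means no issues."""
    record = ontology.get(term_id)
    if record is None:
        return [f"{term_id}: not found in ontology"]
    warnings: List[str] = []
    if record.is_obsolete:
        warnings.append(f"{term_id}: obsolete term")
    # Compare labels case-insensitively; skip the check for ID-only cells.
    if label is not None and label.strip().lower() != record.name.lower():
        warnings.append(
            f"{term_id}: label '{label}' does not match '{record.name}'"
        )
    return warnings
```

Collecting messages in a list instead of raising lets the mapper keep processing the rest of the sheet and report every problem in one pass.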
