Completed PRs/Issue
PR Title: PR #3 – Feature/extend_map_phenotype (Branch: feature/extend_map_phenotype → develop)
PR Type: Enhancement
Status: Completed
Background
Baseline phenotype mapping only extracted raw HPO IDs. This PR extended parsing to capture term labels as well, and to validate IDs and labels against the ontology using hpotk.
Scope
Outline
Enhance phenotype mapping to support label + ID parsing and ontology validation.
Included/Required
- Extended regex to parse both HPO term labels and numeric IDs.
- Integrated hpotk ontology lookups.
- Added warnings for obsolete IDs, label mismatches, missing ontology terms.
- Added GENO ontology zygosity codes to Genotype.
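For illustration, label + ID parsing of a spreadsheet cell could be sketched as below. This is a minimal sketch, not the regex used in the PR: the pattern, the ParsedTerm type, and the parse_hpo_cell helper are hypothetical names, and the accepted cell formats ("Seizure (HP:0001250)", "HP:0001250", "hp_0001250") are assumptions.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern (not the PR's actual regex): an optional free-text
# label, followed by an HPO ID written as HP:NNNNNNN, HP_NNNNNNN or
# HPNNNNNNN, optionally wrapped in parentheses.
HPO_CELL = re.compile(
    r"^\s*(?P<label>.*?)\s*\(?\s*HP[:_]?(?P<digits>\d{7})\s*\)?\s*$",
    re.IGNORECASE,
)


class ParsedTerm(NamedTuple):
    label: Optional[str]  # None when the cell holds only an ID
    curie: str            # normalized to the "HP:NNNNNNN" form


def parse_hpo_cell(cell: str) -> Optional[ParsedTerm]:
    """Return the label and normalized ID, or None if the cell is unparsable."""
    match = HPO_CELL.match(cell)
    if match is None:
        return None
    # Drop separator characters left between the label and the ID.
    label = match.group("label").strip(" -:;,") or None
    return ParsedTerm(label=label, curie=f"HP:{match.group('digits')}")
```

Cells without a label still parse (the label is None), which is one way the backwards compatibility for older sheets could be kept.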
Optional
- Graceful handling of protobuf fields with try/except.
- Backwards compatibility for sheets without labels.
Not included
- No changes to preprocessing/audit pipeline.
- No CLI options added.
Technical Plan / Implementation Details
- Updated mapper._map_phenotype to use hpotk.TermId validation.
- Batch validation via ObsoleteTermIdsValidator, PhenotypicAbnormalityValidator, and AnnotationPropagationValidator.
- Introduced a Genotype.zygosity_code property that maps zygosity to a GA4GH GENO CURIE.
- Extended loader to normalize "HPO Term" columns.
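As a rough illustration of the zygosity mapping, the property could look like the sketch below. The Genotype class here is a simplified stand-in, not the project's model: the zygosity field name, the accepted spellings, and returning None for unknown values are all assumptions. The GENO identifiers are the ones commonly used for allelic state in GA4GH Phenopackets.

```python
from dataclasses import dataclass
from typing import Optional

# GENO ontology terms for allelic state (zygosity).
GENO_ZYGOSITY = {
    "heterozygous": "GENO:0000135",
    "homozygous": "GENO:0000136",
    "hemizygous": "GENO:0000134",
}


@dataclass
class Genotype:
    """Simplified stand-in for the project's Genotype model (assumed shape)."""

    zygosity: str

    @property
    def zygosity_code(self) -> Optional[str]:
        # Normalize case and whitespace before the lookup; unknown
        # values yield None instead of raising.
        return GENO_ZYGOSITY.get(self.zygosity.strip().lower())
```

Returning None for unrecognized values leaves it to the caller to decide whether to warn or skip the field.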
Validation & Testing
- Tested against Python_headers_phenocopy_transformation.xlsx and Sydney_Python_transformation.xlsx.
- Verified ontology lookups and warning generation.
- All existing test suites passed without regressions.
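The kind of warnings checked here can be sketched without loading a real ontology. The example below is an assumption-laden stand-in: it replaces the hpotk lookups with two small dictionaries, the check_term helper is hypothetical, and the obsolete ID HP:9999998 is made up for the example. The actual implementation relies on hpotk and its validators.

```python
from typing import List, Optional

# Toy stand-in for ontology lookups; the real code queries hpotk.
TOY_LABELS = {
    "HP:0001250": "Seizure",
    "HP:0000118": "Phenotypic abnormality",
}
# Made-up obsolete ID mapped to its replacement (illustration only).
TOY_OBSOLETE = {"HP:9999998": "HP:0001250"}


def check_term(curie: str, label: Optional[str] = None) -> List[str]:
    """Return human-readable warnings for one parsed term (empty if clean)."""
    if curie in TOY_OBSOLETE:
        return [f"{curie} is obsolete; consider {TOY_OBSOLETE[curie]}"]
    expected = TOY_LABELS.get(curie)
    if expected is None:
        return [f"{curie} not found in ontology"]
    # Compare labels case-insensitively; a missing label is not an error.
    if label is not None and label.strip().lower() != expected.lower():
        return [f"label mismatch for {curie}: got {label!r}, expected {expected!r}"]
    return []
```

A test can then assert on the returned list, for example that check_term("HP:0001250", "Seizures") yields exactly one label-mismatch warning.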
Milestones
Outcome
- Phenotype mapping is more robust and ontology-aware.
- CLI now warns about obsolete IDs, label mismatches, and unparsable cells.