Description
Problem
We are seeing ontology-slot misplacements pass just validate / just validate-terms, e.g.:
- UBERON:0002107 (liver) placed under biological_processes
- GO Molecular Function terms (e.g. GO:0003824, GO:0022857) placed in biological_processes
These should fail curation QC, but currently pass in several PR branches.
Repro
Example from CPS1 branch/PR #407:
- File: kb/disorders/Carbamoyl_Phosphate_Synthetase_I_Deficiency.yaml
- Node "Impaired mitochondrial ureagenesis" had UBERON:0002107 under biological_processes.
Running:
just validate kb/disorders/Carbamoyl_Phosphate_Synthetase_I_Deficiency.yaml
just validate-terms kb/disorders/Carbamoyl_Phosphate_Synthetase_I_Deficiency.yaml
Both commands passed before the file was fixed manually.
Root Cause
In src/dismech/schema/dismech.yaml, descriptor classes lack term binding constraints:
- BiologicalProcessDescriptor.slot_usage.term has no bindings to BiologicalProcessTerm.
- AnatomicalEntityDescriptor.slot_usage.term has no bindings to AnatomicalEntityTerm.
So the validator checks term existence/label but not ontology branch constraints (reachable_from GO:0008150, reachable_from UBERON:0001062) for these descriptor usages.
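To illustrate the gap, here is a minimal Python sketch of the difference between an existence check and a branch check. It is not the project's validator: the PARENTS table is a toy is_a graph with intermediate ancestors collapsed, and the helper names (term_exists, reachable_from, check_slot) are hypothetical.

```python
# Toy is_a edges (child -> parents). Intermediate ancestors are collapsed;
# this is illustrative data, not a real ontology load.
PARENTS = {
    "GO:0000050": ["GO:0008150"],          # urea cycle -> biological_process
    "GO:0003824": ["GO:0003674"],          # catalytic activity -> molecular_function
    "GO:0022857": ["GO:0003674"],          # transmembrane transporter activity -> molecular_function
    "UBERON:0002107": ["UBERON:0001062"],  # liver -> anatomical entity
    "GO:0008150": [],
    "GO:0003674": [],
    "UBERON:0001062": [],
}


def term_exists(term_id):
    """Existence check only: roughly what passes today."""
    return term_id in PARENTS


def reachable_from(term_id, root):
    """Branch check: walk is_a edges upward and look for the root.

    In this sketch the root counts as reachable from itself.
    """
    stack, seen = [term_id], set()
    while stack:
        current = stack.pop()
        if current == root:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(PARENTS.get(current, []))
    return False


def check_slot(term_id, root):
    """Return a list of error strings for one term placed in a slot bound to root."""
    if not term_exists(term_id):
        return [f"{term_id}: unknown term"]
    if not reachable_from(term_id, root):
        return [f"{term_id}: not reachable from {root}"]
    return []


if __name__ == "__main__":
    for term in ("GO:0000050", "GO:0003824", "UBERON:0002107"):
        print(term, check_slot(term, "GO:0008150"))
```

With only term_exists, all three terms pass. The branch check is what rejects GO:0003824 and UBERON:0002107 in a slot rooted at GO:0008150.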
Proposed Fix
Add explicit bindings in descriptor slot usages, analogous to CellTypeDescriptor and GeneDescriptor:
- BiologicalProcessDescriptor.term -> range: BiologicalProcessTerm
- AnatomicalEntityDescriptor.term -> range: AnatomicalEntityTerm
- (Optionally also tighten other descriptor classes where intended.)
Then add/adjust a regression test asserting that:
- GO MF term in biological_processes fails
- UBERON term in biological_processes fails
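The regression cases above could be shaped roughly as follows. This is a sketch under stated assumptions: slot_violations is a hypothetical stand-in for the real validator entry point, the node dictionary layout (name, biological_processes, term.id) is assumed rather than taken from the schema, and ANCESTORS is toy data. A real test would load a fixture YAML file and call the project's validator instead.

```python
# Hypothetical mapping from descriptor slot to the ontology root it should accept.
SLOT_ROOTS = {"biological_processes": "GO:0008150"}

# Toy ancestor sets (self included); not real ontology data.
ANCESTORS = {
    "GO:0000050": {"GO:0000050", "GO:0008150"},              # urea cycle (BP)
    "GO:0003824": {"GO:0003824", "GO:0003674"},              # catalytic activity (MF)
    "UBERON:0002107": {"UBERON:0002107", "UBERON:0001062"},  # liver
}


def slot_violations(node):
    """Stand-in validator: list (slot, term_id) pairs outside the slot's branch."""
    violations = []
    for slot, root in SLOT_ROOTS.items():
        for descriptor in node.get(slot, []):
            term_id = descriptor["term"]["id"]
            if root not in ANCESTORS.get(term_id, set()):
                violations.append((slot, term_id))
    return violations


def make_node(term_id):
    """Build a minimal node with one term in biological_processes (assumed layout)."""
    return {
        "name": "Impaired mitochondrial ureagenesis",
        "biological_processes": [{"term": {"id": term_id}}],
    }


def test_go_mf_term_in_biological_processes_fails():
    assert slot_violations(make_node("GO:0003824")) == [
        ("biological_processes", "GO:0003824")
    ]


def test_uberon_term_in_biological_processes_fails():
    assert slot_violations(make_node("UBERON:0002107")) == [
        ("biological_processes", "UBERON:0002107")
    ]


def test_go_bp_term_in_biological_processes_passes():
    # Positive control so the test cannot pass by rejecting everything.
    assert slot_violations(make_node("GO:0000050")) == []
```

The positive control matters here: without it, a validator that rejects every term would satisfy both failure cases.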
Context
Raised while curating metabolic PRs where these were repeatedly flagged by reviewer but not by validator. This issue captures the validator-side gap so these become machine-caught.
Reported by Codex.