Standardize amount/level terminology: rename level→amount, add cross-synonyms #422
Copilot wants to merge 12 commits
Agent-Logs-Url: https://github.com/obophenotype/bio-attribute-ontology/sessions/d0c41147-aff3-4e0f-bc5d-8d2ec6a09b88 Co-authored-by: aleixpuigb <94959119+aleixpuigb@users.noreply.github.com>
- In all TSV pattern data files, rename defined_class_name values: 'level' → 'amount', 'levels' → 'amounts' at word boundaries
- For renamed 'level' terms: add original 'level' name as exact synonym
- For existing 'amount' terms: add 'level' version as exact synonym
- Affects ~12,971 level→amount renames + ~377 amount synonyms in entity_attribute_location.tsv, plus ~2,824 renames in attribute_ratio.tsv and smaller changes in other pattern files
- definitions.owl will be regenerated by CI (DOSDP workflow)
Agent-Logs-Url: https://github.com/obophenotype/bio-attribute-ontology/sessions/d0c41147-aff3-4e0f-bc5d-8d2ec6a09b88
Co-authored-by: aleixpuigb <94959119+aleixpuigb@users.noreply.github.com>
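The word-boundary rename described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the script the agent actually ran: the function names `rename_level` and `with_level_synonym` are hypothetical, and the sketch only covers the plain substitution (it does not reorder labels that begin with 'level of').

```python
import re

# Whole-word 'level' or 'levels' only; words such as 'multilevel' are left alone.
LEVEL_RE = re.compile(r"\blevel(s?)\b")


def rename_level(label: str) -> str:
    """Replace whole-word 'level'/'levels' with 'amount'/'amounts'."""
    return LEVEL_RE.sub(lambda m: "amount" + m.group(1), label)


def with_level_synonym(label: str) -> tuple[str, list[str]]:
    """Return the renamed label plus the original label as an exact synonym.

    The synonym list is empty when the label did not change.
    """
    new_label = rename_level(label)
    synonyms = [label] if new_label != label else []
    return new_label, synonyms


if __name__ == "__main__":
    print(with_level_synonym("blood glucose level"))
    print(with_level_synonym("multilevel structure"))
```

Preserving the plural suffix through the capture group keeps 'levels' → 'amounts' and 'level' → 'amount' in a single pass.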
@copilot Thank you for the change, but the 'amount' pattern is slightly different from 'level of'. Normally the label starts with 'level of', but when we use 'amount', it goes at the end of the term label. Here are some examples: carbon dioxide amount
The 'level of X in Y' naming pattern should convert to 'X amount [in Y]' (entity first, 'amount' at the end), not 'amount of X in Y'.
Fixed in:
- entity_attribute_location.tsv: 12,768 names
- attribute_location_during_activity.tsv: 2 names
- entity_response_quality_in_location.tsv: 1 name
- chemical_role_attribute_location.tsv: 1 name
Examples after fix:
- 'level of ceramide' → 'ceramide amount' (synonym: 'level of ceramide')
- 'level of chordin in blood serum' → 'chordin amount in blood serum'
- 'level of glucose in blood during fasting' → 'glucose amount in blood during fasting'
Agent-Logs-Url: https://github.com/obophenotype/bio-attribute-ontology/sessions/33c97092-13d3-456a-9f05-17e742a8a5da
Co-authored-by: aleixpuigb <94959119+aleixpuigb@users.noreply.github.com>
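The reordering rule in this commit message could look something like the sketch below. It is an assumption-laden illustration: the helper name `level_of_to_amount` is hypothetical, and the regex splits at the first ' in ', so an entity name that itself contains ' in ' would be split too early.

```python
import re

# 'level of <entity>' optionally followed by ' in <location ...>'.
# The entity group is lazy so the optional ' in ...' tail is captured separately.
LEVEL_OF_RE = re.compile(r"^level of (?P<entity>.+?)(?P<rest> in .+)?$")


def level_of_to_amount(label: str) -> tuple[str, list[str]]:
    """Convert 'level of X [in Y]' to 'X amount [in Y]'.

    Returns the new label and a list holding the original label as an
    exact synonym; labels that do not match are returned unchanged.
    """
    m = LEVEL_OF_RE.match(label)
    if m is None:
        return label, []
    new_label = f"{m.group('entity')} amount{m.group('rest') or ''}"
    return new_label, [label]


if __name__ == "__main__":
    for name in (
        "level of ceramide",
        "level of chordin in blood serum",
        "level of glucose in blood during fasting",
    ):
        print(level_of_to_amount(name))
```

The three inputs are the examples quoted in the commit message; everything after the first ' in ' (including 'during fasting') is carried over unchanged.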
Fixed in cd3e1c0.
@aleixpuigb There are some issues. For some terms, labels have not been added by the curator in the respective DOSDP TSV file and are auto-generated, so the changes requested by the agent in this PR have not been implemented for them. Example: for … For other terms like … Also, out of the 14 patterns, only 5 have 'synonym' columns, so while the labels have been updated, no 'level' synonym has been added. For example …
@copilot can you address @ar-ibrahim's comments? Note that for the auto-generated labels, we only want to add a synonym to the ones that use the term 'amount'.
1. Duplicate synonyms: Remove label from exact_synonyms when it duplicates the new label (86 rows in entity_attribute_location.tsv, e.g. OBA:2040177 'ceramide amount' was listed in its own synonyms)
2. Auto-generated labels: For 6 rows with empty defined_class_name but known entity_name + location_name + attribute_name=amount, set explicit label (e.g. 'insulin amount in blood') and add level synonym
3. Missing synonym support: Add exact_synonyms data_list_var to chemical_role_attribute.yaml and chemical_role_attribute_location.yaml patterns, add exact_synonyms column to both TSV files, and populate level synonyms for all rows (e.g. 'blood thyroid hormone level' for OBA:VT0003351 'blood thyroid hormone amount'; 'thyroid hormone level' for auto-generated OBA:1000972)
Agent-Logs-Url: https://github.com/obophenotype/bio-attribute-ontology/sessions/8946373c-0106-42da-b062-f4970fe1292a
Co-authored-by: aleixpuigb <94959119+aleixpuigb@users.noreply.github.com>
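Fix 1 (a label listed among its own synonyms) can be illustrated with a small helper. This is only a sketch: `drop_self_synonyms` is a hypothetical name, and the pipe character as the list separator inside the exact_synonyms cell is an assumption about how the TSV stores list values.

```python
def drop_self_synonyms(label: str, exact_synonyms: str, sep: str = "|") -> str:
    """Remove entries equal to the label from a separator-joined synonym cell.

    Empty entries and repeated synonyms are dropped as well; the order of
    the remaining synonyms is preserved.
    """
    kept: list[str] = []
    for syn in exact_synonyms.split(sep):
        syn = syn.strip()
        if syn and syn != label and syn not in kept:
            kept.append(syn)
    return sep.join(kept)


if __name__ == "__main__":
    # 'ceramide amount' listed in its own synonyms, as in the OBA:2040177 example.
    print(drop_self_synonyms("ceramide amount", "ceramide amount|level of ceramide"))
```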
Addressed in 558b76c with three fixes applied.
@copilot There are still many terms in entity_attribute.tsv that are missing the exact synonym 'level of', for example OBA:0002005 and OBA:0002009. It would be easier to look for terms that use the PATO:0000070 term (amount) and check whether they have a synonym with 'level of'. Check the other DOSDP files as well in case more are missing.
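The check suggested in this comment could be scripted along these lines. Treat it as a sketch under assumptions: the column names `defined_class` and `attribute`, the pipe separator inside exact_synonyms, and the sample rows with `EX:` identifiers are all made up for illustration and may not match the real pattern files.

```python
import csv
import io
import re

AMOUNT = "PATO:0000070"  # the 'amount' attribute named in the comment
LEVEL_WORD = re.compile(r"\blevels?\b")


def rows_missing_level_synonym(tsv_text: str, sep: str = "|") -> list[str]:
    """Return IDs of amount rows whose exact_synonyms contain no 'level' synonym."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    missing = []
    for row in reader:
        if row.get("attribute") != AMOUNT:
            continue  # only amount terms are expected to carry a level synonym
        synonyms = (row.get("exact_synonyms") or "").split(sep)
        if not any(LEVEL_WORD.search(s) for s in synonyms):
            missing.append(row["defined_class"])
    return missing


# Hypothetical sample data; the EX: identifiers are placeholders, not real terms.
SAMPLE = (
    "defined_class\tattribute\texact_synonyms\n"
    "EX:0000001\tPATO:0000070\texample entity level\n"
    "EX:0000002\tPATO:0000070\t\n"
    "EX:0000003\tEX:other\t\n"
)

if __name__ == "__main__":
    print(rows_missing_level_synonym(SAMPLE))
```

Running the same function over each pattern TSV in turn would give a per-file list of terms still lacking a level synonym.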
- entity_attribute.tsv: add '{entity_name} level' synonym to all 675
auto-generated rows (empty defined_class_name) where attribute is
PATO:0000070 (amount) and entity_name is known in TSV
(e.g. OBA:0002005 'external genitalia amount' gets synonym
'external genitalia level'; OBA:0002009 'renal collecting system
amount' gets 'renal collecting system level')
- process_attribute_location.yaml: add exact_synonyms data_list_var
support (mirrors pattern used in other DOSDP patterns)
- process_attribute_location.tsv: add exact_synonyms column;
OBA:2050069 'serum lipase activity amount' gets synonym
'serum lipase activity level'
Agent-Logs-Url: https://github.com/obophenotype/bio-attribute-ontology/sessions/dbefd126-990d-42c1-bb50-ef1a0771238e
Co-authored-by: aleixpuigb <94959119+aleixpuigb@users.noreply.github.com>
Fixed in b876496.
All other DOSDP files were checked; no further explicit-name terms with "amount" in the label are missing a level synonym.
@copilot There are still 995 terms in the entity_attribute_location.tsv file with the attribute PATO:0000070 (amount) that do not have 'level' synonyms. These terms do have entity and location entries in the TSV file. Example: OBA:2051966.
…e_location.tsv
All 995 rows with empty defined_class_name/entity_name but known location_name
('blood') and attribute PATO:0000070 now have a 'level of {PR_label} in blood'
synonym, resolved from merged_import.owl.
Example:
OBA:2051966 (PR:P98160 / blood)
→ synonym: 'level of basement membrane-specific heparan sulfate
proteoglycan core protein (human) in blood'
Agent-Logs-Url: https://github.com/obophenotype/bio-attribute-ontology/sessions/9abb2232-24fc-486e-85c5-015d0af20310
Co-authored-by: ar-ibrahim <97462656+ar-ibrahim@users.noreply.github.com>
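The synonym construction described in this commit message amounts to a label lookup followed by string formatting. The sketch below assumes the labels have already been extracted into a dictionary; how they are read out of merged_import.owl is not shown in this conversation, so `labels` and the function name `level_synonym` are hypothetical stand-ins.

```python
from typing import Optional


def level_synonym(entity_id: str, location_name: str, labels: dict[str, str]) -> Optional[str]:
    """Build a 'level of {label} in {location}' synonym for one row.

    Returns None when the entity label is unknown, so the caller can
    report the row instead of writing an incomplete synonym.
    """
    label = labels.get(entity_id)
    if label is None:
        return None
    return f"level of {label} in {location_name}"


# Label taken from the OBA:2051966 example in the commit message.
labels = {
    "PR:P98160": "basement membrane-specific heparan sulfate "
                 "proteoglycan core protein (human)",
}

if __name__ == "__main__":
    print(level_synonym("PR:P98160", "blood", labels))
```

Returning None for unresolved identifiers makes it easy to count how many rows could not be given a synonym.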
Fixed in 370b4eb. All 995 rows resolved: the PR protein labels were looked up from merged_import.owl. Example: …
No auto-generated amount rows in …
Quantitative OBA terms were inconsistently labeled with either 'amount' or 'level', causing GWAS matching failures and user confusion. Resolution: 'amount' becomes the canonical label; 'level' moves to exact synonym.

Rules applied

- 'X level [in Y]' in label → renamed to 'X amount [in Y]'; original 'level' form added as exact_synonym
- 'level of X [in Y]' in label → renamed to 'X amount [in Y]'; original 'level of X [in Y]' form added as exact_synonym
- 'amount' in label → 'level' variant added as exact_synonym
- Auto-generated labels (empty defined_class_name) with attribute = PATO:0000070 and known entity/location/role names: 'level' synonym added, resolving entity labels from merged_import.owl where needed

Handles singular (level/amount) and plural (levels/amounts) at word boundaries.

Scale of changes

- entity_attribute_location.tsv: 'level of {PR_label} in blood' synonym (labels resolved from merged_import.owl)
- attribute_ratio.tsv
- entity_attribute.tsv: '{entity_name} level' synonym
- attribute_location_during_activity.tsv
- entity_response_quality_in_location.tsv
- chemical_role_attribute.tsv (exact_synonyms column)
- chemical_role_attribute_location.tsv (exact_synonyms column)
- process_attribute_location.tsv (exact_synonyms column)

DOSDP patterns updated to support exact_synonyms data list var: chemical_role_attribute.yaml, chemical_role_attribute_location.yaml, process_attribute_location.yaml.

Examples

definitions.owl will be regenerated automatically by the DOSDP CI workflow (dosdp.yml) on merge.