Description
Summary
Clarify and consolidate pedigree representation to avoid duplication and divergence across Individuals.
Details
In Beacon v2, every Individual object may embed one or more full Pedigree objects. At first glance, this looks like a separate entity, taken from Phenopackets, embedded inside a Beacon entity while still trying to keep its own identity. So much so that the individuals /examples/ directory contains a separate example exclusively for a pedigree.
That alone would not be an issue if individuals and pedigrees were not linked many-to-many. When multiple members of the same family are present in a dataset (as probands), this leads to:
- Heavily duplicated pedigree payloads (one copy per individual), each containing the whole pedigree.
- Potential inconsistencies if copies are out of sync (e.g., having to keep `numSubjects` in sync at every record).
- Ambiguity for clients trying to reconcile family relationships across query results, especially if different `id`s were given to the same pedigree in different individuals.
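To make the out-of-sync case concrete, here is a minimal sketch of how a client or server could flag embedded pedigree copies that share an `id` but differ in content. The function name `find_divergent_pedigrees`, the individual ids, and the sample values are illustrative assumptions; only `pedigrees`, `id`, and `numSubjects` come from the schema discussed here.

```python
import json
from collections import defaultdict


def find_divergent_pedigrees(individuals):
    """Return the ids of pedigrees whose embedded copies differ across individuals."""
    copies = defaultdict(set)
    for individual in individuals:
        for pedigree in individual.get("pedigrees", []):
            # Serialize with sorted keys so that key order alone does not
            # count as a difference between two copies.
            canonical = json.dumps(pedigree, sort_keys=True)
            copies[pedigree["id"]].add(canonical)
    return sorted(pid for pid, variants in copies.items() if len(variants) > 1)


# Hypothetical records: two family members carry copies of the same
# pedigree, but numSubjects was updated in only one of them.
individuals = [
    {"id": "ind-1", "pedigrees": [{"id": "Pedigree123", "numSubjects": 3}]},
    {"id": "ind-2", "pedigrees": [{"id": "Pedigree123", "numSubjects": 4}]},
    {"id": "ind-3", "pedigrees": [{"id": "Pedigree456", "numSubjects": 2}]},
]

print(find_divergent_pedigrees(individuals))  # ['Pedigree123']
```

Note that this check only catches divergence between copies that share an `id`; it cannot detect the same family stored under two different ids, which is the harder ambiguity mentioned in the last bullet.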
The current schema does not provide guidance on how servers should keep those embedded pedigrees canonical, nor on preferred linking patterns when a single pedigree applies to several individuals. It is implicit in the documentation that the pedigree representation can be asymmetric, but that doesn't fully cover the issues listed here.
Suggestions
Add clarifying language to the spec:
- State explicitly whether duplicated pedigree objects with the same id must be byte-identical.
- Recommend a single-source-of-truth strategy for shared pedigrees.
Introduce an optional dereferenceable reference pattern:
- Separate pedigree as its own entity.
- Allow `Individual.pedigrees[*]` to contain a minimal stub such as `{ "id": "Pedigree123" }`.
- Encourage servers to expose a dedicated `/pedigrees/{id}` endpoint (or similar) that returns the full object.
Provide a conformance example demonstrating how two siblings can reference the same pedigree without payload duplication.
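As a rough illustration of what such an example could look like, the sketch below has two siblings carrying only a stub, with the full pedigree stored once and looked up by id. Everything here is an assumption made for illustration: `PEDIGREE_STORE` is an in-memory stand-in for a possible `/pedigrees/{id}` endpoint, and `get_pedigree`, `resolve_pedigrees`, and the sibling ids are hypothetical names, not part of the Beacon v2 spec.

```python
# Hypothetical single source of truth: each pedigree is stored once.
# In a real deployment this could sit behind a /pedigrees/{id} endpoint.
PEDIGREE_STORE = {
    "Pedigree123": {"id": "Pedigree123", "numSubjects": 4},
}

# Two siblings that reference the same pedigree through a minimal stub
# instead of embedding a full copy each.
siblings = [
    {"id": "sibling-A", "pedigrees": [{"id": "Pedigree123"}]},
    {"id": "sibling-B", "pedigrees": [{"id": "Pedigree123"}]},
]


def get_pedigree(pedigree_id):
    """Simulate GET /pedigrees/{id}; raises KeyError for unknown ids."""
    return PEDIGREE_STORE[pedigree_id]


def resolve_pedigrees(individual):
    """Return a copy of the individual with each stub replaced by the full pedigree."""
    resolved = dict(individual)
    resolved["pedigrees"] = [
        get_pedigree(stub["id"]) for stub in individual.get("pedigrees", [])
    ]
    return resolved


resolved = [resolve_pedigrees(sibling) for sibling in siblings]

# Both siblings now point at the same stored object, so numSubjects
# cannot drift between them.
print(resolved[0]["pedigrees"][0]["numSubjects"])  # 4
print(resolved[0]["pedigrees"][0] is resolved[1]["pedigrees"][0])  # True
```

With this pattern, updating the pedigree means changing one stored object, and clients that need the full family structure dereference the stub on demand.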