@glamberson
Collaborator

Summary

This PR introduces the GEDCOM Evidence Extension v2.0, a complete redesign to comply with GEDCOM 7 constraints while providing comprehensive evidence handling capabilities.

Background

The previous version (v0.1) in PR #178 was closed due to GEDCOM rule violations including polymorphic pointers and circular dependencies. This v2.0 redesign addresses all those issues using a dual-pattern approach.

Design Approach

The extension uses two complementary patterns:

Pattern 1: Shadow Records

  • Evidence exists as independent records (_EVID)
  • Individuals/families reference evidence through _EVREF
  • Supports "floating evidence" for uncertain identities

Pattern 2: Event Containers

  • Evidence directly tied to an individual through _EVEN_EVID
  • For cases where identity is certain

Key Structures

  1. _EVID - Evidence container record
  2. _EVREF - Reference from individual/family to evidence
  3. _EVEN_EVID - Evidence event (attached to individual)
  4. _FIND - Specific findings extracted from evidence
  5. _SUBJ_INDI, _SUBJ_FAM, _SUBJ_SOUR - Subject references (avoiding polymorphism)
  6. _CONF - Confidence levels (High, Medium, Low, Hypothesis)
  7. _USED - How evidence was used
  8. _ANAL - Analysis notes
  9. _DTYPE - Document type
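As a rough illustration of why separate _SUBJ_INDI, _SUBJ_FAM, and _SUBJ_SOUR structures avoid polymorphic pointers: each tag can be checked against exactly one target record type. The sketch below is not part of the extension or any registry tooling; the `SUBJECT_TARGETS` table and `check_subject_pointer` helper are hypothetical names used only for illustration.

```python
# Hypothetical table: each subject tag is assumed to accept exactly one record type.
SUBJECT_TARGETS = {
    "_SUBJ_INDI": "INDI",
    "_SUBJ_FAM": "FAM",
    "_SUBJ_SOUR": "SOUR",
}


def check_subject_pointer(tag, xref, record_types):
    """Return True if the pointer under `tag` targets the one record type that tag allows.

    `record_types` maps cross-reference identifiers (e.g. "@I1@") to record tags.
    """
    expected = SUBJECT_TARGETS[tag]
    return record_types.get(xref) == expected


# Example data (invented for this sketch).
record_types = {"@I1@": "INDI", "@F1@": "FAM", "@S1@": "SOUR"}
ok = check_subject_pointer("_SUBJ_INDI", "@I1@", record_types)
bad = check_subject_pointer("_SUBJ_FAM", "@I1@", record_types)
```

Because the expected type follows from the tag alone, a validator never has to inspect the target to decide which rules apply.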

Example Usage

0 @E1@ _EVID
1 _DTYPE Census
1 DATE 7 JUN 1850
1 SOUR @S1@
2 PAGE Line 15
1 _FIND John Smith, age 42
2 TYPE Name and Age
1 _SUBJ_INDI @I1@
2 ROLE Possible match

0 @I1@ INDI
1 NAME John /Smith/
1 _EVREF @E1@
2 _CONF High
2 _USED Age matches other records
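
The example above can be read back with a few lines of Python. This is a minimal sketch, assuming simple single-line GEDCOM structures (no CONT handling, no escaping); `parse_records` and `evidence_refs` are hypothetical helper names, not part of any GEDCOM library.

```python
import re

# level, optional cross-reference identifier, tag, optional payload
LINE_RE = re.compile(r"^(\d+) (?:(@[^@]+@) )?(\S+)(?: (.*))?$")

SAMPLE = """\
0 @E1@ _EVID
1 _DTYPE Census
1 DATE 7 JUN 1850
1 SOUR @S1@
2 PAGE Line 15
1 _FIND John Smith, age 42
2 TYPE Name and Age
1 _SUBJ_INDI @I1@
2 ROLE Possible match

0 @I1@ INDI
1 NAME John /Smith/
1 _EVREF @E1@
2 _CONF High
2 _USED Age matches other records
"""


def parse_records(text):
    """Group lines into records keyed by cross-reference identifier."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE_RE.match(line)
        if m is None:
            raise ValueError(f"unparseable line: {raw!r}")
        level, xref, tag, value = int(m.group(1)), m.group(2), m.group(3), m.group(4) or ""
        if level == 0:
            current = {"tag": tag, "subs": []}
            records[xref] = current
        else:
            current["subs"].append((level, tag, value))
    return records


def evidence_refs(records, xref):
    """Collect each _EVREF pointer under a record, with its _CONF value if present."""
    refs = []
    in_evref = False
    for level, tag, value in records[xref]["subs"]:
        if level == 1:
            in_evref = tag == "_EVREF"
            if in_evref:
                refs.append({"target": value, "conf": None})
        elif in_evref and level == 2 and tag == "_CONF":
            refs[-1]["conf"] = value
    return refs


records = parse_records(SAMPLE)
refs = evidence_refs(records, "@I1@")
```

Here `refs` holds one entry pointing at `@E1@`, and `records["@E1@"]` is the shadow `_EVID` record it refers to.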

Compliance

  • All polymorphic pointers eliminated
  • Proper pointer payload formats used
  • All files validated against registry schema
  • Follows GEDCOM 7 extension rules

Documentation

Validation

All 16 YAML files pass validation:

  • 11 structure files
  • 4 enumeration files
  • 1 enumeration set file

Contact: Greg Lamberson [email protected]

Complete redesign to comply with GEDCOM 7 constraints, using a dual-pattern
approach:

1. Shadow Records Pattern - Evidence as independent records (_EVID) that can
   be referenced by multiple individuals through _EVREF
2. Event Container Pattern - Evidence directly tied to an individual through
   _EVEN_EVID

Key features:
- Eliminates polymorphic pointers by using separate _SUBJ_INDI, _SUBJ_FAM,
  and _SUBJ_SOUR structures
- Supports floating evidence for uncertain identities
- Preserves source information exactly as found
- Tracks confidence levels and analysis
- Documents research process per Genealogical Proof Standard

All structures validated against GEDCOM registry schema.

Specification: https://github.com/glamberson/gedcom-evidence
Contact: Greg Lamberson <[email protected]>

uri: https://github.com/glamberson/gedcom-evidence/enum-High

extension tags: [ _HIGH ]
Collaborator

Suggested change
extension tags: [ _HIGH ]
extension tags:
- _HIGH

Luther commented on my PR:

The format description's style guide says

Sequences and mappings should be in the block style unless they are empty, in which case a flow style should be used instead.

This same change would apply in many other places in this PR
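
For a change repeated across many files, a small script can do the rewrite. The following is only a sketch, assuming each flow sequence sits on one line and its items contain no commas or quoting; `flow_to_block` is a hypothetical helper, not part of the registry tools.

```python
import re

# Matches lines such as "extension tags: [ _HIGH ]" and captures indent, key, and body.
FLOW_SEQ_RE = re.compile(r"^(\s*)([^:#\n]+):\s*\[\s*(.*?)\s*\]\s*$")


def flow_to_block(line):
    """Return the line rewritten as block-style lines, or unchanged if it does not apply."""
    m = FLOW_SEQ_RE.match(line)
    if m is None:
        return [line]
    indent, key, body = m.groups()
    items = [item.strip() for item in body.split(",") if item.strip()]
    if not items:
        # The style guide keeps empty sequences in flow style.
        return [line]
    return [f"{indent}{key}:"] + [f"{indent}- {item}" for item in items]


converted = flow_to_block("extension tags: [ _HIGH ]")
```

Lines that are not single-line flow sequences, such as the `uri:` line, pass through untouched.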

Collaborator Author

Hi Dave,

I'm fixing these now. Sorry about the continuing trouble. However, it is extremely challenging to find the correct information to do this. The format description style guide link you provide, for example, is maddeningly difficult to find. I still don't know how to get to it, so I'm glad you provided the link. Going to gedcom.io and trying to find anything is frankly very difficult. Please consider this and make a clear set of instructions with well-organized documentation, especially on gedcom.io. Just look at the URL. The first sublevel is TERMS. What on earth is that, and where is the navigation to it on gedcom.io? I can't find it or decipher it.

Thanks!

Changed all flow-style arrays [ item ] to block style as required:
  extension tags:
    - _TAG

This addresses Dave Thaler's review comment on PR FamilySearch#186 citing the GEDCOM
format style guide requirement that 'Sequences and mappings should be in
the block style unless they are empty.'

Fixed 14 YAML files across enumeration and structure directories.
- Reset registry_tools/GEDCOM.io submodule to match upstream main branch
@glamberson
Collaborator Author

I reset this one also.

Collaborator

@dthaler dthaler left a comment

Currently the YAML files contain no used by label. That's fine if this is an unimplemented proposal. If, on the other hand, it is being implemented in something (e.g., gramps), then add a used by label when it actually appears in GEDCOM files used by that app.

specification:
- Confidence levels
- Set of confidence levels for evidence assessment

Collaborator

Recommend adding this to files in this PR:

documentation:
  - https://github.com/glamberson/gedcom-evidence

specification:
- Document Type
- |
  Specifies the type of document or record that contains the evidence.
Collaborator

What is the _DTYPE structure used for that can't be obtained from the SOUR structure in parallel to it? I couldn't find the answer at https://github.com/glamberson/gedcom-evidence but may have missed it.
