Duplicate detections in 2010_phd_reubens_sync #78

@peterdesmet

2010_phd_reubens_sync contains 77,426 detections with a duplicate pk (38,713 to be removed). This seems to be caused by 2 tags:

| tag_id | tag_fk | animal_id | number of rows |
| --- | --- | --- | --- |
| A69-1303-65302 | 67 | 733 | 112 (+ 83,278 non-duplicates) |
| A69-1303-65302 | 85 | 747 | 112 (only records) |
| A69-1303-65303 | 68 | 2159 | 38,601 |
| A69-1303-65303 | 68 | 734 | 38,601 |
  • A69-1303-65302 is listed 3 times in tags: 96 (AVAILABLE), 67 (ENDED) and 85 (AVAILABLE). I don't think a tag should be listed as AVAILABLE twice, but that does not seem to be causing the issue, because A69-1303-65301 is also listed twice as AVAILABLE. The issue seems to be with 85: all duplicates are coming from this tag.
  • Animals 734 and 2159 are complete duplicates. I would suggest removing 2159.
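
A breakdown like the one above could be reproduced with a query along these lines. This is only a sketch: it assumes a table named `detections` with columns `pk`, `tag_fk` and `animal_id` (the table name and layout are my assumption, not the actual schema), it uses an in-memory SQLite database rather than the real one, and the inserted rows are made-up toy data.

```python
import sqlite3


def duplicate_pk_breakdown(conn):
    """Count rows whose pk occurs more than once, per (tag_fk, animal_id)."""
    query = """
        SELECT tag_fk, animal_id, COUNT(*) AS n_rows
        FROM detections
        WHERE pk IN (
            SELECT pk FROM detections GROUP BY pk HAVING COUNT(*) > 1
        )
        GROUP BY tag_fk, animal_id
        ORDER BY tag_fk, animal_id
    """
    return conn.execute(query).fetchall()


# Toy data only: hypothetical table layout and invented pk values.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detections (pk TEXT, tag_fk INTEGER, animal_id INTEGER)"
)
rows = [
    ("d1", 67, 733), ("d1", 85, 747),   # same pk under two tag records
    ("d2", 68, 734), ("d2", 68, 2159),  # same pk under two animals
    ("d3", 67, 733),                    # not duplicated
]
conn.executemany("INSERT INTO detections VALUES (?, ?, ?)", rows)

breakdown = duplicate_pk_breakdown(conn)
print(breakdown)
```

Each returned tuple is (tag_fk, animal_id, number of rows sharing a pk with another row), so a tag or animal that only appears in duplicated rows stands out immediately.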
