@caufieldjh
Collaborator

This pull request makes several updates to taxonomic hierarchy mappings, product documentation, and resource metadata. The main focus is on expanding the list of NCBI Taxon IDs in the taxon_mapping.yaml file, streamlining documentation for the 1000 Genomes resource, and updating warning messages for various products.

Taxonomy mapping and hierarchy updates:

  • Expanded the taxon_hierarchy in registry/taxon_mapping.yaml by adding numerous new NCBI Taxon IDs, including entries for various species and clades, and updated their hierarchical relationships.
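
For reviewers who want to sanity-check hierarchy additions like these, here is a minimal sketch of walking a child-to-parent mapping. The layout is an assumption: the actual schema of taxon_hierarchy in registry/taxon_mapping.yaml is not shown in this PR, and the `ancestors` helper and the three sample entries are hypothetical illustrations, not content from the file.

```python
# Assumed child -> parent layout; the real taxon_mapping.yaml schema may differ.
# The entries below are illustrative samples, not copied from the registry.
taxon_hierarchy = {
    "NCBITaxon:9606": "NCBITaxon:9605",    # Homo sapiens -> Homo
    "NCBITaxon:9605": "NCBITaxon:207598",  # Homo -> Homininae
    "NCBITaxon:207598": "NCBITaxon:9604",  # Homininae -> Hominidae
}


def ancestors(taxon_id, hierarchy):
    """Return the ancestors of taxon_id, nearest first.

    Stops at the first ID with no recorded parent, and also stops if a
    cycle is detected, so a malformed mapping cannot loop forever.
    """
    chain = []
    seen = {taxon_id}
    current = hierarchy.get(taxon_id)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = hierarchy.get(current)
    return chain


print(ancestors("NCBITaxon:9606", taxon_hierarchy))
```

A check of this kind could flag newly added IDs whose parent chain is empty or loops back on itself, though whether that matters depends on how the registry actually consumes the mapping.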

1000 Genomes resource documentation improvements:

  • Removed the separate Aspera documentation file (1000genomes.aspera-docs.md) and merged Aspera instructions into the Globus documentation, updating the description and product URL accordingly.
  • Cleaned up the 1000 Genomes product list by removing the DDBJ mirror entry and outdated warning messages.

Resource metadata and warning updates:

  • Updated warning messages in several product metadata files to reflect the most recent check dates (from 2025-12-13 to 2025-12-15) for aeo, afpo, and alliance resources.
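
The date change above amounts to a targeted string substitution. A rough sketch follows, assuming the warning is free text containing an ISO date; the `bump_check_date` helper and the sample warning wording are hypothetical, since the actual message format is not shown here.

```python
import re


def bump_check_date(text, old_date, new_date):
    """Replace whole occurrences of old_date with new_date.

    Returns (new_text, number_of_replacements). The digit lookarounds keep
    the pattern from matching inside a longer run of digits.
    """
    pattern = re.compile(r"(?<!\d)" + re.escape(old_date) + r"(?!\d)")
    return pattern.subn(new_date, text)


# Hypothetical warning text; the real wording in the metadata files may differ.
warning = "Resource could not be reached when last checked on 2025-12-13."
updated, count = bump_check_date(warning, "2025-12-13", "2025-12-15")
print(updated, count)
```

Returning the replacement count makes it easy to notice a file where no date was found, which would suggest the warning format differs from what this sketch assumes.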

Data source classification correction:

  • Corrected the complexportal entry in reports/missing_infores_ids.tsv from "Aggregator" to "DataSource".
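
A correction like this can be scripted rather than hand-edited. The sketch below assumes the TSV has a header row; the column names `id` and `category` and the `reclassify` helper are hypothetical, as the actual layout of reports/missing_infores_ids.tsv is not shown in this PR.

```python
import csv
import io


def reclassify(tsv_text, row_id, new_category, id_col="id", category_col="category"):
    """Return tsv_text with the category of the matching row replaced.

    Column names are assumptions; pass the real header names if they differ.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = list(reader)
    for row in rows:
        if row[id_col] == row_id:
            row[category_col] = new_category
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=reader.fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Illustrative two-row input, not the contents of the real report.
sample = "id\tcategory\ncomplexportal\tAggregator\nother\tAggregator\n"
print(reclassify(sample, "complexportal", "DataSource"))
```

Matching on the ID column, rather than replacing the string "Aggregator" everywhere, leaves other rows that are legitimately classified as aggregators untouched.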

@caufieldjh caufieldjh marked this pull request as ready for review December 15, 2025 18:24
@caufieldjh caufieldjh merged commit 9169954 into main Dec 15, 2025
4 checks passed
@caufieldjh caufieldjh deleted the data_cleanup_15122025 branch December 15, 2025 18:25