Come up with a way to define rarity/risk of a dataset disappearing #35

@diegoripley

Brainstorming; will run this by an LLM later:

  • sovereignty
  • events (ex. data center that hosts file is in conflict zone)
  • funding (ex. Pelias Canada disappeared due to lack of funding). If a provider does not disclose how much funding they have, that is a risk
  • file format used (ex. ECW, MrSID, or the Esri ArcInfo Interchange file format that StatCan used for older releases)
  • number of replicas
  • time since the last hash check of the file. This one is more for datasets not hosted in an object store, but even in an object store data can become corrupted if you don't set it up properly
  • number of live replicas. You don't necessarily need a 100% replica. For example, for the Quebec orthoimagery I'm about to bring in (27TB), we can prioritize newer data, data for more populated areas over rural areas, etc. You can get really creative here
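
A rough sketch of how the factors above could be folded into a single score, just to have something concrete to argue about. Everything here is an assumption: the `Dataset` fields, the weights, the set of formats treated as risky, the one-year staleness window, and the reading of "sovereignty" as "hosted outside the jurisdiction we care about". None of it is a decided design.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical weights (sum to 1.0); placeholders to be tuned.
WEIGHTS = {
    "sovereignty": 0.15,
    "events": 0.15,
    "funding": 0.20,
    "file_format": 0.15,
    "replicas": 0.20,
    "hash_check": 0.15,
}

# Assumed list of formats treated as risky; "e00" stands for
# the Esri ArcInfo Interchange file format.
RISKY_FORMATS = {"ecw", "mrsid", "e00"}

# Assumption: a hash check older than this counts as fully stale.
STALE_AFTER_DAYS = 365


@dataclass
class Dataset:
    name: str
    file_format: str
    live_replicas: int
    last_hash_check: Optional[date]  # None means never checked
    funding_disclosed: bool
    in_conflict_zone: bool
    foreign_hosted: bool


def hash_staleness(last_check: Optional[date], today: date) -> float:
    """0.0 if checked today, rising linearly to 1.0 at STALE_AFTER_DAYS."""
    if last_check is None:
        return 1.0
    age_days = max((today - last_check).days, 0)
    return min(age_days / STALE_AFTER_DAYS, 1.0)


def risk_score(ds: Dataset, today: date) -> float:
    """Return a score in [0, 1]; higher means more at risk of disappearing."""
    factors = {
        "sovereignty": 1.0 if ds.foreign_hosted else 0.0,
        "events": 1.0 if ds.in_conflict_zone else 0.0,
        "funding": 0.0 if ds.funding_disclosed else 1.0,
        "file_format": 1.0 if ds.file_format.lower() in RISKY_FORMATS else 0.0,
        # 0 replicas -> 1.0, 1 replica -> 0.5, 3 replicas -> 0.25
        "replicas": 1.0 / (1 + ds.live_replicas),
        "hash_check": hash_staleness(ds.last_hash_check, today),
    }
    return sum(WEIGHTS[key] * value for key, value in factors.items())
```

Partial replicas (newer data first, populated areas first) are not modelled here; one option would be to let `live_replicas` be fractional, weighted by how much of the dataset each replica covers.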
