When data is replaced, force a version number change if the file checksums changed #861

@mo-garethsjones

Is your feature request related to a problem? Please describe

Multiple results are returned with the same version number. Inspecting the file metadata and checksums (and the files themselves after downloading) shows that files have been replaced with different data.
e.g., via https://esgf-node.llnl.gov/
GISS-E2-1-G, ssp245, tas, mon, r1i1p1f2: 4 results are presented, all with the same version number 20200115. Three of the results comprise two files each; one has 10 files. Timestamps and checksums differ between the results.
There is no way to know which is the "correct" data to download without digging into the metadata, and even then, can it be trusted?
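
The conflict described above could be detected mechanically. The following is only a rough sketch: the record layout (`dataset`, `version`, `checksums`) and the function name `find_version_conflicts` are hypothetical and are not the ESGF or Metagrid API, and the checksum strings are made-up placeholders rather than values from the real GISS-E2-1-G results.

```python
from collections import defaultdict


def find_version_conflicts(records):
    """Return {(dataset, version): n} for every dataset version that
    appears with n > 1 distinct sets of file checksums."""
    fingerprints = defaultdict(set)
    for rec in records:
        key = (rec["dataset"], rec["version"])
        # A frozenset of checksums identifies the file set regardless of order.
        fingerprints[key].add(frozenset(rec["checksums"]))
    return {key: len(fps) for key, fps in fingerprints.items() if len(fps) > 1}


# Hypothetical search results; all values are placeholders.
example_results = [
    {"dataset": "example.tas.mon", "version": "20200115", "checksums": ["aaa", "bbb"]},
    {"dataset": "example.tas.mon", "version": "20200115", "checksums": ["ccc", "ddd"]},
    # Same files as the first record, listed in a different order: not a conflict.
    {"dataset": "example.tas.mon", "version": "20200115", "checksums": ["bbb", "aaa"]},
    {"dataset": "example.pr.mon", "version": "20200115", "checksums": ["eee"]},
]

conflicts = find_version_conflicts(example_results)
for (dataset, version), count in conflicts.items():
    print(f"{dataset} v{version}: {count} different file sets share this version")
```

A check along these lines, run at publication time or across search results, would flag a dataset whose files changed while its version number stayed the same.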

Describe the solution you'd like

Prevent institutions from replacing data where the number of files or the checksums have changed without also changing the version number.

Describe alternatives you've considered

Metagrid could also enable searching by timestamp, but I don't know how the timestamp is defined. Is it node dependent?

Additional context

To demonstrate the importance of distinguishing between versions: for the above example, the global annual mean TAS is quite different between the results. The graph below shows the global annual mean TAS for the results with different timestamps. I was unable to find any reference to this in the ESGF errata.
[Image: global annual mean TAS for the results with different timestamps]
