Description
While investigating a bug in a separate plugin, I found some oddities in the default packed data files provided by moment-timezone.
Some zones in moment-timezone are listed as links in the IANA source files, and vice versa. This is due to the way the compilation process creates the packed files.
- The IANA source files are downloaded and compiled using zic into binary TZif files, one per zone or link. At this point, the information about which identifiers are Zones and which are Links is lost.
- The compiled binary files are exported to text using zdump, then compiled into a raw JSON file in the data/unpacked directory.
- The unpacked data is compressed into the packed file, combining zones with identical data into a single source zone and multiple links. Because there's no information brought over from the IANA source files, the choice of which zone is the source and which are links can differ from IANA.
For most use cases, this difference doesn't really matter, because moment-timezone transparently handles the links just the same as source zones. Where it gets odd is with the relatively new countries data. Now there are some countries that list some links as their primary zones, while others just point straight at the source zone.
I ran a quick script to identify these outliers:
// Countries containing links as primary zones:
[
  { name: 'MM', zones: [ 'Asia/Yangon' ] },
  { name: 'SG', zones: [ 'Asia/Singapore' ] },
  { name: 'TV', zones: [ 'Pacific/Funafuti' ] },
  { name: 'UM', zones: [ 'Pacific/Wake' ] },
  { name: 'US', zones: [ 'America/Indiana/Indianapolis' ] },
  { name: 'WF', zones: [ 'Pacific/Wallis' ] }
]
Edit: There are actually more than that listed in latest.json vs 2019c.json; see #836.
For reference, this is the script...
const tzdata = require('./data/packed/2019c.json');

// Each packed country entry has the form "CODE|Zone/One Zone/Two ...".
const countries = tzdata.countries.map(country => {
  const [name, zonesStr] = country.split('|');
  const zones = zonesStr.split(' ');
  return { name, zones };
});

// Each packed link entry has the form "target|name".
const linkMap = new Map();
tzdata.links.forEach(link => {
  const [target, name] = link.split('|');
  linkMap.set(name, target);
});

// Keep only the countries that list at least one link among their zones.
const withLinks = countries
  .map(country => ({
    name: country.name,
    zones: country.zones.filter(zone => linkMap.has(zone))
  }))
  .filter(country => country.zones.length);

console.log('// Countries containing links as primary zones:');
console.log(withLinks);
Investigating further, I ran a script to identify all the links and zones in moment-timezone data that differ from the IANA source files:
The script I ran...
const tzdata = require('./data/packed/2019c.json');
const fs = require('fs');

const files = 'africa antarctica asia australasia etcetera europe northamerica southamerica pacificnew backward'.split(' ');
const sourceLinkZoneLine = /^(Link|Zone)\s/;
const ianaLinkMap = new Map();
const ianaZoneSet = new Set();

// Collect every Zone and Link definition from the downloaded IANA source files.
files.forEach(file => {
  const contents = fs.readFileSync(`./temp/download/2019c/${file}`, 'utf-8');
  contents
    .split('\n')
    .filter(line => sourceLinkZoneLine.test(line))
    .forEach(line => {
      if (line.startsWith('Link')) {
        // Link lines: "Link <source> <name>"
        const [, source, name] = line.split(/\s+/);
        ianaLinkMap.set(name, source);
      } else {
        // Zone lines: "Zone <name> ..."
        const [, name] = line.split(/\s+/);
        ianaZoneSet.add(name);
      }
    });
});

const ianaLinksAsMomentZones = [];
const ianaZonesAsMomentLinks = [];
const wrongLinkTargets = [];
const zoneSet = new Set();

// Packed zones that IANA defines as links.
tzdata.zones.forEach(packedZone => {
  const [name] = packedZone.split('|');
  zoneSet.add(name);
  if (ianaLinkMap.has(name)) {
    ianaLinksAsMomentZones.push(name);
  }
});

// Packed links that IANA defines as zones, or that point at a different target.
tzdata.links.forEach(packedLink => {
  const [target, name] = packedLink.split('|');
  if (!ianaLinkMap.has(name) || ianaZoneSet.has(name)) {
    ianaZonesAsMomentLinks.push(name);
  } else if (ianaLinkMap.get(name) !== target) {
    wrongLinkTargets.push({
      linkName: name,
      momentTarget: target,
      ianaTarget: ianaLinkMap.get(name),
    });
  }
});

console.log({
  ianaLinksAsMomentZones,
  ianaZonesAsMomentLinks,
  wrongLinkTargets,
});
The results...
{
  ianaLinksAsMomentZones: [ 'America/Fort_Wayne', 'Asia/Rangoon', 'Etc/GMT-0' ],
  ianaZonesAsMomentLinks: [
    'America/Indiana/Indianapolis',
    'Asia/Singapore',
    'Asia/Yangon',
    'Etc/GMT+2',
    'Etc/GMT',
    'Etc/GMT-7',
    'Etc/GMT-9',
    'Etc/GMT-10',
    'Etc/GMT-12',
    'Pacific/Funafuti',
    'Pacific/Wake',
    'Pacific/Wallis'
  ],
  wrongLinkTargets: [
    {
      linkName: 'America/Indianapolis',
      momentTarget: 'America/Fort_Wayne',
      ianaTarget: 'America/Indiana/Indianapolis'
    },
    {
      linkName: 'US/East-Indiana',
      momentTarget: 'America/Fort_Wayne',
      ianaTarget: 'America/Indiana/Indianapolis'
    },
    {
      linkName: 'Singapore',
      momentTarget: 'Asia/Kuala_Lumpur',
      ianaTarget: 'Asia/Singapore'
    },
    {
      linkName: 'Etc/GMT+0',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    },
    {
      linkName: 'Etc/GMT0',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    },
    {
      linkName: 'Etc/Greenwich',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    },
    {
      linkName: 'GMT',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    },
    {
      linkName: 'GMT+0',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    },
    {
      linkName: 'GMT-0',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    },
    {
      linkName: 'GMT0',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    },
    {
      linkName: 'Greenwich',
      momentTarget: 'Etc/GMT-0',
      ianaTarget: 'Etc/GMT'
    }
  ]
}
It looks like this happens because the compression of multiple zones into links within filterLinkPack works on a first-in basis. The first zone processed in a list of identical zones is made the source, with the rest being links to it. The zdump data files are processed in alphabetical order, which explains why Asia/Rangoon is a link to Asia/Yangon in the IANA files, but it's the other way around in moment-timezone (Asia/Rangoon gets processed first in alphabetical order).
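To make the first-in behaviour concrete, here is a minimal sketch of that kind of grouping. It is not the actual filterLinkPack code: the groupFirstIn function and the signature field (a made-up string standing in for a zone's packed data) are assumptions for illustration only.

```javascript
// Hypothetical sketch of first-in grouping; NOT moment-timezone's implementation.
// `signature` is an invented stand-in for a zone's packed data.
function groupFirstIn(zones) {
  const leaders = new Map(); // signature -> name of the first zone seen with it
  const sources = [];
  const links = [];
  for (const { name, signature } of zones) {
    if (leaders.has(signature)) {
      // Identical data to an earlier zone: emit a link in "target|name" form.
      links.push(`${leaders.get(signature)}|${name}`);
    } else {
      // First zone seen with this data becomes the source.
      leaders.set(signature, name);
      sources.push(name);
    }
  }
  return { sources, links };
}

// Made-up signatures; only their equality matters here.
const sampleZones = [
  { name: 'Asia/Yangon', signature: 'mm-rules' },
  { name: 'Asia/Rangoon', signature: 'mm-rules' },
  { name: 'Asia/Tokyo', signature: 'jp-rules' },
].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));

const grouped = groupFirstIn(sampleZones);
console.log(grouped);
```

Because the input is sorted alphabetically before grouping, Asia/Rangoon is seen first and becomes the source, and Asia/Yangon ends up as a link to it, regardless of which one IANA declares as the Zone.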
The tests I ran were on moment-timezone 0.5.28 and IANA data files for 2019c, but as far as I can tell this dates back to the first set of data files in moment-timezone (2014a). I see the group-leaders.json file was added to help keep links consistent, but I think it just encoded the link directions from the already-incorrect data. (See America/Fort_Wayne, for example.)
Realistically this is a fairly minor problem due to the transparent handling of links. But sometimes the data files are processed by other scripts to be passed in to filterLinkPack for custom builds, and I think having consistency with the source data is important.
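One possible way to get that consistency, sketched under assumptions, is to let the IANA Zone definitions decide which name in a group of identical zones becomes the source. The pickLeader helper and its arguments below are hypothetical and not part of moment-timezone; the set of IANA zones is assumed to be collected from the source files as in the script above.

```javascript
// Hypothetical helper: choose the source zone for a group of names that share
// identical data, preferring a name that IANA defines as a Zone.
function pickLeader(groupNames, ianaZones) {
  const sorted = [...groupNames].sort();
  // Fall back to the alphabetically first name if no IANA Zone is in the group.
  return sorted.find(name => ianaZones.has(name)) || sorted[0];
}

// Assumed input: a Set of names taken from "Zone" lines in the IANA sources.
const knownIanaZones = new Set(['America/Indiana/Indianapolis', 'Asia/Yangon']);

const indianaLeader = pickLeader(
  [
    'America/Fort_Wayne',
    'America/Indiana/Indianapolis',
    'America/Indianapolis',
    'US/East-Indiana',
  ],
  knownIanaZones
);
console.log(indianaLeader);
```

This sketch does not handle groups that contain more than one IANA Zone (it simply takes the alphabetically first one), so any real change would still need a decision about whether such zones should be merged at all.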
Edit September 2022: After some changes in recent IANA releases, there are now countries defined in the IANA sources with a primary zone that's a link to a different zone. The assumption of at least one Zone definition per country no longer holds true, so I don't think having links in moment-timezone's country data is a problem any more. The mismatch between Zone and Link status still exists, though.