ADM1 and ADM2 shape names contain ? (UTF-8 encoding issue?) #22

@2238154

Hello,

I have accessed the SSCGS data across all admin levels for all countries and noticed that 126 ADM1 regions contain one or more ? characters in their names (in some cases the whole name consists of question marks).

These are specifically found in countries: AZE, CZE, DZA, HRV, HUN, JOR, MLT, MNE, POL, RUS, SVK, TKM, TUR and YEM.

The same issue also exists in 3551 ADM2 regions of countries: AZE, BIH, CZE, DZA, FIN, GNB, HRV, HUN, HTI, IRL, JOR, JPN, KAZ, LTU, MAR, MLT, MNE, NOR, POL, RUS, SOM, SVK, TUR and YEM.
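For anyone wanting to reproduce these counts, here is a minimal sketch of the kind of check involved, assuming the shape names have been exported as (country code, name) pairs. The function name `count_damaged_names` and the sample rows are made up for illustration and are not taken from the real data.

```python
from collections import Counter


def count_damaged_names(records):
    """Count, per country code, the names that contain at least one '?'."""
    counts = Counter()
    for country, name in records:
        if "?" in name:
            counts[country] += 1
    return dict(counts)


# Made-up sample rows (ISO3 code, shape name); not real geoBoundaries records.
sample = [
    ("CZE", "Praha"),
    ("CZE", "Jiho?esk? kraj"),
    ("POL", "??dzkie"),
    ("MLT", "????"),
    ("FRA", "Bretagne"),
]

damaged = count_damaged_names(sample)
print(damaged)
```

A name that legitimately contains a question mark would also be counted, so the result is an upper bound on the damaged names.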

I checked whether this was also the case in the HPSCU, HPSCGS and SSCU data, and sadly the same issue exists, although I have not checked whether exactly the same regions have their names modified.

I haven't been able to fix this by enabling UTF-8 encoding, and the issue is not present in the files that can be downloaded directly from the geoboundaries website. I considered using those files to replace the modified names with the correct ones, but because the shapeID differs between the website files and the rgeoboundary files, this is not possible.
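One possible explanation for why changing the encoding on my side has no effect, sketched under the assumption (a guess, not confirmed for these files) that the names were converted to a character set that cannot represent them somewhere upstream. In that case each unsupported character is replaced by a literal ?, and the original character is lost before the file reaches the reader. The helper name `to_legacy_bytes_and_back` is hypothetical.

```python
def to_legacy_bytes_and_back(name, charset="ascii"):
    """Simulate a lossy round trip through a charset that lacks some characters.

    Characters the charset cannot represent become a literal '?' byte.
    """
    encoded = name.encode(charset, errors="replace")
    return encoded.decode(charset)


original = "Łódzkie"
damaged = to_legacy_bytes_and_back(original)
print(damaged)

# Reading the damaged bytes as UTF-8 afterwards changes nothing:
# the '?' is an ordinary ASCII character by this point.
reread = damaged.encode("ascii").decode("utf-8")
print(reread == damaged)
```

If this is what happened, the fix would have to be applied where the files are produced rather than where they are read.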

For my purposes the rgeoboundary files are excellent because, unlike the website files, they tell me the parent ADM1 and ADM0 region for each ADM2 region.

If this issue could be looked into and corrected I would be very grateful.
