README.md
+17-2
@@ -10,8 +10,12 @@ All relevant feature at OSM can be tagged with [`key:wikidata`](https://wiki.ope
10
10
When the Wikidata item that an OSM feature points to (its Wikidata ID) also holds a pointer back to OSM, the "semantic bridge" has been built (!), giving complete [authority control with reciprocal use](https://www.wikidata.org/wiki/Q24075706). The [`lookup.csv` table](data/lookup.csv) lists the OSM features that offer this reciprocity.
11
11
12
12
As of July 2018 there are:
13
+
14
+
* [~1,123,500 OSM features](https://taginfo.openstreetmap.org/search?q=wikidata#keys) with a `wikidata` key.
15
+
13
16
* [**~63,000** Wikidata entities](https://query.wikidata.org/#SELECT%20%28COUNT%28DISTINCT%20%3Fitem%29%20AS%20%3Fcount%29%20WHERE%20%7B%3Fitem%20wdt%3AP402%20%5B%5D.%7D%0A) with the [OSM relation ID (`P402`)](http://wikidata.org/entity/P402) property pointing to OSM.
14
-
* 10 checked items at the lookup table, ensuring that each OSM feature was really tagged with a reciprocal Wikidata identification.
17
+
18
+
* ~1900 checked items in the lookup table, ensuring that each OSM feature is really tagged with a reciprocal Wikidata identification.
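
The link behind the Wikidata count in the list above is a URL-encoded SPARQL query. As a minimal sketch (the helper name `query_service_link` is hypothetical and not part of this repository), the same kind of link can be rebuilt in Python from the decoded query text:

```python
from urllib.parse import quote

# SPARQL text decoded from the count link in the list above:
# it counts distinct items that have any value for P402 (OSM relation ID).
COUNT_P402_QUERY = (
    "SELECT (COUNT(DISTINCT ?item) AS ?count) "
    "WHERE {?item wdt:P402 [].}"
)


def query_service_link(sparql: str) -> str:
    """Return a query.wikidata.org link that opens `sparql` in the web editor."""
    # safe="" percent-encodes every reserved character, including "/" and "?".
    return "https://query.wikidata.org/#" + quote(sparql, safe="")


link = query_service_link(COUNT_P402_QUERY)
```

The link only opens the query in the web editor; the count itself changes over time, so the figures above are a July 2018 snapshot.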
* `wdId`: the Wikidata ID, which can be resolved at `http://wikidata.org/entity/{wdId}`.
29
33
* `osm_type`: the OSM datatype used to represent the feature. `R`=Relation (polygon), `W`=Way (line), `N`=Node (point).
30
34
* `osm_id`: the ID assigned to the OSM feature as of the `check_date`.
35
+
36
+
The lookup does not need all these fields, but, as in the illustration above, we add:
31
37
* `isReciprocal`: a flag saying whether the Wikidata and OSM indications are reciprocal (`y` or `n`).
32
38
* `check_date`: an [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date, when the last checking procedure was performed.
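
To make the field descriptions concrete, here is a minimal Python sketch that reads rows shaped like the fields listed above. The sample rows are invented, and the header names, column order, and the `https://www.openstreetmap.org/{type}/{id}` URL pattern are assumptions; the real `data/lookup.csv` may differ.

```python
import csv
import io

# Invented sample rows; the real lookup.csv may use other headers or column order.
SAMPLE = """wdId,osm_type,osm_id,isReciprocal,check_date
Q1,R,101,y,2018-07-01
Q2,W,202,n,2018-07-01
"""

# Expansion of the one-letter osm_type codes described above.
OSM_TYPES = {"R": "relation", "W": "way", "N": "node"}


def describe(row):
    """Turn one lookup row into resolvable links plus a boolean flag."""
    osm_kind = OSM_TYPES[row["osm_type"]]
    return {
        "wikidata_url": f"http://wikidata.org/entity/{row['wdId']}",
        # Assumed URL pattern for browsing an OSM element by type and ID.
        "osm_url": f"https://www.openstreetmap.org/{osm_kind}/{row['osm_id']}",
        "reciprocal": row["isReciprocal"] == "y",
        "check_date": row["check_date"],
    }


rows = [describe(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
```

Keeping only the rows where `reciprocal` is true would give the features that already offer the reciprocity described earlier.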
33
39
34
-
The lookup is generated by software, see [`/src`](src).
40
+
The lookup and [its error-log CSV](data/lookup_errors_WIKIDATA.csv) (`lookup_errors_WIKIDATA`) are generated by software; see [`/src`](src).
41
+
42
+
## Dump as source for comparisons
43
+
44
+
There are two big [*dump* files](https://en.wikipedia.org/wiki/Database_dump) in the [`data/dump`](data/dump) folder:
45
+
46
+
* [osm_relation.csv](data/dump/osm_relation.csv) with pairs of *osm_relationId-wdId* fields;
47
+
* [osm_way.csv](data/dump/osm_way.csv) with pairs of *osm_relationId-wdId* fields.
48
+
49
+
As noted in ["Preparing OSM dumps"](src/README.md#preparing-osm-dumps), the extraction can be expressed as an Overpass query and used to generate samples, but not to do the real task, because the data is really big. The work can be split by country, which would also suit specialized curators. But even when splitting, Osmium tools are needed to generate the dump files. So the v0.1 checking uses the online tools, which is a lazy solution, and the project produces only samples.
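
Once both sides are available as ID pairs, the comparison itself is a set operation. The following Python sketch shows the idea under stated assumptions: the function name `split_by_reciprocity` and the sample pairs are hypothetical, and the actual code in `/src` may work differently.

```python
def split_by_reciprocity(osm_pairs, wikidata_pairs):
    """Compare (osm_relation_id, wdId) pairs seen on each side.

    osm_pairs: pairs read from an OSM dump (feature tagged with `wikidata`).
    wikidata_pairs: pairs read from Wikidata (item carrying a P402 value).
    """
    osm_side = set(osm_pairs)
    wd_side = set(wikidata_pairs)
    return {
        "reciprocal": osm_side & wd_side,     # both sides point at each other
        "osm_only": osm_side - wd_side,       # OSM points to Wikidata, no pointer back
        "wikidata_only": wd_side - osm_side,  # Wikidata points to OSM, no tag back
    }


# Invented sample pairs, for illustration only.
osm_sample = [(101, "Q1"), (102, "Q2"), (103, "Q3")]
wd_sample = [(101, "Q1"), (103, "Q9"), (104, "Q4")]
result = split_by_reciprocity(osm_sample, wd_sample)
```

In this sample only `(101, "Q1")` is reciprocal; relation 103 appears on both sides but with different Wikidata IDs, so it lands in both one-sided sets and would be a candidate for the error log.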