Commit 859dc76

authored
ops
1 parent e088c9b commit 859dc76

1 file changed

+17
-2
lines changed

README.md

+17-2
@@ -10,8 +10,12 @@ All relevant feature at OSM can be tagged with [`key:wikidata`](https://wiki.ope
 When the Wikidata entity that an OSM feature points to (through its Wikidata ID) also holds a pointer back to OSM, the "semantic bridge" has been built, and there is complete [authority control with reciprocal use](https://www.wikidata.org/wiki/Q24075706). The [`lookup.csv` table](data/lookup.csv) lists the OSM features that offer this reciprocity.
 
 As of July 2018 there are:
+
+* [~1,123,500 OSM features](https://taginfo.openstreetmap.org/search?q=wikidata#keys) with a `wikidata` key.
+
 * [**~63000** Wikidata entities](https://query.wikidata.org/#SELECT%20%28COUNT%28DISTINCT%20%3Fitem%29%20AS%20%3Fcount%29%20WHERE%20%7B%3Fitem%20wdt%3AP402%20%5B%5D.%7D%0A) with the [OSM relation ID (`P402`)](http://wikidata.org/entity/P402) property pointing to OSM.
-* 10 checked items in the lookup table, ensuring that each OSM feature was really tagged with a reciprocal Wikidata identification.
+
+* ~1900 checked items in the lookup table, ensuring that each OSM feature was really tagged with a reciprocal Wikidata identification.
 
 ## The lookup as certification
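
The Wikidata count in the hunk above links to the Wikidata Query Service with a SPARQL query percent-encoded in the URL fragment. As a minimal sketch (standard library only, no network request; the function name `decode_query` is an illustrative choice, not part of the repository), the query can be decoded like this:

```python
from urllib.parse import unquote

# URL fragment copied from the Wikidata Query Service link in the README diff.
ENCODED_QUERY = (
    "SELECT%20%28COUNT%28DISTINCT%20%3Fitem%29%20AS%20%3Fcount%29%20"
    "WHERE%20%7B%3Fitem%20wdt%3AP402%20%5B%5D.%7D%0A"
)


def decode_query(fragment: str) -> str:
    """Percent-decode the SPARQL query and drop surrounding whitespace."""
    return unquote(fragment).strip()


if __name__ == "__main__":
    # Prints the SPARQL text that counts distinct items having a P402 value.
    print(decode_query(ENCODED_QUERY))
```

The decoded query counts the distinct items that carry any value for `P402`, which is where the "~63000" figure comes from.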

@@ -28,10 +32,21 @@ wdId|osm_type|osm_id|isReciprocal|check_date
 * `wdId`: the Wikidata ID; it can be resolved at `http://wikidata.org/entity/{wdId}`.
 * `osm_type`: the OSM datatype used to represent the feature. `R`=Relation (polygon), `W`=Way (line), `N`=Node (point).
 * `osm_id`: the ID assigned to the OSM feature as of `check_date`.
+
+The lookup does not need all of these fields but, as in the illustration above, we add:
 * `isReciprocal`: a flag saying whether the Wikidata and OSM indications are reciprocal (`y` or `n`).
 * `check_date`: an [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date, when the last checking procedure was performed.
 
-The lookup is generated by software, see [`/src`](src).
+The lookup and [its error-log CSV](data/lookup_errors_WIKIDATA.csv) (`lookup_errors_WIKIDATA`) are generated by software; see [`/src`](src).
+
+## Dump as source for comparisons
+
+There are two big [*dump* files](https://en.wikipedia.org/wiki/Database_dump) in the [`data/dump`](data/dump) folder:
+
+* [osm_relation.csv](data/dump/osm_relation.csv) with pairs of *osm_relationId-wdId* fields;
+* [osm_way.csv](data/dump/osm_way.csv) with pairs of *osm_wayId-wdId* fields;
+
+As noted in ["Preparing OSM dumps"](src/README.md#preparing-osm-dumps), the extraction can be expressed as an Overpass query to generate samples, but not to do the real task, because the data is really big. The data can be split by country, which would also suit specialized curators better, but even after splitting we need Osmium tools to generate the dump files. So the v0.1 check uses the online tools, which is a lazy solution, and the project currently produces only samples.
 
 ------
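
To make the field list in the hunk above concrete, here is a minimal sketch of reading lookup rows in Python. It assumes a comma-delimited file whose header matches the column names shown (the real delimiter of `data/lookup.csv` is not visible in this diff), and the sample row is hypothetical. The `https://www.openstreetmap.org/{type}/{id}` URL pattern is the public OSM website one, not something defined by this repository.

```python
import csv
import io
from datetime import date

# Map the one-letter osm_type codes from the README to OSM element names.
OSM_TYPES = {"R": "relation", "W": "way", "N": "node"}

# Hypothetical sample; the values are placeholders, not real lookup data.
SAMPLE = "wdId,osm_type,osm_id,isReciprocal,check_date\nQ1,R,100,y,2018-07-01\n"


def parse_lookup(text, delimiter=","):
    """Parse lookup rows into dicts with resolved URLs and typed fields."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        element = OSM_TYPES[rec["osm_type"]]
        rows.append({
            "wikidata_url": f"http://wikidata.org/entity/{rec['wdId']}",
            "osm_url": f"https://www.openstreetmap.org/{element}/{rec['osm_id']}",
            "is_reciprocal": rec["isReciprocal"] == "y",
            "check_date": date.fromisoformat(rec["check_date"]),
        })
    return rows


if __name__ == "__main__":
    for row in parse_lookup(SAMPLE):
        print(row)
```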

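The comparison that the dump files are meant to support can be sketched as a set lookup. This is only an illustration under stated assumptions: the OSM side is a list of `(osm_relation_id, wdId)` pairs as in `osm_relation.csv`, the Wikidata side is a list of `(wdId, osm_relation_id)` pairs taken from `P402`, and the function name and sample IDs are hypothetical. The project's actual procedure lives in `/src` and may differ.

```python
def split_by_reciprocity(osm_pairs, wikidata_pairs):
    """Split OSM-side pairs into reciprocal and one-way links.

    osm_pairs: iterable of (osm_relation_id, wd_id) read from the OSM dump.
    wikidata_pairs: iterable of (wd_id, osm_relation_id) read from P402.
    """
    # Re-order the Wikidata pairs so both sides use (osm_relation_id, wd_id).
    back_links = {(osm_id, wd_id) for wd_id, osm_id in wikidata_pairs}
    reciprocal, one_way = [], []
    for pair in osm_pairs:
        (reciprocal if pair in back_links else one_way).append(pair)
    return reciprocal, one_way


if __name__ == "__main__":
    # Placeholder IDs for illustration only.
    osm_side = [(100, "Q1"), (200, "Q2")]
    wikidata_side = [("Q1", 100), ("Q2", 999)]
    print(split_by_reciprocity(osm_side, wikidata_side))
```

Because `P402` holds relation IDs only, a check like this applies to the relation dump; ways would need a different Wikidata-side source.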