Commit a2b84a0

[docs][ci skip] Adding parser label definitions to the README
1 parent 2b3a6f6 commit a2b84a0

1 file changed: +14 -0 lines changed

README.md

@@ -148,6 +148,20 @@ int main(int argc, char **argv) {
}
```

Parser labels
-------------

The address parser can use any string labels that are defined in the training data, but these are the default labels, based on the fields defined in [OpenCage's address-formatting library](https://github.com/OpenCageData/address-formatting):

- **house**: venue names, e.g. "Brooklyn Academy of Music", and building names, e.g. "Empire State Building"
- **house_number**: usually refers to the external (street-facing) building number. In some countries this may be a compound, hyphenated number that also includes an apartment number or a block number (as in Japan), but libpostal will just call it the house_number for simplicity.
- **road**: street name(s)
- **suburb**: usually an unofficial neighborhood name like "Harlem", "South Bronx", or "Crown Heights"
- **city_district**: usually boroughs or districts within a city that serve some official purpose, e.g. "Brooklyn", "Hackney", or "Bratislava IV"
- **city**: any human settlement, including cities, towns, villages, hamlets, localities, etc.
- **state_district**: usually a second-level administrative division or county.
- **state**: a first-level administrative division. Scotland, Northern Ireland, Wales, and England in the UK are mapped to "state" as well (the convention used in OSM, GeoPlanet, etc.)
- **country**: sovereign nations and their dependent territories, anything with an [ISO-3166 code](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2).
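
As a rough illustration of how a caller might work with these labels, the sketch below keeps the default label names in a table and looks up a component by label. It assumes the parser hands back parallel arrays of component strings and label strings. `DEFAULT_LABELS`, `is_default_label`, and `find_component` are hypothetical helpers written for this example; they are not part of libpostal's API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Default parser labels, copied from the list above. Custom training
   data may define additional labels that are not in this table. */
static const char *const DEFAULT_LABELS[] = {
    "house", "house_number", "road", "suburb", "city_district",
    "city", "state_district", "state", "country"
};

#define NUM_DEFAULT_LABELS (sizeof(DEFAULT_LABELS) / sizeof(DEFAULT_LABELS[0]))

/* Hypothetical helper: true if `label` is one of the default labels. */
bool is_default_label(const char *label) {
    if (label == NULL) {
        return false;
    }
    for (size_t i = 0; i < NUM_DEFAULT_LABELS; i++) {
        if (strcmp(label, DEFAULT_LABELS[i]) == 0) {
            return true;
        }
    }
    return false;
}

/* Hypothetical helper: given parallel arrays of labels and components
   (an assumed shape for parser output), return the first component
   tagged with `label`, or NULL if no component carries that label. */
const char *find_component(char **labels, char **components,
                           size_t num_components, const char *label) {
    if (labels == NULL || components == NULL || label == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < num_components; i++) {
        if (labels[i] != NULL && strcmp(labels[i], label) == 0) {
            return components[i];
        }
    }
    return NULL;
}
```

Because the parser accepts any label present in the training data, `is_default_label` returning false does not mean a label is invalid, only that it is not one of the defaults listed here.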

Examples of normalization
-------------------------
