New feature: Check location of OSM objects against list of regions #2333
This will be useful later when we index geometries.
@giggls Have a look at this. Might be interesting for you to simplify/speed up l10n processing.
        m_data.emplace_back(region.box(), n++);
    }

    m_rtree.insert(m_data.cbegin(), m_data.cend());
I don't think m_rtree gets cleared ever. So this function may produce duplicates in the index if someone calls locator.add() some time during the import in one of the callbacks.
You are right. Fixed.
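The duplicate problem raised above can be illustrated with a small sketch. This is a simplified Python model, not the C++ code from the PR: `RegionIndex`, `add()`, `build_index()` and `index_size_after_late_add()` are hypothetical names, and a plain list stands in for the R-tree. It only shows why an index that is rebuilt from the full region list has to be cleared first.

```python
class RegionIndex:
    """Toy stand-in for a locator: a region list plus an index built from it."""

    def __init__(self):
        self.regions = []  # (name, bbox) tuples in insertion order
        self.index = []    # (bbox, region_number) entries; stands in for the R-tree

    def add(self, name, bbox):
        self.regions.append((name, bbox))

    def build_index(self, clear_first=True):
        # Without clearing, every rebuild re-inserts all earlier regions.
        if clear_first:
            self.index.clear()
        for n, (_, bbox) in enumerate(self.regions):
            self.index.append((bbox, n))


def index_size_after_late_add(clear_first):
    """Build the index, add one more region, rebuild, and report the index size."""
    idx = RegionIndex()
    idx.add("a", (0, 0, 1, 1))
    idx.build_index(clear_first)
    idx.add("b", (1, 1, 2, 2))  # region added later, e.g. from a callback
    idx.build_index(clear_first)
    return len(idx.index)
```

With `clear_first=True` the index holds one entry per region; with `clear_first=False` the first region ends up in the index twice.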
tests/bdd/flex/locator.feature
            NAME[]
            """

    Scenario: Define a locator without name is okay
Title not adapted.
Fixed.
    if '-S' in context.osm2pgsql_params:
        context.osm2pgsql_params[context.osm2pgsql_params.index('-S') + 1] = str(outfile)
    else:
        context.osm2pgsql_params.extend(('-S', str(outfile)))
Please 'fix' the function below (setup_style_file()) in the same way.
Okay, fixed.
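Since the same replace-or-append logic is wanted in more than one step function, it could be factored into a small helper. The following is only a sketch: `set_param()` is a hypothetical name that does not appear in the PR, and the handling of a flag that is the last list item is an added assumption.

```python
def set_param(params, flag, value):
    """Replace the value following `flag` in `params`, or append both if absent."""
    if flag in params:
        pos = params.index(flag) + 1
        if pos < len(params):
            params[pos] = str(value)
        else:
            # Assumption: a trailing flag without a value simply gets one appended.
            params.append(str(value))
    else:
        params.extend((flag, str(value)))
    return params
```

Both step functions could then call `set_param(context.osm2pgsql_params, '-S', outfile)` instead of repeating the `if`/`else`.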
This commit introduces a new feature: Locators. A locator is initialized with one or more regions; each region has a name and a polygon or bounding box. The geometry of an OSM object can then be checked against this region list to figure out in which region(s) it is located. This check is much faster than doing it inside the database after import.

Locators can be used for all sorts of interesting features:

* Read a larger OSM file but import only data inside some area.
* Annotate each OSM object with the country (or other region) it is in. This can then, for instance, be used to show special highway shields for each country.
* Use the information about which region the data is in for further processing, for instance setting default values for the speed limit or using special language transliteration rules based on country.

Locators are created in Lua with `define_locator()`. Bounding boxes can be added with `add_bbox()`. Polygons can be added from the database by calling `add_from_db()` and specifying an SQL query which can return any number of rows, each defining a region with the name and the (multi)polygon as columns.

A locator can then be queried using `all_intersecting()`, which returns a list of names of all regions that intersect the specified OSM object geometry. Alternatively, the `first_intersecting()` function returns only a single region, for those cases where there can be no overlapping data or where the details of objects straddling region boundaries don't matter.

Several example config files are provided in the flex-config/locator directory.
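To make the difference between the two query functions concrete, here is a conceptual model in Python. It is not the osm2pgsql Lua API and not the PR's implementation: `BBoxLocator` is a hypothetical class that supports only bounding-box regions and point queries, and its method signatures are assumptions chosen for illustration.

```python
class BBoxLocator:
    """Toy locator: named bounding boxes that can be queried with a point."""

    def __init__(self):
        self._regions = []  # (name, min_x, min_y, max_x, max_y)

    def add_bbox(self, name, min_x, min_y, max_x, max_y):
        self._regions.append((name, min_x, min_y, max_x, max_y))

    def all_intersecting(self, x, y):
        # Names of every region containing the point. This toy returns them in
        # insertion order; callers should not rely on any particular order.
        return [name for name, x0, y0, x1, y1 in self._regions
                if x0 <= x <= x1 and y0 <= y <= y1]

    def first_intersecting(self, x, y):
        # Stop at the first hit; None if the point lies in no region.
        for name, x0, y0, x1, y1 in self._regions:
            if x0 <= x <= x1 and y0 <= y <= y1:
                return name
        return None


locator = BBoxLocator()
locator.add_bbox("west", 0, 0, 10, 10)
locator.add_bbox("east", 5, 0, 15, 10)  # overlaps "west" for 5 <= x <= 10
```

A point in the overlap, such as `(7, 5)`, is reported in both regions by `all_intersecting()`, while `first_intersecting()` returns just one of them. That is why the latter is only appropriate when regions do not overlap or the overlap does not matter.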
I think it might also be relevant for this long-running openstreetmap-carto issue:
Just curious: what makes this processing by osm2pgsql so much faster than doing this in the database after the import? And what about the memory usage of osm2pgsql when doing this type of processing? Any concerns to be aware of? (I guess not, as you already showed a kind of worst-case scenario with country-size locator regions and the massive building dataset of OpenStreetMap in the example configs.)
I think this is mostly because everything happens in memory. Once the Locator is set up, the checks run completely in memory. The OSM data is loaded anyway as part of the osm2pgsql processing, so it is available for the check. Doing this in the database means getting all the data from disk, doing all the checks and then doing lots of database updates involving more IO, including all the clever things the database has to do to keep consistent and all that. There is just so much more work involved. Memory usage is negligible.
This raises one more question: I can imagine it being very useful to have the 'all_intersecting()' function return geometries in a specific sorted order, e.g. from largest in terms of area to smallest, to be able to use that in subsequent processing in Lua. Think of a locator built on all OpenStreetMap boundary relations of all levels, where you would like to see the largest-to-smallest set of intersecting boundary relations for a specific OSM object. Does the current implementation implement any type of sort, or is the returned set's order completely arbitrary?
It is completely arbitrary. It would be much more expensive to do this in any sorted way. For the use case you describe you'd do it differently anyways: You'll only use the smallest divisions and use that data for the rest of the information. You don't have to check a point against all states of the USA and against the USA boundary, you check it against all states and if it is in a state it must also be in the USA. If there is any area left that is in the USA but in no state, you'll have to also check against that. It needs a bit of pre-processing but that saves you a lot of checks later.
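The approach described here can be sketched as a lookup table built during pre-processing: only the smallest divisions are checked geometrically, and the larger ones are derived from the result. The sketch below is an assumption-laden illustration, not code from the PR; `PARENT` and `with_ancestors()` are hypothetical names and the sample entries are made up.

```python
# Hypothetical table prepared in pre-processing: each division maps to the
# next larger division that fully contains it.
PARENT = {
    "Travis County": "Texas",
    "Texas": "USA",
    "Alaska": "USA",
}


def with_ancestors(region, parent=PARENT):
    """Return the region followed by all larger divisions that contain it."""
    chain = [region]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain
```

The single region name returned by the geometric check is passed to `with_ancestors()`, which yields the smallest-to-largest chain without any further geometry tests. Areas that belong to a larger division but to none of its subdivisions would still need their own region in the locator.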