Open
Description
I've run into an odd issue working with some data found in the wild. Consider this MRE:
library(sf)
library(s2)
# sf polygon as loaded from a database
poly <- structure(list(list(structure(c(12.9718058342186, 12.9718058342186,
12.9728528296958, 12.9994465148162, 13.0258308008412, 13.0090788732063,
12.9931645419531, 12.9676178523098, 12.9529599156293, 12.9223876476955,
12.8830206177534, 12.8428159914296, 12.7963293922426, 12.7460736093378,
12.7305012888338, 12.7291122826075, 12.7257618970805, 12.725552497985,
12.7259712961759, 12.7351848563751, 12.7542401740599, 12.7718296980765,
12.7860688365662, 12.785931374378, 12.801773768724, 12.8160129072137,
12.8537047443923, 12.9047981236788, 12.9491907319114, 12.9718058342186,
-4.04623539727417, -4.04623539727417, -4.05062180563782, -4.0750599304832,
-4.1013770804975, -4.10806066333969, -4.12226309084477, -4.14356625610239,
-4.16236269020371, -4.18742390394239, -4.21331964830488, -4.22626719749194,
-4.23169675058643, -4.2296084654374, -4.23006522722775, -4.21833162843474,
-4.18136751731289, -4.15421762373774, -4.1429397005941, -4.1201745144563,
-4.09281365829253, -4.07861070745388, -4.06858494389262, -4.06833641784592,
-4.05960342472729, -4.05166618507748, -4.04289335590868, -4.04080457305779,
-4.0433111118331, -4.04623539727417), dim = c(30L, 2L)))), class = c("XY",
"MULTIPOLYGON", "sfg"))
# the polygon contains degenerate vertices, so we fix it using s2_union()
as_s2_geography(poly) # Error: Loop 0 is not valid: Edge 0 is degenerate (duplicate vertex)
as_s2_geography(poly, check = FALSE) |> s2_union() |> s2_is_valid() # TRUE
# now it gets weird... surely this tiny area does cover the entire Earth globe?
as_s2_geography(poly, check = FALSE) |> s2_union() |> s2_area() # ~ 510e6 km2
as_s2_geography(poly, check = FALSE, oriented = FALSE) |> s2_union() |> s2_area() # ~ 510e6 km2
as_s2_geography(poly, check = FALSE, oriented = FALSE, planar = TRUE) |> s2_union() |> s2_area() # ~ 510e6 km2
# doing a roundtrip conversion to sf fixes the area calculation
as_s2_geography(poly, check = FALSE) |> s2_union() |> st_as_sf() |> as_s2_geography() |> s2_area() # ~ 488 km2
# lwgeom also produces the correct calculation
lwgeom::st_geod_area(st_sfc(poly, crs = "WGS84")) # ~ 488 km2
It seems that s2 interprets the polygon as a hole on the Earth’s surface, which kind of makes sense due to the CW orientation of the vertices in the data. There are a few things that confuse me, however:
- Shouldn't
oriented = FALSE
take care of this by reorienting the polygon so that the exterior area is larger than the interior area? - Why does the orientation change when converting it back to
sf
?
In general, how should one deal with situations like this? And how does one catch it to ensure correct data treatment? Right now, what I am doing is manually checking the polygon orientation and reversing the vertex sequence if needed (with logging to double-check the data later), which seems cumbersome?
P.S. If I manually remove the duplicate vertex at index 2, the result is as expected:
xy <- st_coordinates(poly) |> as_tibble()
xy <- xy[-2, ] # remove the 2 vertex
pp <- s2_make_polygon(xy$X, xy$Y, oriented = FALSE)
s2_area(pp) # ~ 488 km2
Metadata
Metadata
Assignees
Labels
No labels