
Python-based CLI tool to index raster files to DGGS in parallel, writing out to Parquet.

Currently this supports the following DGGSs:

- [H3](https://h3geo.org/)
- [rHEALPix](https://datastore.landcareresearch.co.nz/dataset/rhealpix-discrete-global-grid-system)
- [S2](http://s2geometry.io/)

And these geocode systems:

- [Geohash](https://en.wikipedia.org/wiki/Geohash)
- [Maidenhead Locator System](https://en.wikipedia.org/wiki/Maidenhead_Locator_System)

Contributions (particularly for additional DGGSs), suggestions, bug reports and strongly worded letters are all welcome.

![Example use case for raster2dggs, showing how an input raster can be indexed at different DGGS resolutions, while retaining information in separate, named bands](docs/imgs/raster2dggs-example.png "Example use case for raster2dggs, showing how an input raster can be indexed at different H3 resolutions, while retaining information in separate, named bands")

```
Options:
  --help  Show this message and exit.

Commands:
  geohash     Ingest a raster image and index it using the Geohash...
  h3          Ingest a raster image and index it to the H3 DGGS.
  maidenhead  Ingest a raster image and index it using the Maidenhead...
  rhp         Ingest a raster image and index it to the rHEALPix DGGS.
  s2          Ingest a raster image and index it to the S2 DGGS.
```

Output is in the Apache Parquet format, a directory with one file per partition.

For a quick view of your output, you can read Apache Parquet with pandas, and then use h3-pandas and geopandas to convert this into a GeoPackage for visualisation in a desktop GIS, such as QGIS. The Apache Parquet output is indexed by the DGGS column, so it should be ready for association with other data prepared in the same DGGS.
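
Because the index is the DGGS cell ID, two outputs produced at the same resolution can be associated with an ordinary index join. A minimal sketch (the paths are assumptions, and both datasets must have been indexed at the same resolution):

```python
import pandas as pd

# Both inputs are assumed to be raster2dggs outputs at the same H3 resolution,
# so their shared cell-ID index supports a direct attribute join.
imagery = pd.read_parquet('./tests/data/output/9/Sen2_Test')
dem = pd.read_parquet('./tests/data/output/9/TestDEM')
joined = imagery.join(dem, how='inner', lsuffix='_sen2', rsuffix='_dem')
```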

<details>
<summary>For H3 output...</summary>

```python
>>> import pandas as pd
>>> import h3pandas
...
h3_09
...
[5656 rows x 10 columns]
>>> o.h3.h3_to_geo_boundary().to_file('~/Downloads/Sen2_Test_h3-9.gpkg', driver='GPKG')
```
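
The same workflow as a self-contained script (the Parquet path and output filename are assumptions; `h3_09` indicates indexing at H3 resolution 9):

```python
import pandas as pd
import h3pandas  # registers the .h3 accessor used below

# Path is an assumption; point this at your own raster2dggs output directory.
o = pd.read_parquet('./tests/data/output/9/Sen2_Test')
# The H3 cell ID is the DataFrame index, so cell boundary polygons can be built directly.
gdf = o.h3.h3_to_geo_boundary()
gdf.to_file('Sen2_Test_h3-9.gpkg', driver='GPKG')
```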
</details>

<details>
<summary>For rHEALPix output...</summary>

For rHEALPix DGGS output, you can use [`rHP-Pandas`](https://github.com/manaakiwhenua/rHP-Pandas):

```python
...
R88727068808 22 39 43 80 146 163 177 198 165 83
...
[223104 rows x 10 columns]
>>> o.rhp.rhp_to_geo_boundary().to_file('~/Downloads/Sen2_Test_rhp-11.gpkg', driver='GPKG')
```
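
The same workflow as a self-contained script (the import name and Parquet path are assumptions; adjust them to your own output):

```python
import pandas as pd
import rhppandas  # assumed import name for rHP-Pandas, by analogy with h3pandas

# Path is an assumption; point this at your own raster2dggs output directory.
o = pd.read_parquet('./tests/data/output/11/Sen2_Test')
# The rHEALPix cell ID is the DataFrame index, so cell boundary polygons can be built directly.
gdf = o.rhp.rhp_to_geo_boundary()
gdf.to_file('Sen2_Test_rhp-11.gpkg', driver='GPKG')
```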
</details>

<details>
<summary>For S2 output...</summary>

For S2 output, use [`s2sphere`](https://pypi.org/project/s2sphere/):

```python
import pandas as pd
import geopandas as gpd
import s2sphere
from shapely.geometry import Polygon

df = pd.read_parquet('./tests/data/output/7/sample_tif_s2')
df = df.reset_index()

def s2id_to_polygon(s2_id_hex):
    # Parse the S2CellId from its hex token
    cell_id = s2sphere.CellId.from_token(s2_id_hex)
    cell = s2sphere.Cell(cell_id)

    # Get the 4 vertices of the S2 cell
    vertices = []
    for i in range(4):
        vertex = cell.get_vertex(i)
        # Convert to lat/lon degrees
        lat_lng = s2sphere.LatLng.from_point(vertex)
        vertices.append((lat_lng.lng().degrees, lat_lng.lat().degrees))  # (lon, lat)

    return Polygon(vertices)

df['geometry'] = df['s2_15'].apply(s2id_to_polygon)
gdf = gpd.GeoDataFrame(df, geometry='geometry', crs='EPSG:4326') # WGS84
gdf.to_parquet('sample_tif_s2_geoparquet.parquet')
```
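
To check the result, the GeoParquet file can be read back with GeoPandas (a quick sketch):

```python
import geopandas as gpd

gdf = gpd.read_parquet('sample_tif_s2_geoparquet.parquet')
print(gdf.head())
```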
</details>

<details>
<summary>For Geohash output...</summary>

For Geohash output, you can use [`python-geohash`](https://pypi.org/project/python-geohash/) or another similar Geohash library. For example:

```python
import pandas as pd
import geohash
from shapely.geometry import Point, box
import geopandas as gpd
o = pd.read_parquet('./tests/data/output/8/sample_geohash')


def geohash_to_geometry(gh, mode="polygon"):
lat, lon, lat_err, lon_err = geohash.decode_exactly(gh)

if mode == "point":
return Point(lon, lat)
elif mode == "polygon":
return box(lon - lon_err, lat - lat_err, lon + lon_err, lat + lat_err)
else:
raise ValueError("mode must be 'point' or 'polygon'")

o["geometry"] = o["geohash_08"].apply(lambda gh: geohash_to_geometry(gh, mode="polygon"))

'''
band geohash_08 1 2 3 geometry
0 u170f2sq 0 0 0 POLYGON ((4.3238067626953125 52.16686248779297...
1 u170f2sr 0 0 0 POLYGON ((4.3238067626953125 52.16703414916992...
2 u170f2sw 0 0 0 POLYGON ((4.324150085449219 52.16686248779297,...
3 u170f2sx 0 0 0 POLYGON ((4.324150085449219 52.16703414916992,...
4 u170f2sy 0 0 0 POLYGON ((4.324493408203125 52.16686248779297,...
... ... .. .. .. ...
232720 u171mc2g 0 0 0 POLYGON ((4.472808837890625 52.258358001708984...
232721 u171mc2h 0 0 0 POLYGON ((4.471778869628906 52.25852966308594,...
232722 u171mc2k 0 0 0 POLYGON ((4.4721221923828125 52.25852966308594...
232723 u171mc2s 0 0 0 POLYGON ((4.472465515136719 52.25852966308594,...
232724 u171mc2u 0 0 0 POLYGON ((4.472808837890625 52.25852966308594,...

[232725 rows x 5 columns]
'''

gdf = gpd.GeoDataFrame(o, geometry="geometry", crs="EPSG:4326")
gdf.to_file('sample.gpkg')
```
</details>

## Installation

Two sample files have been uploaded to an S3 bucket with `s3:GetObject` public permission.
- `s3://raster2dggs-test-data/Sen2_Test.tif` (sample Sentinel 2 imagery, 10 bands, rectangular, Int16, LZW compression, ~10x10m pixels, 68.6 MB)
- `s3://raster2dggs-test-data/TestDEM.tif` (sample LiDAR-derived DEM, 1 band, irregular shape with null data, Float32, uncompressed, 10x10m pixels, 183.5 MB)

You may use these for testing. However, you can also test with local files, which will be faster. A good, small (5 MB) sample image is available [here](https://github.com/mommermi/geotiff_sample).
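
For example, one way to fetch a sample with `boto3` (a sketch; it assumes anonymous, unsigned requests are sufficient for this public bucket):

```python
import boto3
from botocore import UNSIGNED
from botocore.config import Config

# Anonymous client: the bucket grants public s3:GetObject, so no credentials are required.
s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))
s3.download_file('raster2dggs-test-data', 'Sen2_Test.tif', 'Sen2_Test.tif')
```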

## Example commands
