Commit 7d1bee8

committed
Upload to github
1 parent b3471c2 commit 7d1bee8

18 files changed: +3739 -0 lines changed

.gitignore

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
# Jupyter Notebook
.ipynb_checkpoints

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Old scripts
obsolete_scripts/

# output
notebooks/figures/

# Folder storing damage statistic analysis.
create-damage-function/

README.md

Lines changed: 129 additions & 0 deletions
@@ -0,0 +1,129 @@
# Flood risk assessment of linear structures: A Gaussian Process framework.

This repository contains a complete set of scripts and notebooks used to estimate the *Expected Annual Damage* (EAD) to roads associated with river floods in Portugal. The framework may be adapted to other regions and other types of hazards, provided the hazard is represented by intensity parameters tied to a damage function. To estimate the uncertainty associated with the EAD, the damage is modeled as a spatial random field. The spatial correlation of this random field is decisive for the uncertainty of the EAD. To this end, Gaussian Processes are used to endow the uncertainty of the flood damage function with spatial correlation.

The methodology is described in the article *Uncertainty in flood risk assessment of linear structures: Why correlation matters*.

## Data
All the data used for the analysis is freely available. Where to find it and how to download it is described below.

## Damage Assessment
The following instructions may be used to rerun the analysis from *Uncertainty in flood risk assessment of linear structures: Why correlation matters*. The assessment is implemented as a series of steps.

Make sure you have a working Python environment (see the notes below), and select a folder for storing all the data associated with the analysis:
```bash
export DATADIR=/PATH/TO/FOLDER
```

### 1. Load the floodmaps.
To download and rasterize the floodmaps for Portugal, use the script `load_floodmaps.py`:
```bash
python load_floodmaps.py
```
The script loads a set of floodmaps for Portugal from [sniamb](https://sniamb.apambiente.pt/).
The output folder is a subfolder of `DATADIR` named `floodmaps`.
Before running the script, ensure that GDAL is available from the command line:
the script calls `gdal_rasterize` to rasterize the floodmaps and `gdal_merge` to merge them into a single multiband raster, `features.tif`.

### 2. Filter elements from OSM and assign floodmap features to road segments.
The script `assign_raster_to_osm_elements.py` uses [osmium](https://osmcode.org/pyosmium/) to load elements from OpenStreetMap. For each element, raster values are assigned as features, named according to the band descriptions in the raster file. The raster is not loaded into memory; this makes the procedure a bit slower, but enables the use of large rasters. First download the OSM extract for the selected region from [geofabrik](https://download.geofabrik.de/):
```bash
[DATADIR]$ wget https://download.geofabrik.de/europe/portugal-latest.osm.pbf
```
Next,
```bash
$ python assign_raster_to_osm_elements.py $DATADIR/floodmaps/merged_floodmaps/features.tif $DATADIR/portugal-latest.osm.pbf assigned.json
```
The selection of elements is hardcoded, but may easily be changed. (As an improvement, one may specify a filter in a JSON file using code similar to that found in the script `filtering.py`.) A minimal sketch of this kind of filtering is shown below.
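
The repository's actual selection is hardcoded in the script; the tag set below is a hypothetical example, not the real filter. A pyosmium sketch of this kind of element filtering:
```python
import osmium

class RoadCounter(osmium.SimpleHandler):
    """Count OSM ways carrying selected highway tags (hypothetical filter)."""

    def __init__(self, keep_tags):
        super().__init__()
        self.keep_tags = keep_tags
        self.n_kept = 0

    def way(self, w):
        # Keep only ways whose 'highway' tag is in the selected set.
        if w.tags.get('highway') in self.keep_tags:
            self.n_kept += 1

handler = RoadCounter({'motorway', 'trunk', 'primary'})
handler.apply_file('portugal-latest.osm.pbf')
print(handler.n_kept, "ways kept")
```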

### 3. Generate random fields for damage sampling.
Generating random fields is computationally expensive. To avoid sampling values over the entire map, and instead sample only where values are needed, the first script generates a mask (see the sketch after the parameter list):
```bash
$ python create_intersect.py $DATADIR/assigned.json [epsg_NR] $DATADIR/coords.csv $DATADIR/intersects.tif [xres] [yres]
```
- `assigned.json` is the geojson generated in step 2.
- `epsg_NR` is the EPSG code of the reference system assigned to `intersects.tif`. For Portugal we apply 27429.
- `coords.csv` is the filename/path of a generated CSV listing all the points associated with each segment in `assigned.json`. For each point it records the OpenStreetMap id (`id`), the position of the point in the segment (`coo_nr`), the position (`x`, `y`), and the position in the raster (`row`, `col`).
- `intersects.tif` is the filename/path of the generated mask.
- `xres yres` is the spatial resolution of the generated mask.

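A minimal sketch of how such a mask can be built with rasterio; this illustrates the idea rather than `create_intersect.py` itself, and the extent, shape, and points below are hypothetical:
```python
import numpy as np
import rasterio
from rasterio.transform import from_origin, rowcol

xres, yres = 100.0, 100.0                              # mask resolution in meters
transform = from_origin(464852, 4632077, xres, yres)   # hypothetical extent
mask = np.zeros((5361, 1660), dtype=np.uint8)          # hypothetical shape

points = [(465000.0, 4630000.0), (465100.0, 4629900.0)]  # hypothetical segment points
for x, y in points:
    row, col = rowcol(transform, x, y)
    mask[row, col] = 1   # mark only the pixels where field values are needed

with rasterio.open('intersects.tif', 'w', driver='GTiff',
                   height=mask.shape[0], width=mask.shape[1], count=1,
                   dtype='uint8', crs='EPSG:27429', transform=transform) as dst:
    dst.write(mask, 1)
```
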
The second script generates random fields with values at the points specified by the mask. The spatial field is a Gaussian Process with mean zero and the covariance kernel
$$
k(x,y) = \exp\left(-\frac{|x-y|}{\ell}\right)
$$
where $\ell$ is known as the decorrelation length and is measured in meters. To generate the random fields:

```bash
$ python gaussian-random-field.py $DATADIR/intersects.tif $DATADIR/random-fields/l-[l]/random_field.tif --add_mask N l
```
- `intersects.tif` is the raster mask generated in the previous step. It specifies the part of the raster for which to generate the field. Tested with Type=Byte.
- `random_field.tif` is the generated random field. A number is appended before `.tif`, yielding `random_field-1.tif`, etc.
- `--add_mask` writes the mask to each random field. This is particularly useful for displaying the fields in QGIS.
- `N` is the number of fields to be written. Each sample is a simple matrix multiplication; it is the construction of the matrix that takes time (see the sketch below).
- `l` is the decorrelation length applied in the kernel.
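
To illustrate why the matrix construction dominates, here is a minimal sketch of sampling a zero-mean Gaussian Process with the exponential kernel; this is an assumption about the approach, not the repository's `gaussian-random-field.py` implementation. The covariance matrix and its Cholesky factor are built once, after which each sample is a single matrix product:
```python
import numpy as np

def sample_gp(points, ell, n_samples, seed=0):
    """points: (n, 2) array of coordinates in meters; ell: decorrelation length."""
    rng = np.random.default_rng(seed)
    # Pairwise distances between the masked raster points (the O(n^2) part).
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    cov = np.exp(-dist / ell)  # exponential covariance kernel k(x, y)
    # Cholesky factorization, done once (jitter added for numerical stability).
    chol = np.linalg.cholesky(cov + 1e-10 * np.eye(len(points)))
    # Each sample is just a matrix product with standard normal noise.
    return chol @ rng.standard_normal((len(points), n_samples))
```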

To merge all the random fields into a `.vrt` file, apply
```bash
[DATADIR/random-fields/l-[l]]$ gdalbuildvrt -separate -o random_fields.vrt random_field-*.tif
```

### 4. Assigning region codes.
If you want to aggregate values on a regional level, you will need to assign a region id to the features.
First download the [region codes](https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/nuts) and store them under `[DATADIR]/nuts`. These are shapefiles. Generating a raster from a shapefile may be done using `gdal_rasterize`, and filtering the shapefile to include the regions of interest may be performed with `filtering.py`. That is,
```bash
# filter region and add counter id.
python filtering.py -c id -f nuts_filter.json $DATADIR/nuts/NUTS_RG_03M_2021_4326.shp $DATADIR/nuts/portugal_nuts.shp
# convert coordinates.
ogr2ogr -t_srs EPSG:27429 $DATADIR/nuts/portugal_nuts27429.shp $DATADIR/nuts/portugal_nuts.shp
# write regions to raster. First get the extent of the intersect raster.
gdalinfo $DATADIR/intersects.tif -json | jq .cornerCoordinates
gdal_rasterize -a id -tr 100 100 -te 464852 4096019 630794 4632077 $DATADIR/nuts/portugal_nuts27429.shp $DATADIR/nuts/portugal_nuts.tif
```
Finally, having the region codes in raster format, apply the script `assign_field_from_raster.py`, e.g.,
```bash
python assign_field_from_raster.py -c $DATADIR/assigned.json $DATADIR/nuts/portugal_nuts.tif $DATADIR/region-assigned.json region
```
Note: As an alternative to assigning a region to each element in `assigned.json`, one may filter the geopandas dataframe in the notebook `estimate-damage.ipynb` instead. This is not implemented, but is probably a simpler and more flexible approach; a sketch is given below.
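
A sketch of that alternative (assumed, not implemented in this repository), using a geopandas spatial join in place of rasterized region codes; the column name `id` matches the counter id added by the filtering step in the commands above:
```python
import geopandas as gpd

# Road segments with assigned floodmap features, reprojected to the NUTS CRS.
segments = gpd.read_file('assigned.json').to_crs(epsg=27429)
regions = gpd.read_file('portugal_nuts27429.shp')

# Each segment inherits the id of the region polygon it intersects.
segments = gpd.sjoin(segments, regions[['id', 'geometry']],
                     how='left', predicate='intersects')
```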


### 5. Assign damage to elements.
The rest of the analysis is carried out in Jupyter notebooks. The reason is that there are many choices along the way, and the notebooks serve as documentation of the analysis. Further, the investigation of the results is easily extended in this setting. The first part is concerned with fitting a damage function; this is done in `damage-function.ipynb`. Then, the final analysis is done in the notebook `estimate-damage.ipynb`. A sketch of the EAD integration such an analysis ends with is given below.
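
How an EAD estimate is typically formed from scenario damages; this is the standard computation, not necessarily the notebook's code, and the return periods and damages are hypothetical:
```python
import numpy as np

return_periods = np.array([10.0, 20.0, 50.0, 100.0])  # years (hypothetical)
damages = np.array([1.0e5, 2.5e5, 6.0e5, 9.0e5])      # damage per scenario (hypothetical)

# EAD is the integral of damage over annual exceedance probability.
prob = 1.0 / return_periods
order = np.argsort(prob)          # integrate over increasing probability
ead = np.trapz(damages[order], prob[order])
print(f"EAD: {ead:.0f} per year")
```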

## Notes on working environment.
The Python version is set in `.python-version`, as used by pyenv. Use `requirements.txt` to create a local environment. The path to the environment can be set in `venv.sh` (used to activate the environment's Python shell).

### Dependencies.

Make sure you have GDAL installed. To install the [Python bindings for GDAL](https://mothergeo-py.readthedocs.io/en/latest/development/how-to/gdal-ubuntu-pkg.html#install-gdal-for-python):
```bash
sudo apt-get install libgdal-dev
```
Next,
```bash
export CPLUS_INCLUDE_PATH=/usr/include/gdal
export C_INCLUDE_PATH=/usr/include/gdal
```
and then
```bash
pip install GDAL==version
```
where `version` can be found by `gdal-config --version`. Here is a (possibly incomplete) collection of libraries I had to install:

```bash
libsuitesparse-dev build-essential cmake libboost-dev libexpat1-dev zlib1g-dev libbz2-dev
```

## Funding
The development of the framework has received funding from the European Community's H2020 Program MG-7-1-2017, Resilience to extreme (natural and human-made) events, under Grant Agreement number 769255, "GIS-based infrastructure management system for optimized response to extreme events of terrestrial transport networks (SAFEWAY)".
The support is gratefully acknowledged.

## Further development.
The current framework does not estimate uncertainty with respect to the flood intensity parameters. The framework should be adapted to also consider the uncertainty associated with the flood maps.

Some straight OSM road segments have large distances between their spatial coordinates. A refinement step to enforce an upper bound on the distance between segment coordinates may be implemented as part of the `assign_raster_to_osm_elements.py` script.

assign_field_from_raster.py

Lines changed: 123 additions & 0 deletions
@@ -0,0 +1,123 @@
import argparse
import json
import logging
import os

import numpy as np
import rasterio
from pyproj import Proj, Transformer
from rasterio.transform import rowcol
from rasterio.windows import Window

from config import LOG_LEVEL, LOG_FORMAT, LOG_DIR

logging.basicConfig(level=LOG_LEVEL, format=LOG_FORMAT)

logfile = "assign_field_from_raster-log.txt"


def main():

    description_str = """
    Assigns a spatial field from a raster to the spatial_fields property of each
    geojson feature. If the spatial field already exists, the values are appended.
    """
    parser = argparse.ArgumentParser(prog="assign_field_from_raster.py",
                                     description=description_str)
    parser.add_argument('input_geojson', type=str,
                        help='geojson containing features to assign values to.')
    parser.add_argument('raster', type=str,
                        help='raster file')
    parser.add_argument('assigned_geojson', type=str,
                        help='out file')
    parser.add_argument('field_name', type=str,
                        help='Name of the assigned field.')
    parser.add_argument('-c', '--categorical', action='store_true',
                        help="Raster contains a categorical value to be assigned to each feature")
    args = parser.parse_args()

    # Add a file handler to the root logger.
    file_handler = logging.FileHandler(filename=os.path.join(LOG_DIR, logfile))
    log_formatter = logging.Formatter(fmt=LOG_FORMAT)
    file_handler.setFormatter(log_formatter)
    file_handler.setLevel(LOG_LEVEL)
    logging.getLogger().addHandler(file_handler)

    with open(args.input_geojson, 'r') as file:
        logging.info("Reads elements from file: {}".format(args.input_geojson))
        elements = json.load(file)

    assigned = {
        "type": "FeatureCollection",
        "features": [],
    }

    with rasterio.open(args.raster) as dataset:
        logging.info("Reads data from raster: {}".format(dataset.name))
        rastercoords_from_lonlat = Transformer.from_proj(
            Proj('epsg:4326'),  # source coordinates (lonlat)
            Proj(dataset.crs),  # target coordinates
            always_xy=True      # use easting-northing, longitude-latitude order
        )

        nr_of_assigned_features = 0
        for feature in elements['features']:
            if nr_of_assigned_features % 100 == 0:
                logging.info("Assigned features: {}".format(nr_of_assigned_features))

            # Transform the segment coordinates to raster row/col indices.
            coords = feature["geometry"]["coordinates"]
            xs, ys = rastercoords_from_lonlat.transform(*zip(*coords))
            rows, cols = rowcol(dataset.transform, xs, ys)
            # Read only the window covering the segment, not the whole raster.
            window, window_rows, window_cols = get_window(rows, cols)
            array = dataset.read(out_dtype=np.float64, window=window)

            try:
                spatial_field_list = np.round(array[:, window_rows, window_cols], 3).tolist()

            except IndexError as error:
                logging.warning("{} - OSM segment is outside of raster bounds.".format(error))
                # Keep only the points inside the raster bounds.
                contained_in_raster = [0 <= row < dataset.shape[0] and 0 <= col < dataset.shape[1]
                                       for (row, col) in zip(rows, cols)]
                rows = [row for (contained, row) in zip(contained_in_raster, rows) if contained]
                cols = [col for (contained, col) in zip(contained_in_raster, cols) if contained]
                window, window_rows, window_cols = get_window(rows, cols)
                array = dataset.read(out_dtype=np.float64, window=window)

                # Append zero values outside of raster bounds.
                padded_array = np.zeros([dataset.count, len(contained_in_raster)])
                padded_array[:, contained_in_raster] = array[:, window_rows, window_cols]
                spatial_field_list = np.round(padded_array, 3).tolist()

            if not args.categorical:
                if args.field_name in feature["properties"]["spatial_fields"]:
                    # Append to existing values.
                    feature["properties"]["spatial_fields"][args.field_name].extend(spatial_field_list)
                else:
                    # Create a new property.
                    feature["properties"]["spatial_fields"][args.field_name] = spatial_field_list
            else:
                # Categorical value: assign the most frequent value as the property.
                # bincount requires non-negative integers, so cast the raster values.
                values = np.asarray(spatial_field_list[0], dtype=int)
                feature["properties"][args.field_name] = int(np.bincount(values).argmax())
            assigned["features"].append(feature)
            nr_of_assigned_features += 1
        logging.info("Done processing features. Updated {} features".format(nr_of_assigned_features))
    with open(args.assigned_geojson, 'w') as outfile:
        json.dump(assigned, outfile)
        logging.info("Wrote to file: {}".format(args.assigned_geojson))


def get_window(rows, cols):
    # Bounding window of the segment in raster coordinates.
    col_off = min(cols)
    row_off = min(rows)
    width = max(cols) - min(cols) + 1
    height = max(rows) - min(rows) + 1

    # Row/col indices relative to the window origin.
    window_rows = [row - row_off for row in rows]
    window_cols = [col - col_off for col in cols]

    return Window(col_off, row_off, width, height), window_rows, window_cols


if __name__ == "__main__":
    main()
