11# Earthquakes_Italy
22
3- This is the repository for an extended case study of earthquake forecasting in
3+ This repository provides the source code (_R_ language) for an extended case study of earthquake forecasting in
44Italy:
55
6- Jonas Brehmer, Kristof Kraus, Tilmann Gneiting, Marcus Herrmann, and
7- Warner Marzocchi (2024+). Comparative evaluation of earthquake forecasting
8- models: An application to Italy.
6+ > Brehmer, J. R., Kraus, K., Gneiting, T., Herrmann, M., and Marzocchi, W. (2025).
7+ > Enhancing the Statistical Evaluation of Earthquake Forecasts—An Application to Italy.
8+ > _Seismological Research Letters_.
9+ > doi: [10.1785/0220240209](https://doi.org/10.1785/0220240209).
10+ > arXiv: [2405.10712](https://arxiv.org/abs/2405.10712)
911
10- Parts of this case study and the corresponding code have been used in
12+ Parts of this case study and the corresponding code have also been used in:
1113
12- Jonas Brehmer, Tilmann Gneiting, Marcus Herrmann, Warner Marzocchi, Martin
13- Schlather, and Kirstin Strokorb (2023). Comparative evaluation of point
14- process forecasts.
15-
16- ## Differences to previously reported predictive performance
17-
18- We report slightly different Poisson scores compared to
19-
20- J. Brehmer, T. Gneiting, M. Herrmann, W. Marzocchi, M.
21- Schlather, and K. Strokorb (2023). Comparative evaluation of point
22- process forecasts.
23-
24- Due to numerical inaccuracies three earthquakes had been unnecessarily excluded
25- from analysis (these were the earthquake going with the timestamp 2016-05-30 20:24:20.460,
26- 2016-08-26 04:28:25.890, and 29019-10-25 04:31:38.200).
27- Additionally, we adhere to the CSEP binning which includes lower boundaries to a cell and
28- excludes upper boundaries, whereas Brehmer et al. used the opposite rule. This results in four
29- earthquakes being assigned to a neighboring cells.
30-
31- Likewise, we report different IGPE values compared to
32-
33- M. Herrmann and W. Marzocchi (2023). Maximizing the forecasting skill of an ensemble
34- model. Geophysical Journal International.
35-
36- The reason is again a slightly different binning of earthquakes registered on cell boundaries.
37- While Herrmann et al. also use the CSEP binning, numerical inaccuracies entailed that five
38- earthquakes had been assigned to a wrong cell in terms of the CSEP binning.
14+ > Brehmer, J. R., Gneiting, T., Herrmann, M., Marzocchi, W., Schlather, M., and Strokorb, K. (2024).
15+ > Comparative evaluation of point process forecasts.
16+ > _Annals of the Institute of Statistical Mathematics 76_ (1), 47–71.
17+ > doi: [10.1007/s10463-023-00875-5](https://doi.org/10.1007/s10463-023-00875-5).
18+ > arXiv: [2103.11884](https://arxiv.org/abs/2103.11884)
3919
4020## Code
4121
@@ -44,7 +24,7 @@ The tables and plots can be reproduced with the following files:
4424- **data_prep.R** Loads all the data (see below) and pre-processes them. Called at
4525the start of all other main scripts.
4626- **functions_prep.R** Provides functions to load and preprocess the data
47- - **precompute.R** pre-compute values for empirical CDFs, Murphy
27+ - **precompute.R** pre-compute values for empirical CDFs, Murphy
4828diagrams, score component plot and reliability diagram
4929- **plots_and_tables.R** comprises all the functionality to calculate the values for the
5030tables and to create the plots of the publication apart from simulation study plots
@@ -53,8 +33,10 @@ on its reliability curve
5333- **sim_study_tests.R** compares the CSEP t-test and the Diebold-Mariano test on forecasts derived from the LM, FCM and LG models
5434- **utils.R** defines the plot theme as well as scoring and test functions
5535
36+ The code was developed on a machine with 16 GB of RAM. This much memory is required if all the models are
37+ kept in memory simultaneously during the analysis.
5638
57- ## Data
39+ ## Data input
5840
5941- **models** (four files): Two-dimensional arrays which contain the forecasts of
6042the models. The rows represent different model run times. The columns represent
@@ -67,10 +49,67 @@ degrees and are numbered consecutively.
6749- **catalog** file: Contains the details (time stamp, magnitude, etc.) of
6850earthquakes in the testing region and during the testing period (but time and
6951space of model forecasts and catalog do not exactly match)
70- - **climatology** file: Rates for a selection of longitude and latitude values. If
52+ - **climatology** file: Rates for a selection of longitude and latitude values. If
7153appropriately scaled, they can be understood as climatological forecast which
7254are constant in time. The scaling depends on the assumed number of events in a
73557-day period, see Mail by Warner Marzocchi (08.09.21)
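
The scaling of the climatology mentioned in the last item can be sketched in R as follows. This is a minimal illustration, not the repository's code: the function name `scale_climatology`, the example rates, and the event count are all hypothetical, and the sketch assumes that "appropriately scaled" means the cell rates should sum to the assumed number of events in a 7-day period.

```r
# Hypothetical helper (not part of the repository): rescale raw climatological
# cell rates so that they sum to the assumed number of events per 7-day period.
scale_climatology <- function(rates, n_events_7d) {
  stopifnot(all(rates >= 0), sum(rates) > 0, n_events_7d > 0)
  rates / sum(rates) * n_events_7d
}

# Made-up rates for four grid cells and a made-up assumed event count
raw_rates <- c(2, 1, 0.5, 0.5)
clim <- scale_climatology(raw_rates, n_events_7d = 0.2)
```

Because the scaling is a single multiplicative constant, the relative spatial pattern of the rates is unchanged; only the total expected count is fixed.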
7456
75- The code was developed on a machine with 16GB of RAM. This is required if for the analysis all the models are
76- kept in memory simultaneously.
57+
58+ ## Details on slight differences to previously reported scores
59+
60+ Compared to two previous studies, we report slightly different scores due to correcting the
61+ spatial binning of earthquakes that occurred exactly on grid cell boundaries (bin edges).
62+ These binning discrepancies originate from numerical inaccuracies in floating-point arithmetic.
63+ Of the 262 target earthquakes in the catalog, this affects seven events.
64+
65+
66+ **1. Brehmer _et al._ 2024**
67+
68+ Poisson scores in Tables 1–3 differ slightly from those in Table 1 of Brehmer _et al._ 2024 (see second reference above).
69+
70+ Three target earthquakes had previously been unnecessarily excluded from the analysis:
71+ ```
72+ DateTime, Lat, Lon, Depth, Mag
73+ 2016-05-30 20:24:20.460, 42.7, 11.976, 7.9, 4.1
74+ 2016-08-26 04:28:25.890, 42.6, 13.29, 10.9, 4.8
75+ 2019-10-25 04:31:38.200, 39.7, 15.432, 12.1, 4.4
76+ ```
77+ (their _Latitude_ component falls exactly on a cell boundary)
78+
79+ Additionally, we now adhere to [_pyCSEP_-style binning](https://docs.cseptesting.org/reference/generated/csep.utils.calc.bin1d_vec.html),
80+ which assigns a cell's lower boundary to that cell and excludes its upper boundary,
81+ whereas Brehmer _et al._ 2024 used the opposite rule.
82+ This resulted in four earthquakes being incorrectly assigned to a neighboring cell:
83+ ```
84+ DateTime, Lat, Lon, Depth, Mag
85+ 2006-02-27 04:34:01.830, 38.155, 15.2, 9.2, 4.1
86+ 2009-04-06 01:42:49.970, 42.3, 13.429, 10.5, 4.2
87+ 2016-12-09 07:21:50.170, 44.33, 10.5, 7.6, 4.0
88+ 2016-12-11 12:54:52.860, 42.9, 13.113, 8.3, 4.3
89+ ```
90+ (either the _Latitude_ or _Longitude_ component falls exactly on a cell boundary)
91+
92+
93+ **2. Herrmann & Marzocchi 2023**
94+
95+ IGPE values in Table 3 differ slightly from those in Table 1 of:
96+
97+ > Herrmann, M., and W. Marzocchi (2023).
98+ > Maximizing the forecasting skill of an ensemble model.
99+ > _Geophysical Journal International 234_ (1), 73–87.
100+ > doi: [10.1093/gji/ggad020](https://doi.org/10.1093/gji/ggad020)
101+
102+ (when transforming reported values to use _ETAS_LM_ as reference instead of _SMA_).
103+
104+ Herrmann & Marzocchi 2023 used [_pyCSEP_'s binning function](https://docs.cseptesting.org/reference/generated/csep.utils.calc.bin1d_vec.html),
105+ but it previously [didn't properly account for floating point precision](https://github.com/SCECcode/pycsep/issues/255),
106+ resulting in five earthquakes being incorrectly assigned to a neighboring cell:
107+ ```
108+ DateTime, Lat, Lon, Depth, Mag
109+ 2009-04-06 01:42:49.970, 42.3, 13.429, 10.5, 4.2
110+ 2016-05-30 20:24:20.460, 42.7, 11.976, 7.9, 4.1
111+ 2016-08-26 04:28:25.890, 42.6, 13.29, 10.9, 4.8
112+ 2016-12-11 12:54:52.860, 42.9, 13.113, 8.3, 4.3
113+ 2019-10-25 04:31:38.200, 39.7, 15.432, 12.1, 4.4
114+ ```
115+ (only their _Latitude_ component falls exactly on a cell boundary; those five events were already mentioned above)
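
The binning rule described above (lower boundary included, upper boundary excluded) and its sensitivity to floating-point error can be sketched in R as follows. This is an illustration under stated assumptions, not the repository's or _pyCSEP_'s implementation: the helper `bin_index`, the tolerance `tol`, the grid origin of 35.0 degrees, and the cell size of 0.1 degrees are all hypothetical choices made for the example.

```r
# Hypothetical helper (not the repository's or pyCSEP's implementation):
# return the 1-based index of the cell [origin + (i - 1) * h, origin + i * h)
# that contains x. Lower edges are included, upper edges are excluded.
bin_index <- function(x, origin, h, tol = 1e-9) {
  # Without `tol`, (x - origin) / h can evaluate to slightly less than an
  # integer for a point lying exactly on an edge, so floor() would place it
  # in the cell below. Adding a small tolerance guards against that.
  floor((x - origin) / h + tol) + 1
}

# Assumed grid for illustration: 0.1-degree cells starting at 35.0 degrees
lat_origin <- 35.0
cell_size <- 0.1

idx_edge     <- bin_index(42.3, lat_origin, cell_size)   # on a cell's lower edge
idx_interior <- bin_index(42.35, lat_origin, cell_size)  # interior of the same cell
idx_below    <- bin_index(42.29, lat_origin, cell_size)  # interior of the cell below
```

With this rule, a latitude lying exactly on an edge ends up in the same cell as points just above it. The tolerance trades a tiny amount of strictness for robustness: values closer than `tol * h` below an edge are also moved into the upper cell.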