Commit 6869d42

committed
fixes #6
1 parent 9d21219 commit 6869d42

9 files changed

+660541
-648493
lines changed

README.md

Lines changed: 9 additions & 9 deletions
@@ -3,9 +3,9 @@
 [doi]: https://doi.org/10.17605/OSF.IO/U8DC3


-# Monthly municipal-level homicide rates in Mexico from January 2000 to December 2021
+# Monthly municipal-level homicide rates in Mexico from January 2000 to December 2022

-Data on crude monthly municipal-level homicide rates is available in `mexico-muni-month-homicide-rates-2000-2021.csv`; state-level aggregations are available in `mexico-state-month-homicide-rates-2000-2021.csv`. Note that both files use `|` as the separator.
+Data on crude monthly municipal-level homicide rates is available in `mexico-muni-month-homicide-rates-2000-2022.csv`; state-level aggregations are available in `mexico-state-month-homicide-rates-2000-2022.csv`. Note that both files use `|` as the separator.

 If you use `R` you can use `readr` package to load the file and specify the separator with `readr::read_delim("PATH_TO_FILE", delim = "|")`

@@ -18,24 +18,24 @@ To replicate the results, first run the `import` sub-task within the `census-dat
 - iter_00_cpv2010.csv, retreived from https://www.inegi.org.mx/programas/ccpv/2010/#Datos_abiertos (download data for "Estados Unidos Mexicanos")
 - conjunto_de_datos_iter_00CSV20.csv, retreived from https://www.inegi.org.mx/programas/ccpv/2020/#Datos_abiertos (download data for "Estados Unidos Mexicanos")

-Next run the `interpolate` sub-task within the `census-task` using the `Makefile` in the `census-data/interpolate` directory. This task uses the population counts from the `import` sub-task to linearly interpolate mid-year (1 July) population counts for each municipality from 2000-2021.
+Next run the `interpolate` sub-task within the `census-task` using the `Makefile` in the `census-data/interpolate` directory. This task uses the population counts from the `import` sub-task to linearly interpolate mid-year (1 July) population counts for each municipality from 2000-2022.

-After running both sub-tasks in the `census-data` task, run the sub-tasks in the `deaths-data` directory. Again, this task begins with an `import` sub-task, which you can run using the `Makefile`. This task reads in death certificate files published in `.dbf` format by INEGI and writes their contents to `.csv` files. This task expects death certificate files from 2000-2021 in a sub-directory called `death-certificates` within the top-level `data` directory. These files can be downloaded from https://www.inegi.org.mx/programas/mortalidad/#Microdatos.
+After running both sub-tasks in the `census-data` task, run the sub-tasks in the `deaths-data` directory. Again, this task begins with an `import` sub-task, which you can run using the `Makefile`. This task reads in death certificate files published in `.dbf` format by INEGI and writes their contents to `.csv` files. This task expects death certificate files from 2000-2022 in a sub-directory called `death-certificates` within the top-level `data` directory. These files can be downloaded from https://www.inegi.org.mx/programas/mortalidad/#Microdatos.

-Next, run the `homicide-counts` sub-task using the `Makefile`. This task uses the death certificate files imported in the `deaths-data/import` task to generate counts of homicide deaths in each municipality in each month from January 2000-December 2021. The cause of death classification file, found in the `hand` subdirectory follows the cause of death classification scheme used by [Elo, Beltrán-Sánchez and Macinko (2014)](https://pubmed.ncbi.nlm.nih.gov/24554793/). Note that deaths that occurred outside of Mexico and deaths that were missing cause of death, country of occurrence, or municipality of occurrence were excluded from these calculations. We also opted to use data from the location where the death occurred rather than the location where the individual was from because this information was more complete for homicides. One day we might impute this information and recalculate the counts accordingly.
+Next, run the `homicide-counts` sub-task using the `Makefile`. This task uses the death certificate files imported in the `deaths-data/import` task to generate counts of homicide deaths in each municipality in each month from January 2000-December 2022. The cause of death classification file, found in the `hand` subdirectory follows the cause of death classification scheme used by [Elo, Beltrán-Sánchez and Macinko (2014)](https://pubmed.ncbi.nlm.nih.gov/24554793/). Note that deaths that occurred outside of Mexico and deaths that were missing cause of death, country of occurrence, or municipality of occurrence were excluded from these calculations. We also opted to use data from the location where the death occurred rather than the location where the individual was from because this information was more complete for homicides. One day we might impute this information and recalculate the counts accordingly.

-Finally, run the top-level `homicide-rates` task to calculate the monthly municipal-level crude homicide rates for January 2000-December 2021.
+Finally, run the top-level `homicide-rates` task to calculate the monthly municipal-level crude homicide rates for January 2000-December 2022.

 If you use this data please use the BibTeX entry below or see the [OSF repository](https://osf.io/u8dc3/) for other citation formats:

 ```
 @misc{Gargiulo_Aburto_Floridi_2023,
-title={Monthly municipal-level homicide rates in Mexico (January 2000–December 2021)},
+title={Monthly municipal-level homicide rates in Mexico (January 2000–December 2022)},
 url={osf.io/u8dc3},
 DOI={10.17605/OSF.IO/U8DC3},
 publisher={OSF},
 author={Gargiulo, Maria and Aburto, José Manuel and Floridi, Ginevra},
-year={2023},
-month={Feb}
+year={2024},
+month={March}
 }
 ```
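
As a rough illustration of the `interpolate` step the README describes, the base-R sketch below linearly interpolates mid-year (1 July) populations from two census counts and carries the same slope past the later census. It is not the repository's `interpolate.R`: the helper name `interpolate_linear`, the census reference dates, and the population counts are all assumptions made for the example.

```r
# Hypothetical helper (not from the repository): linear interpolation or
# extrapolation of a population count between two census dates.
interpolate_linear <- function(pop_1, pop_2, date_1, date_2, target_date) {
  # daily slope between the two census counts
  slope <- (pop_2 - pop_1) / as.numeric(date_2 - date_1)
  # project from the first census date to the target date(s)
  pop_1 + slope * as.numeric(target_date - date_1)
}

# Assumed census reference dates and made-up counts for one municipality
census_2010 <- as.Date("2010-06-12")
census_2020 <- as.Date("2020-03-15")
pop_2010 <- 10000
pop_2020 <- 12000

# 1 July of each year; years after 2020 reuse the 2010-2020 slope
mid_years <- seq(as.Date("2010-07-01"), as.Date("2022-07-01"), by = "year")

estimates <- data.frame(
  est_date = mid_years,
  population = interpolate_linear(pop_2010, pop_2020,
                                  census_2010, census_2020, mid_years)
)
print(estimates)
```

Because the function is vectorised over `target_date`, the same call covers both the interpolated years (2010-2019) and the extrapolated ones (2020-2022).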

code/census-data/interpolate/src/interpolate.R

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ interpolation_wrapper <- function(ent) {

 # use 1 July as mid-year date
 mid_years_1 <- seq(ymd(20000701), ymd(20090701), by = "year")
-mid_years_2 <- seq(ymd(20100701), ymd(20210701), by = "year") # apply same slope through 2021
+mid_years_2 <- seq(ymd(20100701), ymd(20220701), by = "year") # apply same slope through 2022

 estimates_1 <- map_dfr(.x = mid_years_1,
 ~interpolate_population(pop_1 = ent$total_pop_2000,

code/deaths-data/homicide-counts/Makefile

Lines changed: 2 additions & 2 deletions
@@ -9,12 +9,12 @@ DEATHS := $(wildcard $(HERE)/code/deaths-data/import/output/DEFUN*.csv)

 .PHONY: all clean

-all: output/muni-month-homicides-2000-2021.csv
+all: output/muni-month-homicides-2000-2022.csv

 clean:
 -rm output/*

-output/muni-month-homicides-2000-2021.csv: \
+output/muni-month-homicides-2000-2022.csv: \
 src/calculate.R \
 $(DEATHS)
 -mkdir output

code/deaths-data/homicide-counts/src/calculate.R

Lines changed: 2 additions & 2 deletions
@@ -45,15 +45,15 @@ homicide_codes <- cod_mapping %>%
 filter(cod_group == "Homicides")

 # collect all death certificate file paths
-years <- 2000:2021
+years <- 2000:2022
 input_files <- glue("{args$import_stub}/DEFUN{years}.csv")

 # read in and concatenate records from all files
 deaths_data <- map_dfr(input_files, read_file)

 homicide_deaths <- deaths_data %>%
 # filter out deaths that occurred outside of time period or are missing year information
-filter(between(year, 2000, 2021)) %>%
+filter(between(year, 2000, 2022)) %>%
 # filter out deaths missing month information
 filter(month != 99) %>%
 # filter out deaths that occurred outside of Mexico or are missing state info

code/homicide-rates/Makefile

Lines changed: 8 additions & 8 deletions
@@ -8,29 +8,29 @@ HERE := $(shell git rev-parse --show-toplevel)

 .PHONY: all clean

-all: output/mexico-muni-month-homicide-rates-2000-2021.csv \
-output/mexico-state-month-homicide-rates-2000-2021.csv
+all: output/mexico-muni-month-homicide-rates-2000-2022.csv \
+output/mexico-state-month-homicide-rates-2000-2022.csv

 clean:
 -rm output/*

-output/mexico-muni-month-homicide-rates-2000-2021.csv: \
+output/mexico-muni-month-homicide-rates-2000-2022.csv: \
 src/muni-calculate.R \
-$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2021.csv \
+$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2022.csv \
 $(HERE)/code/census-data/interpolate/output/population-estimates.csv
 -mkdir output
 Rscript --vanilla $< \
---homicides_data=$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2021.csv \
+--homicides_data=$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2022.csv \
 --population_estimates=$(HERE)/code/census-data/interpolate/output/population-estimates.csv \
 --output=$@

-output/mexico-state-month-homicide-rates-2000-2021.csv: \
+output/mexico-state-month-homicide-rates-2000-2022.csv: \
 src/state-calculate.R \
-$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2021.csv \
+$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2022.csv \
 $(HERE)/code/census-data/interpolate/output/population-estimates.csv
 -mkdir output
 Rscript --vanilla $< \
---homicides_data=$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2021.csv \
+--homicides_data=$(HERE)/code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2022.csv \
 --population_estimates=$(HERE)/code/census-data/interpolate/output/population-estimates.csv \
 --output=$@
3636

code/homicide-rates/src/muni-calculate.R

Lines changed: 4 additions & 4 deletions
@@ -12,11 +12,11 @@ pacman::p_load(argparse, here, dplyr, readr, tidyr, lubridate)

 parser <- ArgumentParser()
 parser$add_argument("--homicides_data",
-default = here::here("code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2021.csv"))
+default = here::here("code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2022.csv"))
 parser$add_argument("--population_estimates",
 default = here::here("code/census-data/interpolate/output/population-estimates.csv"))
 parser$add_argument("--output",
-default = "output/mexico-muni-month-homicide-rates-2000-2021.csv")
+default = "output/mexico-muni-month-homicide-rates-2000-2022.csv")

 args <- parser$parse_args()

@@ -45,9 +45,9 @@ population <- read_delim(args$population_estimates, delim = "|") %>%
 select(-month, -day, -est_date)

 # start by creating a grid with all municipalities and months between January
-# 2000 and December 2021
+# 2000 and December 2022
 munis <- union(homicides$ent_mun, population$ent_mun)
-months <- seq(ym("200001"), ym("202112"), by = "month")
+months <- seq(ym("200001"), ym("202212"), by = "month")

 homicide_rates <- crossing(munis, months) %>% # expand grid
 mutate(year = as.numeric(year(months)),

code/homicide-rates/src/state-calculate.R

Lines changed: 3 additions & 3 deletions
@@ -12,7 +12,7 @@ pacman::p_load(argparse, here, dplyr, readr, tidyr, lubridate, assertr, stringr)

 parser <- ArgumentParser()
 parser$add_argument("--homicides_data",
-default = here::here("code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2021.csv"))
+default = here::here("code/deaths-data/homicide-counts/output/muni-month-homicides-2000-2022.csv"))
 parser$add_argument("--population_estimates",
 default = here::here("code/census-data/interpolate/output/population-estimates.csv"))
 parser$add_argument("--output",
@@ -61,9 +61,9 @@ population <- population %>%
 ungroup()

 # start by creating a grid with all states and months between January
-# 2000 and December 2022
+# 2000 and December 2022
 states <- union(homicides$cve_ent, population$cve_ent)
-months <- seq(ym("200001"), ym("202112"), by = "month")
+months <- seq(ym("200001"), ym("202212"), by = "month")

 homicide_rates <- crossing(states, months) %>% # expand grid
 mutate(year = as.numeric(year(months)),
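
The grid expansion that both `muni-calculate.R` and `state-calculate.R` start from can be sketched in base R as follows. This is a simplified illustration, not the scripts' actual code: the municipality codes and counts are made up, a single population value per municipality stands in for the yearly estimates, and the per-100,000 scaling is an assumption, since the diff does not show how the rate is scaled.

```r
# All months in the extended study period (first day of each month)
months <- seq(as.Date("2000-01-01"), as.Date("2022-12-01"), by = "month")

# Hypothetical municipality codes and made-up inputs
munis <- c("01001", "01002")
homicides <- data.frame(ent_mun = "01001",
                        month = as.Date("2022-12-01"),
                        homicides = 3)
population <- data.frame(ent_mun = munis,
                         population = c(50000, 20000))

# Expand the grid so municipality-months with no recorded homicides still appear
grid <- expand.grid(ent_mun = munis, month = months, stringsAsFactors = FALSE)

# Attach counts; municipality-months absent from the counts get zero
rates <- merge(grid, homicides, by = c("ent_mun", "month"), all.x = TRUE)
rates$homicides[is.na(rates$homicides)] <- 0

# Attach population and compute the crude rate
# (assumption: expressed per 100,000 residents)
rates <- merge(rates, population, by = "ent_mun")
rates$rate <- rates$homicides / rates$population * 1e5
```

Extending the end date from December 2021 to December 2022 adds twelve rows per municipality to the grid, which is why the date bounds have to change in every script that builds it.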