bowers-illinois-edu/wong_bjps_2025_no_data

Code-only replication repository for Wong et al 2025 BJPS

This set of files includes all of the code necessary to reproduce the results in the paper. It contains no data. The paper uses Canadian census Dissemination Area (DA) data combined with Census Subdivision (CSD) data in nearly every step of the analysis. These data could be used to identify survey respondents, so we do not include them here. Researchers who would like access to these data should contact Cara Wong [email protected]. We are in the process of submitting these files to the ICPSR, which has a well-established procedure for the use of restricted-access data; in the future we will direct people to the ICPSR for this purpose.

We recommend following the steps below, in order, to reproduce the results in the paper. If you run into trouble, feel free to open an Issue on this GitHub repository with a description of your problem and one of us will work on helping you out.

Note: We have only used this workflow on macOS and Linux computers. We have not tested it on a Windows computer. We would be happy to publish modifications that enable broader interoperability. Feel free to contact us.

Get an API key for use with the cancensus R package to download Canadian Census Data

We use Canadian Census data throughout this paper and we access it using the cancensus R package. That package requires researchers to obtain an API key that identifies them when they access the CensusMapper servers. You can put this API key into the Data/get_and_setup_2006_census_data.R and Data/get_and_setup_2016_census_data.R files in order to run that part of the workflow. These data are public, so no permission is needed to access them.
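If you would rather not paste the key into the scripts, one possible alternative is to supply it through an environment variable. The sketch below is only a check, and it rests on an assumption: that your installed version of cancensus reads the key from a variable named CM_API_KEY. Please confirm this against the cancensus documentation. The function name check_census_key is ours and is not part of this repository.

```shell
# Report whether CM_API_KEY is set in the current shell.
# Assumption: cancensus looks for the key in CM_API_KEY; verify this for
# your installed version of the package.
check_census_key() {
  if [ -n "${CM_API_KEY:-}" ]; then
    echo "CM_API_KEY is set"
  else
    echo "CM_API_KEY is not set"
    return 1
  fi
}
```

For R to see the variable, it would need to be exported (for example with export CM_API_KEY=... using your own key) before you start R or run make from the same Terminal session.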

The public Canadian census data files that we distribute here can also be re-created from the code. The following lines in Makefile.datasetup list those files and the scripts that build them:

CENSUS_DATA_FILES = Data/CensusData/2006_Data/census_data_csd_06.rda \
          Data/CensusData/2006_Data/census_data_06.rda

$(CENSUS_DATA_FILES): Data/get_and_setup_2006_census_data.R
 $(RCMD) Data/get_and_setup_2006_census_data.R

Data/CensusData/2016_Data/census_data_16.rda: Data/get_and_setup_2016_census_data.R
 $(RCMD) Data/get_and_setup_2016_census_data.R

As you can see in the R scripts, you will need to install the following packages: here, sf, cancensus, and dplyr.

Install the Gurobi Optimization System

The designmatch package relies on the Gurobi constrained optimization software. To run the code in this repository, you'll need to install Gurobi by hand from the Gurobi Quickstart Page. This involves getting a free academic license, downloading the Gurobi software for your system, installing it, and then installing the gurobi R package into the R library used by this project (the wong_bjps_2025.Rproj R project directory).

For example, on a Mac we first had to register an account with Gurobi using our academic email address and activate a single-user academic license. Then we downloaded a file called gurobi12.0.3_macos_universal2.pkg and double-clicked it in the Mac Finder to install the Gurobi software on our system.

After that installation, we started R within the wong_bjps_2025 directory (for example, by starting RStudio using the wong_bjps_2025.Rproj file) and then typed the following at the R console:

install.packages("/Library/gurobi1203/macos_universal2/R/gurobi_12.0-3_R_4.5.0.tgz",repos=NULL)

Before running any of our files that rely on the Gurobi optimizer, we had to activate the Gurobi license using a command like grbgetkey 1234... at the Unix command line on our Mac computers (using the Terminal application).
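After activating the license, it can be useful to confirm that a license file is where Gurobi will look for it. The following is a minimal sketch, not part of our workflow. It assumes that Gurobi consults the GRB_LICENSE_FILE environment variable first and otherwise looks for gurobi.lic in your home directory, /Library/gurobi, or /opt/gurobi; please check the Gurobi documentation for your version. The function name find_gurobi_license is hypothetical.

```shell
# Print the first license file found among the locations we assume Gurobi
# checks. These locations are assumptions; consult the Gurobi documentation.
find_gurobi_license() {
  for f in "${GRB_LICENSE_FILE:-}" "$HOME/gurobi.lic" \
           /Library/gurobi/gurobi.lic /opt/gurobi/gurobi.lic; do
    if [ -n "$f" ] && [ -f "$f" ]; then
      echo "$f"
      return 0
    fi
  done
  echo "no gurobi.lic found in the assumed locations" >&2
  return 1
}
```

If the function prints nothing on standard output, rerunning grbgetkey and noting where it saves the license file is a reasonable next step.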

As an alternative to Gurobi, you could also try the open-source highs package: install it from within R using install.packages("highs") and change the arguments in the solver lists that we provide to the nmatch() functions. We haven't tried that solver for this paper, and we expect that some of the results would differ.

Install the R Packages Used in the Paper

This paper uses many R packages, and we kept our collaboration running smoothly by using the renv system to keep track of packages and their dependencies. You should be able to install all of the packages (except for the gurobi package, which has to be installed by hand first) by typing renv::restore() at the R command line and following the instructions. Run it from an R session started within the root of this project directory (perhaps after starting RStudio using the wong_bjps_2025.Rproj file, or in some other way).

## If you haven't already installed renv do so using install.packages('renv')
renv::restore()

You may have to work a bit with the renv system to avoid errors. For example, the first time you type renv::restore() it may complain that your version of gurobi is not the same as the one we used originally (12.0-2). In that case you may need to reinstall the version you want. We had to issue the following command before we used renv on this replication archive, and then again after seeing some warnings from renv:

install.packages("/Library/gurobi1203/macos_universal2/R/gurobi_12.0-3_R_4.5.0.tgz",repos=NULL)

And then you may need to see if there are other packages that need reinstalling by typing renv::status() and renv::snapshot().

For example you might see:

> renv::status()
The following package(s) were installed from an unknown source:
- gurobi [12.0-3]
renv may be unable to restore these packages in the future.
Consider reinstalling these packages from a known source (e.g. CRAN).

In the end, when you quit R and restart it, you should see an R console message like the following, which indicates that all relevant packages are installed and you are ready to try to build the paper:

R version 4.5.0 (2025-04-11) -- "How About a Twenty-Six"
Copyright (C) 2025 The R Foundation for Statistical Computing
Platform: aarch64-apple-darwin20

...

- Project '~/Documents/PROJECTS/wong_bjps_2025' loaded. [renv 1.1.5]
> 

Build the Manuscript using the GNU Make system

We use the make system to keep track of the dependencies among the files in this project. This means that, if you are using a Mac or Linux machine, you should be able to open a Terminal window, cd to the directory containing the replication files, and then type make Manuscript/manuscript.pdf.

If you are using RStudio we recommend that you open the wong_bjps_2025.Rproj file --- this is an R Project file that will make sure that you are in the correct working directory. It also allows you to use the make system by going to the "Build" tab and clicking on "Build All".

You can see whether your system is ready to use make by typing which make in the Terminal. If you do not see a path to the make command (like /usr/bin/make), then you will need to install GNU Make.

To install GNU Make on a Mac, install the "command line tools" by typing xcode-select --install at the Unix command line in the Mac Terminal.
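The same kind of check can be applied to the other command-line programs that the build calls. Below is a small sketch, assuming a POSIX shell; the function name missing_tools is ours and does not appear in the Makefiles.

```shell
# Print each named command that cannot be found on the PATH.
# Prints nothing when every command is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

For example, missing_tools make R latexmk should print nothing on a system that is ready to build the manuscript, since the build runs R scripts and then latexmk.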

Whether you type make Manuscript/manuscript.pdf at the command line or click "Build All" from the RStudio "Build" menu or run each of the files below in order, this will take some time since (1) the nonbipartite matching problems take time to solve and (2) we present Bayesian multilevel model results which require MCMC sampling.

You can see all of the steps required to produce the manuscript.pdf file by typing make -n Manuscript/manuscript.pdf, where you should get output like this (maybe with some files repeated) showing which files should be run and in which order:

## If you need the canadian census data, you might need to download it first
R --no-save --no-restore -f Data/make_working_files.R
R --no-save --no-restore -f Design/dist_mats_data_anyDA_new.R
R --no-save --no-restore -f Design/match_anyDA_new.R
R --no-save --no-restore -f Analysis/analysis_anyDA_new.R
R --no-save --no-restore -f Analysis/supp_desc_new.R > Analysis/supp_desc_new.Rout
R --no-save --no-restore -f Figures_Tables/figures_anyDA_new.R
R --no-save --no-restore -f Analysis/alt_explanations_analysis.R
R --no-save --no-restore -f Figures_Tables/alt_explanations_plot.R
R --no-save --no-restore -f Design/dist_mats_data_DA_new.R
R --no-save --no-restore -f Design/match_DA_new.R
R --no-save --no-restore -f Analysis/analysis_DA_new.R
R --no-save --no-restore -f Figures_Tables/figures_DA_new.R
R --no-save --no-restore -f Data/sameDAdat.R
R --no-save --no-restore -f Analysis/analysis_sameDA.R
R --no-save --no-restore -f Figures_Tables/figures_sameDA.R
R --no-save --no-restore -f Analysis/sameDAviewDA.R
R --no-save --no-restore -f Figures_Tables/figures_sameDAviewDA.R
R --no-save --no-restore -f Design/dist_mats_data_anyDA_Diversity_new.R
R --no-save --no-restore -f Design/match_anyDA_Diversity_new.R
R --no-save --no-restore -f Analysis/analysis_anyDA_Diversity_new.R
R --no-save --no-restore -f Figures_Tables/figures_anyDA_Diversity_new.R
R --no-save --no-restore -f Figures_Tables/plot_pairwise_social_cohesion_anyDA_new.R
R --no-save --no-restore -f Design/match_assess_anyDA_new.R > Design/match_assess_anyDA_new.Rout
R --no-save --no-restore -f Figures_Tables/coefplot_table_anyDA.R
cd Manuscript && latexmk -pdf manuscript.tex
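If you prefer to run the scripts one at a time, the dry-run output can be reduced to just the R script names, in build order. This is a minimal sketch that assumes the commands look like the ones shown above (R ... -f path/to/script.R); the function name list_r_scripts is hypothetical.

```shell
# Read `make -n` output on standard input and print the R script that each
# command would run, one per line, in the order make would run them.
# Lines that do not run an R script (comments, latexmk) are skipped.
list_r_scripts() {
  sed -n 's/.* -f \([^ ]*\.R\).*/\1/p'
}
```

A typical use would be make -n Manuscript/manuscript.pdf | list_r_scripts, which leaves out the comment line and the final latexmk step.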

You can see the relationships between the files as laid out in the different Makefiles in this graphic:

[Figure: workflow diagram showing the dependencies among the project files]

We have not tested our workflow on a Windows computer and we would love advice/pull requests about how to use our Makefile in that context.
