We replicate Table 4.1.1 and Figure 4.1.1 from Mostly Harmless Econometrics using a reproducible research workflow.
Our weapons of choice are:
- Snakemake to manage the build and dependencies
- R for statistical analysis
If you have Snakemake and R installed, navigate your terminal to this directory.
To ensure all R libraries are installed, type
snakemake install_packages
into your terminal and press RETURN.
If you modify the packages used in this repo, rerun this command to record the package updates in REQUIREMENTS.txt.
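To run the full workflow, type:
snakemake all
into your terminal and press RETURN.
Recent Snakemake releases require a core count, so you may need to type snakemake --cores 1 all instead.
See HELP.txt for an explanation of what the Snakemake rules do.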
- Install the latest version of R by following the instructions here.
  - You can ignore the RStudio instructions for the purpose of this project.
This project uses Snakemake to execute our research workflow.
You can install snakemake as follows:
- Install Snakemake from the command line (needs pip and Python):

      pip install snakemake

  - If you haven't got Python installed, click here for instructions.
- Windows and old Mac OSX users: you may need to manually install the datrie package if you are getting errors. Using conda, this seems to work best:

      conda install datrie
Because we want to generate PDF outputs, we need two additional pieces of software:
- If you do not have RStudio installed, you will have to install Pandoc (http://pandoc.org)
- If you do not have LaTeX installed, we recommend that you install TinyTeX (https://yihui.name/tinytex/)
- TinyTeX is a lightweight, portable, cross-platform, and easy-to-maintain LaTeX distribution.
- From inside R:

      install.packages('tinytex')
      tinytex::install_tinytex()  # install TinyTeX
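
Before running the workflow, it can help to confirm that the tools listed above are actually reachable from your terminal. The following is a minimal sketch, not part of this repository: the function name check_tools is hypothetical, and the executable names snakemake, Rscript and pandoc are assumptions about a typical install. RStudio bundles its own copy of Pandoc, which is not necessarily on your PATH, so a "missing: pandoc" line is not always a problem.

```shell
#!/bin/sh
# Print one "missing: <tool>" line for each named tool that is not on PATH.
# Returns 0 when every tool is found and 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Executable names below are assumptions about a typical install.
if check_tools snakemake Rscript pandoc; then
    echo "All tools found."
else
    echo "Install the tools listed above, then run this check again."
fi
```

The check only looks at PATH. It does not verify versions or that the R packages and a LaTeX distribution are installed.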
A Snakemake workflow is a directed acyclic graph (DAG). We can visualize the relationship between the rules in our workflow, which is a simplified view of the DAG.
Rules that produce various visualizations of the workflow are listed near the bottom of the Snakefile, under 'Snakemake Workflow graphs'.
You will need to install Graphviz to run these rules - we have included a rule inside dag.smk to install this for you.
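
If you prefer to draw the graphs by hand, Snakemake's built-in --rulegraph and --dag flags emit Graphviz dot output that can be piped into the dot program. The sketch below is an illustration under stated assumptions and may differ from what the rules in dag.smk do: the function name render_graph and the output file names are hypothetical, and it assumes the all target used earlier in this README.

```shell
#!/bin/sh
# Render a Snakemake graph to PDF with Graphviz.
# Usage: render_graph <rulegraph|dag> <output.pdf>
render_graph() {
    kind="$1"
    out="$2"
    # Accept only the two graph kinds this sketch supports.
    case "$kind" in
        rulegraph|dag) ;;
        *) echo "unknown graph kind: $kind" >&2; return 2 ;;
    esac
    # Both programs are needed: snakemake writes dot text, dot draws it.
    if ! command -v snakemake >/dev/null 2>&1 || ! command -v dot >/dev/null 2>&1; then
        echo "snakemake and dot (Graphviz) must both be on PATH" >&2
        return 1
    fi
    snakemake "--$kind" all | dot -Tpdf > "$out"
}

# Example calls (output file names are hypothetical):
#   render_graph rulegraph rulegraph.pdf
#   render_graph dag dag.pdf
```

The rulegraph shows one node per rule, while the dag shows one node per job, so the rulegraph is usually the more readable of the two.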
Periodic updates to the workflow occur as I find better or simpler ways to do things and as my opinions on best practice evolve. Major changes are tracked in the NEWS file with brief descriptions of the changes implemented.
I'd love to hear your comments and suggestions, or about any installation issues you encounter when running the example. Post an issue on GitHub.
Deer, Lachlan, 2020. "Replication of Angrist and Krueger (1991) with Snakemake: Table 4.1.1 and Figure 4.1.1 from Mostly Harmless Econometrics."
