
Commit dc01de8: Including DNN Extension

1 parent a5ec362 commit dc01de8

File tree

2 files changed (+8, -4 lines)

2 files changed

+8
-4
lines changed

Deep Learning/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
# Variable Importance for Bayesian Neural Networks

- Here, we demonstrate how to implement RATE with the Bayesian neural network architectures as described in [Ish-Horowicz et al. (2019)](https://arxiv.org/abs/1901.09839). The `Notebooks` directory contains notebooks used to generate each of the plots in the paper. These are meant to serve as examples of how to build and train Bayesian neural networks and determine variable importance for its input features.
+ Here, we demonstrate how to implement RATE with the Bayesian neural network architecture described in [Ish-Horowicz et al. (2019)](https://arxiv.org/abs/1901.09839). The `Notebooks` directory contains the notebooks used to generate each of the plots in the paper. These are meant to serve as examples of how to build and train Bayesian neural networks and determine variable importance for their input features.

The source code in `src` is organized as follows:
* `BayesNN.py` contains a class implementing the Bayesian neural network.
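The idea these notebooks build on can be sketched in a few lines. Below is a minimal NumPy illustration, not the actual `BayesNN.py` API (every function and variable name here is hypothetical): a Gaussian posterior over the final-layer weights induces a Gaussian posterior over the logits, which is projected back onto the input features to obtain effect-size analogues for RATE.

```python
import numpy as np

def logit_posterior(H, mu_w, Sigma_w):
    """Posterior of the logits f = H @ w when the final-layer weights
    satisfy w ~ N(mu_w, Sigma_w) and H holds penultimate-layer features."""
    return H @ mu_w, H @ Sigma_w @ H.T

def effect_size_analogues(X, mu_f, Sigma_f):
    """Project the logit posterior back onto the inputs via the pseudo-inverse."""
    Xp = np.linalg.pinv(X)
    return Xp @ mu_f, Xp @ Sigma_f @ Xp.T

# Toy network: deterministic features feeding a Bayesian final layer.
rng = np.random.default_rng(2)
X = rng.normal(size=(100, 6))                 # n = 100 inputs, p = 6 features
H = np.tanh(X @ rng.normal(size=(6, 8)))      # penultimate-layer activations
mu_w = rng.normal(size=8)                     # posterior mean of final weights
A = rng.normal(size=(8, 8))
Sigma_w = A @ A.T + 0.1 * np.eye(8)           # posterior covariance of final weights
mu_f, Sigma_f = logit_posterior(H, mu_w, Sigma_w)
mu_beta, Sigma_beta = effect_size_analogues(X, mu_f, Sigma_f)
```

RATE measures are then computed from the Gaussian posterior N(mu_beta, Sigma_beta) in the same way as in the GP regression setting.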

README.md

Lines changed: 7 additions & 3 deletions
@@ -1,7 +1,7 @@
# Variable Prioritization for Black Box Methods via RelATive cEntrality (RATE)

- Our ability to build good predictive models has, in many cases, outstripped our ability to extract interpretable information about the relevance of the input covariates being used. The central aim of [Crawford et al. (2019)](https://arxiv.org/abs/1801.07318) is to assess variable importance after having fit a nonlinear or nonparametric (Bayesian) model. In this work, we propose a new "RelATive cEntrality" (RATE) measure as an interpretable way to summarize the importance of covariates. By assessing entropy in the joint posterior distribution via Kullback-Leibler divergence (KLD), we can correctly prioritize candidate variables which are not just marginally important, but also those whose associations stem from a significant covarying relationship with other variables in the data. We demonstrate our proposed approach in the context of statistical genetics, where the discovery of variants that are involved in nonlinear interactions is of particular interest. In this repository, we focus on illustrating RATE through Gaussian process (GP) regression; although, methodological innovations can easily be applied to other machine learning-type methods such as Bayesian kernel ridge (BKR) regression or (deep) neural networks. It is well known that nonlinear models often exhibit greater predictive accuracy than linear models, particularly for outcomes generated by complex data architectures. With simulations and real data examples, we show that applying RATE enables an explanation for this improved performance.
+ Our ability to build good predictive models has, in many cases, outstripped our ability to extract interpretable information about the relevance of the input covariates being used. The central aim of [Crawford et al. (2019)](https://arxiv.org/abs/1801.07318) and [Ish-Horowicz et al. (2019)](https://arxiv.org/abs/1901.09839) is to assess variable importance after fitting a nonlinear or nonparametric (Bayesian) model. In this work, we propose a new "RelATive cEntrality" (RATE) measure as an interpretable way to summarize the importance of covariates. By assessing entropy in the joint posterior distribution via the Kullback-Leibler divergence (KLD), we can correctly prioritize candidate variables that are not just marginally important, but also those whose associations stem from a significant covarying relationship with other variables in the data. We demonstrate our proposed approach in the context of statistical genetics, where the discovery of variants involved in nonlinear interactions is of particular interest. In the `Tutorials` directory, we focus on illustrating RATE through Gaussian process (GP) regression, although the methodological innovations can easily be applied to other machine learning methods such as (deep) neural networks, as demonstrated in the `Deep Learning` directory. It is well known that nonlinear methods often exhibit greater predictive accuracy than linear models, particularly for outcomes generated by complex data architectures. With simulations and real data examples, we show that applying RATE enables an explanation for this improved performance.
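The KLD quantities described here have a simple form when the effect-size posterior is approximated by a multivariate normal. Below is a minimal NumPy sketch of that computation, assuming a posterior N(mu, Sigma); the function name and interface are ours for illustration, not the repository's parallelizable R routines.

```python
import numpy as np

def rate_from_gaussian_posterior(mu, Sigma):
    """RATE measures for effect sizes with (approximate) posterior N(mu, Sigma).

    For each variable j, compute the KLD between the conditional posterior of
    the remaining effect sizes given beta_j = 0 and their marginal posterior,
    then normalize so the measures sum to one.
    """
    p = len(mu)
    kld = np.zeros(p)
    for j in range(p):
        idx = np.delete(np.arange(p), j)
        s_oj = Sigma[idx, j]                            # cross-covariance with variable j
        S0 = Sigma[np.ix_(idx, idx)]                    # marginal covariance of beta_{-j}
        S1 = S0 - np.outer(s_oj, s_oj) / Sigma[j, j]    # conditional covariance
        dm = s_oj * mu[j] / Sigma[j, j]                 # shift in the conditional mean
        S0_inv = np.linalg.inv(S0)
        _, logdet0 = np.linalg.slogdet(S0)
        _, logdet1 = np.linalg.slogdet(S1)
        # Gaussian KL divergence KL(N(m1, S1) || N(m0, S0))
        kld[j] = 0.5 * (np.trace(S0_inv @ S1) - (p - 1)
                        + dm @ S0_inv @ dm + logdet0 - logdet1)
    return kld / kld.sum()

# Toy example: a well-conditioned posterior over p = 5 effect sizes.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
Sigma = A @ A.T + np.eye(5)   # positive-definite posterior covariance
mu = rng.normal(size=5)
rate = rate_from_gaussian_posterior(mu, Sigma)
```

Larger normalized values indicate greater distributional centrality; variables with RATE above the uniform value 1/p are candidates for prioritization.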
- RATE is implemented as a set of parallelizable routines, which can be carried out within an R environment. [Supplementary Material](http://lcrawlab.com/Papers/RATE_SI.pdf) for Crawford et al. (2019) can be found on our lab website.
+ RATE is implemented as a set of parallelizable routines, which can be carried out within an R environment. Detailed derivations of the algorithm, which uses low-rank matrix factorizations for a more practical implementation, are given in the [Supplementary Material](http://lcrawlab.com/Papers/RATE_SI.pdf) of Crawford et al. (2019).
### R Packages Required for RATE
The RATE function software requires the installation of the following R libraries:
@@ -37,11 +37,15 @@ For macOS users, the Xcode Command Line Tools include a GCC compiler. Instructio
### Demonstrations and Tutorials for Running RATE

- We provide a few example scripts that demonstrate how to conduct variable selection in nonlinear models with RATE measures. Here, we consider a simple (and small) genetics example where we simulate genotype data for _n_ individuals with _p_ measured genetic variants. We then randomly select a small number of these predictor variables to be causal and have true association with the generated (continuous) phenotype. These scripts are meant to illustrate proof of concepts and specifically walk through: (1) how to compute a covariance matrix using the Gaussian kernel function; (2) how to fit a standard Bayesian Gaussian process (GP) regression model; and (3) prioritizing variables via their first, second, third, and fourth order distributional centrality.
+ In the `Tutorials` directory, we provide a few example scripts that demonstrate how to conduct variable selection in nonlinear models with RATE measures. Here, we consider a simple (and small) genetics example where we simulate genotype data for _n_ individuals with _p_ measured genetic variants. We then randomly select a small number of these predictor variables to be causal and to have a true association with the generated (continuous) phenotype. These scripts are meant to illustrate proofs of concept and specifically walk through: (1) how to compute a covariance matrix using the Gaussian kernel function; (2) how to fit a standard Bayesian Gaussian process (GP) regression model; and (3) how to prioritize variables via their first, second, third, and fourth order distributional centrality.
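Steps (1) and (2) can be sketched in a few lines. The snippet below is a hedged NumPy illustration rather than the R tutorial code: the bandwidth, noise variance, and the pseudo-inverse projection of the latent function onto effect-size analogues are illustrative choices.

```python
import numpy as np

def gaussian_kernel(X, bandwidth=1.0):
    """Covariance matrix with entries K[i, l] = exp(-||x_i - x_l||^2 / (2 h^2))."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth**2))

def gp_posterior(K, y, noise=1e-2):
    """Mean and covariance of the latent f under conjugate Bayesian GP regression."""
    n = len(y)
    W = np.linalg.solve(K + noise * np.eye(n), np.eye(n))  # (K + noise*I)^{-1}
    return K @ W @ y, K - K @ W @ K

def effect_size_posterior(X, m, V):
    """Project the GP posterior onto effect-size analogues beta = X^+ f."""
    Xp = np.linalg.pinv(X)
    return Xp @ m, Xp @ V @ Xp.T

# Simulated toy data: n = 50 individuals, p = 4 variants, variant 0 causal.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=50)
K = gaussian_kernel(X)                 # step (1): Gaussian kernel covariance
m, V = gp_posterior(K, y)              # step (2): Bayesian GP regression fit
mu_beta, Sigma_beta = effect_size_posterior(X, m, V)
```

From the Gaussian posterior N(mu_beta, Sigma_beta), the distributional centrality measures of step (3) are then obtained via the KLD-based RATE computation.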
+ In the `Deep Learning` directory, we demonstrate how to implement RATE with Bayesian neural network architectures. Notebooks are provided that give explicit details on the training procedures and on how to determine variable importance for the input features of the networks.

### Relevant Citations

L. Crawford, S.R. Flaxman, D.E. Runcie, and M. West (2019). Variable prioritization in nonlinear black box methods: a genetic association case study. _Annals of Applied Statistics_. In Press.

+ J. Ish-Horowicz*, D. Udwin*, S.R. Flaxman, S.L. Filippi, and L. Crawford (2019). Interpreting deep neural networks through variable importance. _arXiv_. 1901.09839.

### Questions and Feedback

For questions or concerns with the RATE functions, please contact [Lorin Crawford](mailto:lorin_crawford@brown.edu).
