Commit 2b52d69

Create CovarianceBinary.md
1 parent f38dde6 commit 2b52d69

1 file changed, +17 -0 lines changed

@@ -0,0 +1,17 @@
## Efficiently computing covariance matrices for binary data

Let **X** be a P-dimensional binary vector. Now suppose you have a sample **D** = (**X1**, **X2**, ..., **Xn**), stored as an n x P matrix with one observation per row. The task is to compute the P x P covariance matrix of **X** from the sample.

In R you could presumably do
```R
cov(D)
```
The problem is that in many cases this will give you a matrix that is not positive-definite. One way to fix the problem is to realise that for binary variables the covariance between Xi and Xj can be computed from two proportions: p_i, the fraction of samples in which Xi = 1, and p_ij, the fraction in which both Xi = 1 and Xj = 1:
```
cov_ij = p_ij - p_i*p_j.
```
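The formula above can be sketched directly in plain C++. This is only an illustration under stated assumptions: the function name `binary_cov_naive` and the row-major `std::vector<int>` layout are hypothetical and are not taken from the linked repository. Note also that `p_ij - p_i*p_j` is the divide-by-n estimate, whereas R's `cov()` divides by n - 1, so the two results differ by a factor of (n - 1)/n.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the repository's code).
// D holds n observations of P zero/one values in row-major order.
// Returns the P x P matrix cov_ij = p_ij - p_i * p_j, also row-major.
std::vector<double> binary_cov_naive(const std::vector<int>& D,
                                     std::size_t n, std::size_t P) {
    assert(n > 0 && D.size() == n * P);

    // counts[i * P + j] = number of samples with Xi = 1 and Xj = 1 (j >= i).
    std::vector<double> counts(P * P, 0.0);
    for (std::size_t r = 0; r < n; ++r) {
        const int* row = D.data() + r * P;
        for (std::size_t i = 0; i < P; ++i) {
            if (!row[i]) continue;            // Xi = 0 contributes nothing
            for (std::size_t j = i; j < P; ++j)
                if (row[j]) counts[i * P + j] += 1.0;
        }
    }

    const double dn = static_cast<double>(n);
    std::vector<double> cov(P * P, 0.0);
    for (std::size_t i = 0; i < P; ++i) {
        const double p_i = counts[i * P + i] / dn;   // diagonal count is #{Xi = 1}
        for (std::size_t j = i; j < P; ++j) {
            const double p_j = counts[j * P + j] / dn;
            const double p_ij = counts[i * P + j] / dn;
            const double c = p_ij - p_i * p_j;
            cov[i * P + j] = c;               // fill both triangles: the matrix
            cov[j * P + i] = c;               // is symmetric
        }
    }
    return cov;
}
```

Each diagonal entry reduces to `p_i - p_i*p_i`, that is p_i(1 - p_i), the usual Bernoulli variance.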
The problem then becomes doing this efficiently. The task is embarrassingly parallel, but an implementation in pure R is still slow.

Your job is to take the implementation of `binary_cov_matrix()` [here](https://github.com/maxbiostat/BinaryMarkovChains/blob/main/R/binary_multiESS.R) and make it go vrum vrum. I reckon a simple re-coding in Rcpp should do the trick.

**Applications**: this can be used to estimate the efficiency of Markov chain Monte Carlo algorithms on binary state spaces.
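As one possible direction for the speed-up requested above, the counts behind p_i and p_ij can be obtained by packing each variable into 64-bit words and counting bits in the bitwise AND of two columns. This is a swapped-in technique, not something the note or the linked repository prescribes, and the names `popcount64` and `binary_cov_packed` are hypothetical. The sketch is plain C++; calling it from R would additionally need an Rcpp wrapper (for instance an `// [[Rcpp::export]]` function taking an `IntegerMatrix`), which is left out here.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of set bits in a 64-bit word (portable; hypothetical helper name).
static inline std::size_t popcount64(std::uint64_t x) {
    return std::bitset<64>(x).count();
}

// Hypothetical sketch: same input and output layout as a row-major
// n x P matrix of zeros and ones, returning cov_ij = p_ij - p_i * p_j.
std::vector<double> binary_cov_packed(const std::vector<int>& D,
                                      std::size_t n, std::size_t P) {
    assert(n > 0 && D.size() == n * P);

    // Pack column i into W words: bit (r % 64) of word r / 64 is D[r][i].
    const std::size_t W = (n + 63) / 64;
    std::vector<std::uint64_t> bits(P * W, 0);
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t i = 0; i < P; ++i)
            if (D[r * P + i])
                bits[i * W + r / 64] |= std::uint64_t{1} << (r % 64);

    const double dn = static_cast<double>(n);

    // Marginal proportions p_i.
    std::vector<double> p(P, 0.0);
    for (std::size_t i = 0; i < P; ++i) {
        std::size_t ones = 0;
        for (std::size_t w = 0; w < W; ++w) ones += popcount64(bits[i * W + w]);
        p[i] = static_cast<double>(ones) / dn;
    }

    // Joint proportions p_ij via AND + popcount; each (i, j) pair is
    // independent, so this double loop is the part that could be parallelised.
    std::vector<double> cov(P * P, 0.0);
    for (std::size_t i = 0; i < P; ++i) {
        for (std::size_t j = i; j < P; ++j) {
            std::size_t both = 0;
            for (std::size_t w = 0; w < W; ++w)
                both += popcount64(bits[i * W + w] & bits[j * W + w]);
            const double c = static_cast<double>(both) / dn - p[i] * p[j];
            cov[i * P + j] = c;
            cov[j * P + i] = c;
        }
    }
    return cov;
}
```

Whether this actually beats a straightforward Rcpp loop would need to be benchmarked on realistic n and P; the packing step has its own cost and only pays off when many pairs are compared.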
