Commit 316e977

Fixed issues #6, #7, #8, #9, #10, #11 flagged by J
1 parent e32b0a7 commit 316e977

1 file changed

Lines changed: 17 additions & 8 deletions

vignettes/theory-crash-course.Rmd

@@ -51,7 +51,8 @@ these matrices are of dimension $n \times p$, where $n$ is the number of
 observations (e.g. study participants or measurement dates) and $p$ is the
 number of exposures (e.g. chemical and/or non-chemical stressors). Beyond this
 mixtures model, the main assumption made by PCP is that
-$Z_0 \sim N(\mu, \sigma^2)$ consists of i.i.d. Gaussian noise
+$Z_0 \sim N(\mu, \sigma^2)$ consists of independently and identically
+distributed (i.i.d.) Gaussian noise
 corrupting each entry of the overall exposure matrix $D$.

 The models in `pcpr` seek to decompose an observed data matrix $D$ into estimated
@@ -145,7 +146,7 @@ regarding the quality of recovered missing observations:
 2. The fewer observations there are in $D$, the harder it is to accurately
 reconstruct $L$ (therefore estimation of _both_ unobserved _and_ observed
 measurements in $L$ degrades); and
-3. Greater proportions of missingness in $D$ artifically drive up the
+3. Greater proportions of missingness in $D$ artificially drive up the
 sparsity of the estimated $S$ matrix. This is because it is not possible
 to recover a sparse event in $S$ when the corresponding entry in $D$ is
 unobserved. By definition, sparse events in $S$ cannot be explained by
@@ -202,12 +203,16 @@ relative differences:
 | Supports missing values? | _Yes_ | _Yes_ |
 | Supports LOD penalty? | _Yes_ | _Yes_ |
 | Supports non-negativity constraint? | _Yes_ | _No_ |
-| Rank determination? | _Autonomous_ | _User-defined_ |
+| Rank determination? | _Autonomous_ | _User-defined_* |
 | Sparse event identification? | _Autonomous_ | _Autonomous_ |
 | Optimization approach? | _ADMM_ | _Iterative rank-based_ |

+*`rrmc()` can be paired with the cross-validated `grid_search_cv()` function
+for autonomous rank determination.
+
 Convex PCP via `root_pcp()` is best for data characterized
-by rapidly decaying singular values, indicative of very well-defined latent patterns.
+by rapidly decaying singular values (e.g. image and video data),
+indicative of very well-defined latent patterns.

 Non-convex PCP with `rrmc()` is best suited for data characterized by slowly decaying singular values,
 indicative of complex underlying patterns and a relatively large degree of noise. Most EH data can be
@@ -228,7 +233,7 @@ Moreover, convex PCP approaches are best suited to instances in which the target
 low-rank matrix $L_0$ can be accurately modelled as low-rank (i.e. $L_0$ is
 governed by only a few very well-defined patterns). This is often the case with
 image and video data (characterized by rapidly decaying singular values), but
-not common for EH data. EH data is typically is only approximately low-rank
+not common for EH data. EH data is typically only approximately low-rank
 (characterized by complex patterns and slowly decaying singular values).

 The convex model available in `pcpr` is `root_pcp()`. For a comprehensive
@@ -260,8 +265,12 @@ provide this flexibility by allowing the user to interrogate the data at
 different ranks.

 The drawback here is that non-convex algorithms can no longer determine the rank
-best describing the data autonomously, instead requiring the researcher to
-subjectively specify the rank $r$ as in PCA. One of the more glaring trade-offs made
+best describing the data on their own, instead requiring the researcher to
+subjectively specify the rank $r$ as in PCA. However, by pairing non-convex PCP algorithms
+with the cross-validation routine implemented in the `grid_search_cv()` function,
+the optimal rank can be determined semi-autonomously; the researcher need only define
+a rank _search space_ from which the _optimal rank will be identified via grid search_.
+One of the more glaring trade-offs made
 by non-convex methods for this improved run-time and flexibility is weaker
 theoretical promises; specifically, non-convex PCP runs the risk of finding
 spurious _local_ optima, rather than the _global_ optimum guaranteed by their
@@ -383,7 +392,7 @@ $\xi$ is relatively low, e.g. $\xi = 0.05$, or 5%, and $K$ is relatively high, e
 set is obtained each run, providing balanced coverage of $D$. Viewed another way, the smaller $K$ is, the more
 the results are susceptible to overfitting to the relatively few selected test sets.

-### Interpretaion of results
+### Interpretation of results

 Once the grid search of has been conducted, the optimal hyperparameters can be chosen by examining the output
 statistics `summary_stats`. Below are a few suggestions for how to interpret the `summary_stats` table:
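
Aside on the last hunk's context lines: the claim that a low $\xi$ combined with a high $K$ gives balanced coverage of $D$ can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the `pcpr` implementation. It assumes $\xi$ is the fraction of entries held out as a test set in each of $K$ runs, and that test entries are drawn uniformly at random. The helper names `holdout_masks` and `coverage` are hypothetical and do not come from the package.

```python
import random


def holdout_masks(n, p, xi, K, seed=0):
    """Draw K test sets, each a random fraction xi of the n*p entry positions.

    Assumption: entries are sampled uniformly at random without replacement
    within a run; the real package may choose its test sets differently.
    """
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n) for j in range(p)]
    n_test = max(1, round(xi * len(cells)))
    return [set(rng.sample(cells, n_test)) for _ in range(K)]


def coverage(masks, n, p):
    """Fraction of the n*p entries that appear in at least one test set."""
    covered = set().union(*masks)
    return len(covered) / (n * p)


# Hypothetical 50 x 10 exposure matrix with 5% of entries held out per run.
n, p, xi = 50, 10, 0.05
few_runs = holdout_masks(n, p, xi, K=5)
many_runs = holdout_masks(n, p, xi, K=100)

print("coverage with K=5:  ", coverage(few_runs, n, p))
print("coverage with K=100:", coverage(many_runs, n, p))
```

With only a few runs, most entries of the matrix are never tested, so the summary statistics depend heavily on which entries happened to be selected. With many runs, nearly every entry is held out at least once.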
