@@ -51,7 +51,8 @@ these matrices are of dimension $n \times p$, where $n$ is the number of
 observations (e.g. study participants or measurement dates) and $p$ is the
 number of exposures (e.g. chemical and/or non-chemical stressors). Beyond this
 mixtures model, the main assumption made by PCP is that
-$Z_0 \sim N(\mu, \sigma^2)$ consists of i.i.d. Gaussian noise
+$Z_0 \sim N(\mu, \sigma^2)$ consists of independently and identically
+distributed (i.i.d.) Gaussian noise
 corrupting each entry of the overall exposure matrix $D$.

 The models in `pcpr` seek to decompose an observed data matrix $D$ into estimated
@@ -145,7 +146,7 @@ regarding the quality of recovered missing observations:
 2. The fewer observations there are in $D$, the harder it is to accurately
  reconstruct $L$ (therefore estimation of _both_ unobserved _and_ observed
  measurements in $L$ degrades); and
-3. Greater proportions of missingness in $D$ artifically drive up the
+3. Greater proportions of missingness in $D$ artificially drive up the
  sparsity of the estimated $S$ matrix. This is because it is not possible
  to recover a sparse event in $S$ when the corresponding entry in $D$ is
  unobserved. By definition, sparse events in $S$ cannot be explained by
@@ -202,12 +203,16 @@ relative differences:
 | Supports missing values? | _Yes_ | _Yes_ |
 | Supports LOD penalty? | _Yes_ | _Yes_ |
 | Supports non-negativity constraint? | _Yes_ | _No_ |
-| Rank determination? | _Autonomous_ | _User-defined_ |
+| Rank determination? | _Autonomous_ | _User-defined_ * |
 | Sparse event identification? | _Autonomous_ | _Autonomous_ |
 | Optimization approach? | _ADMM_ | _Iterative rank-based_ |

+* `rrmc()` can be paired with the cross-validated `grid_search_cv()` function
+for autonomous rank determination.
+
 Convex PCP via `root_pcp()` is best for data characterized
-by rapidly decaying singular values, indicative of very well-defined latent patterns.
+by rapidly decaying singular values (e.g. image and video data),
+indicative of very well-defined latent patterns.

 Non-convex PCP with `rrmc()` is best suited for data characterized by slowly decaying singular values,
 indicative of complex underlying patterns and a relatively large degree of noise. Most EH data can be
@@ -228,7 +233,7 @@ Moreover, convex PCP approaches are best suited to instances in which the target
 low-rank matrix $L_0$ can be accurately modelled as low-rank (i.e. $L_0$ is
 governed by only a few very well-defined patterns). This is often the case with
 image and video data (characterized by rapidly decaying singular values), but
-not common for EH data. EH data is typically is only approximately low-rank
+not common for EH data. EH data is typically only approximately low-rank
 (characterized by complex patterns and slowly decaying singular values).

 The convex model available in `pcpr` is `root_pcp()`. For a comprehensive
@@ -260,8 +265,12 @@ provide this flexibility by allowing the user to interrogate the data at
 different ranks.

 The drawback here is that non-convex algorithms can no longer determine the rank
-best describing the data autonomously, instead requiring the researcher to
-subjectively specify the rank $r$ as in PCA. One of the more glaring trade-offs made
+best describing the data on their own, instead requiring the researcher to
+subjectively specify the rank $r$ as in PCA. However, by pairing non-convex PCP algorithms
+with the cross-validation routine implemented in the `grid_search_cv()` function,
+the optimal rank can be determined semi-autonomously; the researcher need only define
+a rank _search space_ from which the _optimal rank will be identified via grid search_.
+One of the more glaring trade-offs made
 by non-convex methods for this improved run-time and flexibility is weaker
 theoretical promises; specifically, non-convex PCP runs the risk of finding
 spurious _local_ optima, rather than the _global_ optimum guaranteed by their
@@ -383,7 +392,7 @@ $\xi$ is relatively low, e.g. $\xi = 0.05$, or 5%, and $K$ is relatively high, e
  set is obtained each run, providing balanced coverage of $D$. Viewed another way, the smaller $K$ is, the more
  the results are susceptible to overfitting to the relatively few selected test sets.

-### Interpretaion of results
+### Interpretation of results

 Once the grid search of has been conducted, the optimal hyperparameters can be chosen by examining the output
 statistics `summary_stats`. Below are a few suggestions for how to interpret the `summary_stats` table: