Clarify multilevel argument #253

Open
@bwiernik

I find the name of the multilevel argument confusing, and several recent issues from users have expressed similar confusion.

Based on the name, I would expect a decomposition of the correlation matrix into between-groups and within-groups components, similar to psych::statsBy(). The between-groups component is the correlation matrix among group means; the within-groups component is the pooled within-group correlation matrix (computed as the correlations among group-mean-centered variables). In my experience (at least in psychology circles), this is what phrases like "multilevel factor analysis", "multilevel SEM", or "multilevel correlations" typically mean.
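To make the decomposition concrete, here is a minimal sketch in plain Python (standard library only). It is not code from this package or from psych; the function names `pearson` and `decompose` and the toy data are hypothetical, and the between-groups part counts each group once regardless of its size, which is an assumption of this sketch.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def decompose(x, y, group):
    """Return (between, within) correlations of x and y for a grouping variable."""
    rows = defaultdict(list)
    for a, b, g in zip(x, y, group):
        rows[g].append((a, b))

    # One (mean of x, mean of y) pair per group.
    means = {
        g: (sum(a for a, _ in r) / len(r), sum(b for _, b in r) / len(r))
        for g, r in rows.items()
    }

    # Between-groups component: correlation among the group means
    # (unweighted: each group counts once, an assumption of this sketch).
    between = pearson([m[0] for m in means.values()],
                      [m[1] for m in means.values()])

    # Within-groups component: correlation among group-mean-centered values.
    xc = [a - means[g][0] for a, g in zip(x, group)]
    yc = [b - means[g][1] for b, g in zip(y, group)]
    within = pearson(xc, yc)
    return between, within


# Toy data: group means rise together, but x and y move in opposite
# directions inside every group.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [3, 2, 1, 6, 5, 4, 9, 8, 7]
group = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]

between, within = decompose(x, y, group)
```

With this toy data the between-groups correlation is +1 and the within-groups correlation is -1, which shows why reporting only one of the two components can be misleading.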

The multilevel argument computes what is effectively the within-groups component described above, but estimated using random effects (random intercepts for group) rather than fixed effects (group-mean-centering or including groups as dummy-coded variables). Both fixed and random specifications of this adjustment are "multilevel" in the sense that they are estimating average within-group correlations, but we currently do not report the between component of the multilevel correlations in either specification.

I think a name like random_factors would be clearer. It would signal that what the argument switches is how factors are partialed out (as random effects rather than fixed effects).

Estimating correct point estimates/df/p/CIs for both within-group and between-group correlations is easy for fixed factor controls (known analytic solutions).
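As a rough illustration of the fixed-factor case, the sketch below uses one standard set of formulas: residual df of N - k - 1 for the within-group correlation (k group means plus one slope are estimated), k - 2 for the correlation among k group means, and a Fisher z interval with standard error 1 / sqrt(df - 1). These are my assumptions about which analytic solutions are meant, and the function names are hypothetical. Only the t statistic and CI are computed, since the t distribution is not in the Python standard library.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def within_df(n_obs, n_groups):
    """Residual df after removing n_groups group means and fitting one slope."""
    return n_obs - n_groups - 1


def between_df(n_groups):
    """Residual df when the group means are treated as the observations."""
    return n_groups - 2


def fixed_effects_inference(r, df, level=0.95):
    """Return the t statistic and a Fisher z confidence interval for r."""
    t = r * sqrt(df / (1 - r ** 2))
    # Fisher z standard error; equals the familiar 1 / sqrt(n - 3) when df = n - 2.
    se = 1 / sqrt(df - 1)
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    z = atanh(r)
    return t, (tanh(z - crit * se), tanh(z + crit * se))


# Hypothetical example: 100 observations in 10 groups, within-group r = 0.5.
t_within, (ci_low, ci_high) = fixed_effects_inference(0.5, within_df(100, 10))
```

The same function can be called with `between_df(10)` and the correlation among group means to get the between-groups counterpart.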

For random factor controls, we can get reasonable point estimates/df/p/CIs for the within-group correlations using our current estimation approach plus some choice of profile likelihood or DoF approximation. I'd argue that just using the fixed effects df would also be close enough.

For between-group correlations, we can either (1) pivot to a long format, fit a model with 0 + name + (0 + name | id), take the correlation from the estimated random-effects covariance, and use profile likelihood for the CI, or (2) use our current estimation approach, estimate random effects for persons, and then compute the correlations among those post hoc, using the fixed effects df. The second option is probably close enough.
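For reference, here is how I read the long-format model in option (1), written out as a sketch. The notation is mine, not taken from the package: i indexes observations, j indexes the grouping unit (id), v and w index variables (name), and I assume a single residual variance, as in a default mixed-model fit.

```latex
y_{ijv} = \mu_v + u_{jv} + e_{ijv}, \qquad
(u_{j1}, \dots, u_{jp})^\top \sim \mathcal{N}(0, \Sigma_B), \qquad
e_{ijv} \sim \mathcal{N}(0, \sigma^2)

r^{\text{between}}_{vw} = \frac{\Sigma_{B,vw}}{\sqrt{\Sigma_{B,vv}\,\Sigma_{B,ww}}}
```

Under this reading, 0 + name gives the variable-specific means, (0 + name | id) gives the variable-specific random intercepts with an unstructured covariance matrix, and the between-group correlation is the corresponding standardized off-diagonal element of that matrix.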

Labels: docs 📚 (Something to be addressed in docs and/or vignettes); enhancement 💥 (Implemented features can be improved or revised)
