Auto-correlation between samples (Binkley et al.) #201

I recently found [this paper](http://www.cs.loyola.edu/~binkley/papers/icpc14-lda.pdf) by Binkley et al.

A short extract from the paper follows. It refers to three Gibbs sampler settings:

- b – the number of burn-in iterations
- n – the number of samples (random variates)
- si – the sampling interval

> If si is large enough, the observations are practically independent. However, too small a value risks unwanted correlation. To summarize the effect of b, n, and si: if any of these settings are too low, then the Gibbs sampler will produce inaccurate or inadequate information; if any of these settings are too high, then the only penalty is wasted computational effort.
> Unfortunately, as described in Section 6, support for extracting interval-separated observations is limited in existing LDA tools. For example, Mallet provides this capability but appears to suffer from a local maxima problem

with a footnote linking to http://www.cs.loyola.edu/~binkley/topic_models/additional-images/mallet-fixation/
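To make the roles of b, n, and si concrete, here is a minimal sketch of burn-in and interval-separated sampling. It is not Mallet code and not an LDA sampler: it assumes a toy Gibbs sampler over a bivariate normal, and the names `gibbs_samples`, `lag1_autocorrelation`, `rho`, and `seed` are made up for illustration. It only shows how the lag-1 auto-correlation between kept samples can be measured for different values of si.

```python
import random


def gibbs_samples(b, n, si, rho=0.95, seed=0):
    """Toy Gibbs sampler for a standard bivariate normal with correlation rho.

    b  : number of burn-in iterations discarded before any sample is kept
    n  : number of samples to keep
    si : sampling interval (Gibbs iterations between kept samples)

    Returns the n kept values of the first coordinate.
    """
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5  # conditional standard deviation
    x, y = 0.0, 0.0
    kept = []
    for it in range(1, b + n * si + 1):
        # One Gibbs sweep: resample each coordinate given the other.
        x = rng.gauss(rho * y, sd)
        y = rng.gauss(rho * x, sd)
        # After burn-in, keep every si-th state.
        if it > b and (it - b) % si == 0:
            kept.append(x)
    return kept


def lag1_autocorrelation(xs):
    """Sample lag-1 auto-correlation of a sequence of numbers."""
    mean = sum(xs) / len(xs)
    var = sum((v - mean) ** 2 for v in xs)
    cov = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(len(xs) - 1))
    return cov / var


dense = gibbs_samples(b=500, n=2000, si=1)     # consecutive states
thinned = gibbs_samples(b=500, n=2000, si=50)  # interval-separated states
print("lag-1 auto-correlation, si=1: ", lag1_autocorrelation(dense))
print("lag-1 auto-correlation, si=50:", lag1_autocorrelation(thinned))
```

In this toy chain consecutive states are strongly correlated, while states 50 iterations apart are close to independent, at the cost of 50 times as many Gibbs iterations. The same kind of check could in principle be applied to any per-sample statistic written out by an LDA tool.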

Does this problem still exist in current versions of Mallet?

Reference: 
Binkley, D., Heinz, D., Lawrie, D., & Overfelt, J. (2014). Understanding LDA in source code analysis. 22nd International Conference on Program Comprehension, ICPC 2014 - Proceedings, 26–36. https://doi.org/10.1145/2597008.2597150

