Compare two prediction errors from RF in toy data by Fuhan-Yang · Pull Request #278 · CDCgov/cfa-vaccination-coverage-forecasting

Fuhan-Yang · 2026-03-12T19:10:35Z

Here is an EDA of the prediction errors from random forest using toy data. The data are generated in a way that suits linear regression. Linear regression and random forest are then fit to the training data (80% of the total data) and used for out-of-sample prediction. The prediction mean is plotted along with the prediction interval. For random forest, the intervals were calculated in two ways: V_UIJ, used by Wager et al., and out-of-bag (OOB) error, used by Lu et al., with the number of trees set to 100, 1000, 5000, and 10000. The prediction intervals are fairly consistent as the number of trees increases. The OOB method produces wider intervals than V_UIJ. Note that for OOB the interval width is the same across all predictions, which assumes that the uncertainty is constant across time. Applied to vaccine coverage, this assumes that the prediction uncertainty of end-of-season coverage is the same as that of the coverage one month after the forecast date. @swo
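To make the constant-width point concrete, here is a minimal sketch, not the code in this PR. It assumes scikit-learn's `RandomForestRegressor` and a simplified OOB interval built from empirical quantiles of the out-of-bag residuals, which may differ from the exact estimator of Lu et al. The toy data, seed, and variable names are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy data suited to linear regression (assumed form: y = 2x + 1 + noise).
rng = np.random.default_rng(0)
n = 200
X = rng.uniform(0, 10, size=(n, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 1.0, size=n)

# 80/20 train/test split, as described above.
n_train = int(0.8 * n)
X_train, X_test = X[:n_train], X[n_train:]
y_train, y_test = y[:n_train], y[n_train:]

# oob_score=True makes the fitted forest expose oob_prediction_.
rf = RandomForestRegressor(n_estimators=500, oob_score=True, random_state=0)
rf.fit(X_train, y_train)

# Out-of-bag residuals on the training set.
oob_resid = y_train - rf.oob_prediction_

# Simplified 95% interval: shift each point prediction by the
# 2.5% and 97.5% quantiles of the OOB residuals.
lo_q, hi_q = np.quantile(oob_resid, [0.025, 0.975])
pred = rf.predict(X_test)
oob_lower = pred + lo_q
oob_upper = pred + hi_q

# The same two offsets are applied to every prediction,
# so the interval width cannot vary across test points.
oob_width = oob_upper - oob_lower
```

Because the offsets come from the pooled residual distribution, `oob_width` is identical for every test point, which is the constant-uncertainty assumption noted above.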

swo · 2026-03-12T21:02:34Z

This is still kind of a complicated example. See #279 for something very very simple.

I'm also still confused about how we'd use any method that does test/training split or in- vs. out-of-bag distinctions. We're interested in forecasting, so the target isn't in-bag or out-of-bag; it's just not in the dataset.

My emerging conclusion is to either (1) take the interval over trees or (2) switch to gradient boosting.

swo · 2026-03-23T15:36:52Z

We agreed to use an interval over the individual trees' predictions.
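A minimal sketch of what an interval over trees could look like, assuming scikit-learn's `RandomForestRegressor` and per-point empirical quantiles of the individual trees' predictions. The toy data, the 95% level, and the names `X_new` and `tree_preds` are assumptions for illustration, not the project's implementation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Toy training data (assumed form: y = 2x + 1 + noise).
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(0, 1.0, size=200)

rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(X, y)

# Forecast targets are simply new inputs that are not in the dataset;
# no train/test or in-bag/out-of-bag distinction is needed here.
X_new = np.linspace(0, 10, 25).reshape(-1, 1)

# One row of predictions per fitted tree: shape (n_trees, n_new).
tree_preds = np.stack([tree.predict(X_new) for tree in rf.estimators_])

# The forest's point prediction is the mean over trees.
mean_pred = tree_preds.mean(axis=0)

# Interval over trees: per-point quantiles of the trees' predictions.
lower, upper = np.quantile(tree_preds, [0.025, 0.975], axis=0)
width = upper - lower
```

Unlike a pooled-residual interval, `width` here can differ from one prediction point to the next. Note that this spread reflects disagreement among trees rather than observation noise, so it is not a calibrated prediction interval on its own.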

Fuhan-Yang added 2 commits March 12, 2026 13:53

compare two rf errors in toy data

420e4a6

add statsmodels

812d7e9

swo mentioned this pull request Mar 12, 2026

Confidence interval of random forest #276

Closed

swo closed this Mar 23, 2026

swo deleted the fy_rf_error branch March 23, 2026 15:38

Fuhan-Yang wants to merge 2 commits into main from fy_rf_error
