Holdout Sample #1021

rajeshpaleti999 · 2025-09-10T12:27:57Z

I ran the model under two scenarios of holdout samples:

1. Treating MMM input data as time series and using a time-based split to define the training and holdout samples (e.g., everything beyond a cut-off date is holdout).
2. Treating MMM input data as cross-sectional and using a random split across all time periods.

The goodness-of-fit metrics, R-squared in particular, were markedly worse in Option 1 (the time-series split), whereas performance was comparable between the training and holdout samples in Option 2 (the random split). In both cases, I used an 80-20 split between training and holdout samples.

I remember reading that Meridian recommends the random split (Option 2). Is this correct? If so, how can this be reconciled with the time-series practice of not peeking into the future when validating or testing models?

dattatreyam23 · 2025-09-12T04:21:12Z

Hi @rajeshpaleti999,

Thank you for contacting us!

I would like to inform you that Meridian is based on Bayesian Causal Inference, whose primary intent is the accurate estimation of causal marketing effects. It is not intended to be used for future predictions as a forecasting model. A high R-squared doesn't guarantee a good causal inference model, and the best predictive model might not be the best for causal inference. Hence, a model with 99% out-of-sample R-squared can still be a poor model for causal inference.

We recommend a holdout sample balanced across geos and time periods, with similar observations for each. An imbalanced sample can lead to insufficient training data for geo or time effects. Meridian doesn't specify a holdout sample by default; you must define one and ensure its balance.
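
As a rough illustration of such a holdout sample, here is a minimal NumPy sketch that builds a boolean mask of shape (n_geos, n_times), where True marks a held-out observation. The function name `balanced_random_holdout` and its parameters are hypothetical and not part of Meridian. To my understanding, Meridian's `ModelSpec` accepts a boolean `holdout_id` array of this shape, but please confirm that against the current documentation before relying on it.

```python
import numpy as np


def balanced_random_holdout(n_geos, n_times, holdout_frac=0.2, seed=0):
    """Return a boolean (n_geos, n_times) mask; True marks a holdout cell.

    Hypothetical helper, not a Meridian API. Every geo gets exactly the
    same number of held-out periods; balance across time periods is only
    approximate because the periods are drawn at random per geo.
    """
    rng = np.random.default_rng(seed)
    n_holdout = int(round(holdout_frac * n_times))
    mask = np.zeros((n_geos, n_times), dtype=bool)
    for g in range(n_geos):
        # Draw distinct time indices for this geo.
        held = rng.choice(n_times, size=n_holdout, replace=False)
        mask[g, held] = True
    return mask


mask = balanced_random_holdout(n_geos=10, n_times=100)
print(mask.sum(axis=1))  # held-out periods per geo
print(mask.sum(axis=0))  # held-out geos per time period
```

Checking the row and column sums, as in the last two lines, is a quick way to confirm that no geo or time period is left with too few training observations.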

When you hold out the most recent data, as in Option 1, prediction becomes a difficult task, especially if there are shifts in trend or seasonality in the holdout period that were not present in the training data. The lower R-squared might reflect this difficulty.
A time-based split, where you hold out a contiguous block of time at the end of your data, can cause problems for a causal inference model like Meridian. The model uses knots to model trend and seasonality. If a large chunk of time is held out, there is no training data near the knots in that period. In this case, the posterior distribution of those knots is driven by the prior, which can result in poor forecasting.
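
The difference in training coverage can be made concrete with a small NumPy sketch. It builds one time-based and one random 80-20 holdout mask and counts the time periods in which no geo contributes a training observation. All sizes and the helper name `periods_without_training` are made up for illustration. The sketch does not use Meridian itself.

```python
import numpy as np

# Illustrative sizes only (assumed, not from the thread).
N_GEOS, N_TIMES, N_HOLDOUT = 5, 100, 20

# Time-based split: the last 20 periods are held out in every geo.
time_split = np.zeros((N_GEOS, N_TIMES), dtype=bool)
time_split[:, -N_HOLDOUT:] = True

# Random split: 20 randomly chosen periods are held out per geo.
rng = np.random.default_rng(0)
random_split = np.zeros((N_GEOS, N_TIMES), dtype=bool)
for g in range(N_GEOS):
    random_split[g, rng.choice(N_TIMES, size=N_HOLDOUT, replace=False)] = True


def periods_without_training(holdout_mask):
    """Count time periods that are held out in every geo."""
    return int(holdout_mask.all(axis=0).sum())


print(periods_without_training(time_split))    # 20: the whole trailing block
print(periods_without_training(random_split))  # typically 0 or close to it
```

With the time-based split, any knots that fall in the trailing block have no nearby training data, which is the situation described above. With the random split, almost every period keeps some training observations.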

In short, the two methods are testing different things. The time-series split tests forecasting ability, while the random split tests the model's ability to generalize to unseen data from the same time period. Given that Meridian's primary purpose is causal inference, the random split is the more appropriate method for evaluating the model's fit.

I hope this helps. Feel free to reach out if you have any questions or suggestions regarding Meridian.

Thank you,

Google Meridian Support Team
