Skip to content

feat(mmm): OptimizableMuEffect protocol + DiscountedEventEffect (discount-depth lever)#2621

Draft
PabloRoque wants to merge 3 commits into
pymc-labs:v1.0.0from
PabloRoque:optimizable-mueffect
Draft

feat(mmm): OptimizableMuEffect protocol + DiscountedEventEffect (discount-depth lever)#2621
PabloRoque wants to merge 3 commits into
pymc-labs:v1.0.0from
PabloRoque:optimizable-mueffect

Conversation

@PabloRoque

@PabloRoque PabloRoque commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Summary

Adds two new public classes and a worked example notebook for optimising promotional discount depth jointly with paid media budgets.

OptimizableMuEffect (abstract)

A new protocol layer on top of MuEffect for spend-driven effects that participate in BudgetOptimizer's joint allocation. Subclasses implement:

  • replace_for_optimization(budget_slice, num_periods, budget_distribution) — return pm.do() replacements converting monetary spend to model variables.
  • set_budget_for_sampling(budget_per_item, model) — apply the optimised budget before posterior predictive sampling.
  • budget_channel_names — ordered list of channel/event names matching the budget_dim coordinate.

The optimizer discovers OptimizableMuEffect instances at construction time, extends the flat budget variable with their channels, and routes per-channel allocations back via set_budget_for_sampling.

DiscountedEventEffect

Concrete OptimizableMuEffect for promotional events (Black Friday, summer sale, …) where the lever is discount depth. Uses the full revenue-retention formula:

$$lift_k = beta_k * ln(1 + d_k) * (1 - d_k) - d_k * r_k$$
  • ln(1+d) * (1-d): hump-shaped volume × price-retention term.
  • -d_k * r_k: margin cost — fraction d_k of the baseline per-period revenue r_k is forgone.
  • r_k = event_revenue / n_periods, normalised by target_scale, auto-computed from the observed y. Registered as a pmd.Data node.
  • lift(0) = 0, lift(1) = -r_k (100% discount → zero net revenue per period).

Key design decisions:

  • event_revenue and revenue_per_period are always internal — no user column required.
  • discount_min / discount_max set budget bounds at constructor time.
  • The Model protocol is not polluted with MMM-specific attributes (scalers accessed via getattr).

BudgetOptimizer / BudgetOptimizerWrapper integration

  • budget_optimizer.py: extend _budgets_flat with effect channels; call replace_for_optimization for each effect; route allocations back in sample_response_distribution.
  • mmm.py: 3-phase sample_response_distribution (clone → set media data → set effect data) to prevent NaN expansion in channel_contribution.
  • constraints.py: fix budgets_flat slice to use full combined length.
  • utils.py: relax create_zero_dataset validation for effect-only optimisation windows.
  • __init__.py: export MuEffect, OptimizableMuEffect, DiscountedEventEffect.

Tests

51 tests in tests/mmm/test_additive_effect.py covering:

  • OptimizableMuEffect ABC contract
  • DiscountedEventEffect model variables, forward formula, per-event revenue computation
  • replace_for_optimization / set_budget_for_sampling correctness
  • BudgetOptimizer flat-variable sizing and end-to-end allocate_budget
  • sample_response_distribution with mixed media + event channels (no NaN)
  • Serialisation roundtrip

Notebook

docs/source/notebooks/mmm/mmm_discounted_events.ipynb — full worked example: synthetic DGP → fit → posterior lift curves → joint budget optimisation → discount prescription table → response distribution comparison.

Checklist

  • All 51 tests pass
  • ruff check, ruff format, mypy pass
  • pre-commit hooks pass on all changed files
  • Notebook validates (nbformat)
  • Gallery entry added

Introduces two new classes:

- OptimizableMuEffect: abstract base for spend-driven mu effects that
  participate in BudgetOptimizer's joint allocation problem.  Subclasses
  implement replace_for_optimization(), set_budget_for_sampling(), and
  budget_channel_names so the optimizer can discover and route budgets
  to arbitrary effect types.

- DiscountedEventEffect: concrete implementation for promotional events
  (e.g. Black Friday, summer sale) where the lever is discount depth.
  Uses the full revenue-retention formula:

    lift_k = beta_k * ln(1 + d_k) * (1 - d_k) - d_k * r_k

  where r_k = event_revenue_k / n_periods_k (average baseline revenue per
  in-sample period, normalised by target_scale).  This makes lift(0) = 0
  and lift(1) = -r_k (100% discount drives total revenue to zero).

  - event_revenue and revenue_per_period are auto-computed from y
  - r_k is registered as a pmd.Data node in the normalized model scale
  - discount_min / discount_max bound the optimizer's budget range
  - budget_bounds returns [(d_min * rev, d_max * rev)] per event

51 tests added covering the full contract: OptimizableMuEffect ABC,
DiscountedEventEffect model variables, replace_for_optimization,
set_budget_for_sampling, BudgetOptimizer integration end-to-end, and
serialization roundtrips.
- BudgetOptimizer discovers OptimizableMuEffect instances on the model
  at construction time and extends _budgets_flat with their channels so
  all spend types share one flat variable and the total-budget constraint
  is meaningful across the full portfolio.

- replace_optimizable_mueffects() is called once during __init__; each
  effect's replace_for_optimization() returns {var_name: tensor} dicts
  that are merged and passed to a single pm.do() call.

- BudgetOptimizerWrapper.sample_response_distribution() routes per-channel
  budgets back to the originating effect via set_budget_for_sampling()
  before calling sample_posterior_predictive, and keeps event channels
  out of the media channel_contribution coordinate so no NaN expansion
  occurs.

- MMM.sample_response_distribution uses a 3-phase approach:
  (1) clone model, (2) set media data, (3) set effect data; avoids the
  NaN issue that arose when effect channels appeared in the channel dim.

- constraints.py: fix budgets_flat slice to use the full combined length.
- utils.py: relax create_zero_dataset validation for effect-only windows.
- __init__.py: export MuEffect, OptimizableMuEffect, DiscountedEventEffect.
Adds mmm_discounted_events.ipynb demonstrating DiscountedEventEffect
end-to-end:

- Synthetic DGP with three promotional event windows; ground-truth lift
  follows beta*ln(1+d)*(1-d) - d*r_k (full revenue-retention formula).
- MMM build/fit with DiscountedEventEffect attached as a mu_effect.
- Posterior lift curves per event with 94 % HDI bands, including the
  -d*r_k baseline-cost term so curves can go negative at high discount.
- Joint budget optimization across paid media + promotional events.
- Discount prescription table snapped to 5 % grid steps.
- sample_response_distribution comparison (current vs optimal).

gallery.yaml / gallery.md: add entry under Budget Allocation section.
@review-notebook-app

Copy link
Copy Markdown

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@github-actions github-actions Bot added docs Improvements or additions to documentation MMM tests labels Jun 9, 2026
@codecov

codecov Bot commented Jun 9, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 86.73469% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.96%. Comparing base (e890c81) to head (0f7e489).

Files with missing lines Patch % Lines
pymc_marketing/mmm/additive_effect.py 83.58% 22 Missing ⚠️
pymc_marketing/mmm/mmm.py 80.00% 3 Missing ⚠️
pymc_marketing/mmm/budget_optimizer.py 97.72% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           v1.0.0    #2621      +/-   ##
==========================================
- Coverage   94.04%   93.96%   -0.08%     
==========================================
  Files          97       97              
  Lines       14532    14714     +182     
==========================================
+ Hits        13666    13826     +160     
- Misses        866      888      +22     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@daimon-pymclabs

Copy link
Copy Markdown
Contributor

Detailed review — OptimizableMuEffect + DiscountedEventEffect

Reviewed the full diff (excluding the notebook): additive_effect.py, budget_optimizer.py, mmm.py, constraints.py, utils.py, __init__.py, and the test module. Overall this is a clean, well-documented extension. The abstract/concrete split is sensible, the single-do() design to preserve the gradient path is the right call, and the docstrings are excellent. Below are the things worth resolving before merge, ordered by importance.

Design question (the crux — please confirm intent)

The discount cost appears to be represented twice. Each event's "budget" is defined as discount_pct × event_revenue (see replace_for_optimization and budget_bounds), and that figure is summed together with real media spend in the equality constraint sum(budgets_flat) == total_budget. Separately, the forward model already subtracts the margin forgone via the -d·r_k term in create_effect, and _compile_objective_and_grad adds that contribution into the maximised response.

So the revenue given up on a discount is charged once as budget consumed (it competes with media for the fixed pot) and once again as a reduction in predicted revenue. Depending on what you intend, that may be correct (two distinct economic effects: an allocation lever and a P&L impact) or a double-count. Worth a sentence in the class docstring making the intended semantics explicit, because a reader will trip on it. Relatedly: media total_budget is real cash out, whereas d·event_revenue is notional forgone revenue — treating €1 of each as fungible in one equality constraint is a strong modelling assumption that deserves a callout in the docs/notebook.

Important

  1. Custom constraints silently change meaning (constraints.py). compile_constraints_for_scipy now passes budgets_flat (1-D, dims=("budgets_flat",)) where it previously passed _budgets (the reshaped, budget_dims-shaped tensor). The default sum constraint is fine — and is in fact now correctly portfolio-wide — but any user-defined constraint that relied on the dimensional structure of budgets (e.g. selecting a named channel sub-group) will break or silently misbehave. This is a behavioural change to the public constraint API. Please call it out in the changelog/PR description, and ideally keep the optimizer handle as the documented way to reach _budgets for dimensional constraints.

  2. Effect-channel append assumes a 1-D channel budget (budget_optimizer.allocate_budget). xr.concat([optimal_budgets, effect_da], dim="channel") builds effect_da with only a channel dim, while optimal_budgets carries self._budget_dims. For a multi-dimensional budget (e.g. channel × geo) this concat will either fail or broadcast incorrectly. If multi-dim budgets aren't supported with optimizable effects yet, a guard with a clear error beats a confusing xarray exception.

  3. channel_type dict stored in attrs. optimal_budgets.attrs["channel_type"] is a dict. If anything downstream serialises this DataArray to netCDF/Zarr it will fail (attrs must be scalars/arrays/strings). Fine if it never round-trips through disk, but worth confirming or storing as a coordinate/JSON string instead.

Minor / nits

  • reference_date is undocumented. It's a constructor field (default "2025-01-01") but missing from the Parameters section of the DiscountedEventEffect docstring. Also worth a word on why the default is fine even when events predate it (negative offsets are harmless).
  • Effect allocations aren't echoed in the response dataset. sample_response_distribution now restricts allocation/total_allocation to self.channel_columns (correct — it fixes the NaN expansion). Side effect: the optimised discount budgets don't appear anywhere in the returned dataset's allocation. Intentional? A caller inspecting the response can't see what discount was applied without the original allocation_strategy.
  • Dead/renamed name in tests. test_missing_replace_for_optimization_raises defines a subclass with optimizable_channel_names, which isn't part of the contract (the property is budget_channel_names). Harmless leftover, but confusing.
  • Required-columns test is thin. test_missing_required_columns_raises only exercises the missing-start_date path; the discount_min > discount_max and out-of-range discount_pct validation branches in model_post_init aren't covered.
  • idata round-trip of df_events. idata_groups uses xr.Dataset.from_dataframe(...) and the deserializer does to_dataframe().reset_index(). start_date/end_date are strings → object dtype in xarray; worth a serialisation round-trip test to confirm dtypes (and the discount_pct column) survive intact.
  • getattr(effect, "budget_bounds", None) only falls back on AttributeError. Since budget_bounds is always defined on DiscountedEventEffect and raises RuntimeError when _event_revenue is None, the fallback never triggers for it — which is fine, but the intent (per-effect override else default) reads more clearly with an explicit hasattr/try or a base-class budget_bounds returning None.

Things done well

  • Single combined do() call with the documented rationale (avoiding a second fgraph_from_model traversal that would clone _budgets_flat) — exactly the kind of subtle gradient-severing bug that's easy to introduce; good that it's both fixed and commented.
  • Explicitly augmenting response_distribution with extract_response_distribution(contribution_var_name) rather than the raw model lookup, so the objective sees the posterior rather than test values — correct and well-explained.
  • The getattr(mmm, "scalers", ...) approach to avoid polluting the Model protocol is the right trade-off.
  • Zero-revenue guards in both replace_for_optimization (switch) and set_budget_for_sampling (np.where) — good symmetry between the symbolic and concrete paths.

Nothing here is a hard blocker except confirming the design question and the constraint-API behavioural change; the rest are polish. Nice work.

— review requested via Daimon

@juanitorduz

Copy link
Copy Markdown
Collaborator

@PabloRoque , some initial thoughts from Daimon (I have found these reviews complementary to human ones ;) . You can re-trigger this again whenever you want more feedback :)

@cetagostini

Copy link
Copy Markdown
Contributor

I did not fully review but, do we want to make this part of optimozer? It feels, we could keep optimizer agnostic, and add this like helpers.

Plus, if Daimon it's right, I'm a bit worry about behavior changes e.g: constraints 🧐

@williambdean

Copy link
Copy Markdown
Contributor

Just flagging
#2425
in order to keep these changes future proof

@cetagostini

cetagostini commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Note: this comment was drafted with AI assistance (Fable 5) after a deep review of the branch, #2570, and #2425.

Following up on my earlier comment after a proper deep dive. First off: the idea here is incredible and genuinely valuable — joint optimization of discount depth with media budgets is something some users ask for, the lift formula is well thought out, the single-do() gradient handling is subtle work done right, and the notebook/docs quality is above our average. None of the feedback below is about the what, only about the where.

My concern is that the implementation moves in the opposite direction of what we've been building toward. @williambdean's design in #2425BudgetOptimizer(model, idata) — makes the optimizer a pure graph-level tool: decision variables, objective, constraints, bounds, and nothing else. This PR does the inverse: it imports OptimizableMuEffect into the optimizer, introspects mmm_model.mu_effects at construction, bakes shared-pot semantics into _budgets_flat, and changes the constraint_fun contract globally (every custom constraint now receives the flat vector instead of the dims-shaped budgets — a breaking change for all users, including ones not touching this feature, and it collides with #2570). I fully support William's direction here: the optimizer should only ever learn new graph capabilities, never new marketing concepts. A good litmus test going forward: if a feature requires budget_optimizer.py to import from additive_effect.py or mmm.py, it's at the wrong layer.

The good news is that I think we can keep all of the value with none of the optimizer surgery:

  1. encode the economics in the model and let the existing optimizer work untouched. The promo lever enters as a channel_data column (spend = d × event_revenue, pinned to the event window via budget_distribution_over_period), the lift curve becomes a custom SaturationTransformation, and MaskedPrior on the adstock gives the promo channel identity dynamics — exactly the composition patterns we already teach in the GAM options notebook.
  2. With that pivot the contribution flows into channel_contribution/total_media_contribution_original_scale natively, so the objective augmentation, the sample_response_distribution patches, the xr.concat of effect channels, and the constraints change all become unnecessary by construction — and shared-pot vs. separate-pot stops being library policy and becomes an explicit user constraint.

So my proposal: keep DiscountedEventEffect's math and the excellent notebook narrative, but reshape this as a modeling recipe (and revisit a generic optimizer hook only after #2570/#2425 land, if a lever shows up that genuinely can't be expressed in the model graph). Happy to pair on the rework — the hard parts (the math, the notebook, most tests) carry over almost intact.

@ricardoV94

Copy link
Copy Markdown
Contributor

Took a look.

Let me know if I missed the point. My understanding is we want to optimize other data variables besides media spend (channel data), and you propose we do this via special MuEffects. First question... what about optimizing control variables and other inputs that show elsewhere in the graph? First step seems to be just a way to decide which data-containers / deterministic variables are picked as inputs of the optimization.

Then once we add more optimization inputs we need to decide how they affect the cost. This PR suggests converting these inputs into "budget" so they are on a common denomination and handled by the default cost. But as mentioned by the Bot, in your example you ended up double counting it, because it already affected the predicted revenue / response variable.

(Aside re: response variable. This PR does response_distribution + optimized mu effects. Fine for default model, wrong for log MMM or non-default response_distribution? IMO the optimizer shouldn't try to craft a custom objective like that, it should be a well defined quantity in the original model).

The custom response variable is where the double count materializes, but I don't think it's the issue. The discount should have zero cost (or if you want whatever running the discount campaign costs) if you are modeling revenue (as opposed to #sales or some non monetary quantity), and the optimized variable should be a [0, 1] discount lever.

More obvious free lever: email frequency, bounded by fatigue not money. Would be nice to optimize but it would probably be in a MuEffect that has an inflection point, and needs no artificial link to budget?

Idea: decouple the lever from its cost. Levers enter the optimizer in native units with their own bounds. A "cost" is then just a symbolic expression of the levers (possibly referencing model variables) that you place in whatever constraint the accounting says: the default budget constraint, another one, or none at all. The default optimizer is the special case lever = media spend, cost = identity, placed in the default budget sum constraint. The discount declares no cost term anywhere — it's already in the response if you model revenue. An extra promo day puts its cash cost in the budget constraint. A #sales model that wants to cap giveaways would constrain a custom response variable.

Other notes:

  • I think future events can't be optimized? event_revenue is computed from observed y over the training dates, so an event outside the training window gets revenue 0 and optimized deterministically to 0?
  • My personal bot finding: sample_response_distribution mutates the fitted model. set_budget_for_sampling does pm.set_data({"promo_discount_pct": ...}) and nothing restores it
  • Agree with the other bot finding on the constraint_fun contract change. (My proposal also requires changes btw)
  • The optimized variable -> effect conversion is duplicated between replace_for_optimization and set_budget_for_sampling. Can we reuse the symbolic one and call eval?
    • Note: My proposal doesn't need this at all. The optimizer is in terms of native units which regular set_data should handle. Any links to cost (through objective or constraints) needs no custom representation in the predictive graph.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

docs Improvements or additions to documentation MMM tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants