Isofer/2309 cost per unit part1 #2320

Draft
isofer wants to merge 19 commits into isofer/roas-the-rest from isofer/2309-cost-per-unit-part1

Conversation

@isofer isofer commented Feb 23, 2026

Description

Goal: Support channels measured in non-monetary units (impressions, clicks) while keeping ROAS and budget optimization in spend terms.

This PR introduces cost_per_unit support for two independent purposes that share only the DataFrame parsing logic (_parse_cost_per_unit_df):

1. Historical cost-per-unit — for ROAS calculation

Uses the cost-per-unit from the training period to convert channel_data (in original units) into channel_spend (in monetary units). This precomputed spend is stored in idata.constant_data and used by get_channel_spend() for ROAS calculations in incrementality and plot_interactive.

  • cost_per_unit can be provided at init or post-fit via set_cost_per_unit().
  • After fit: channel_spend = channel_data × cost_per_unit is injected into idata.constant_data.
  • get_channel_spend() returns channel_spend when present, falls back to channel_data (backward compatible).
  • get_channel_data() always returns raw data in original units (needed by the model evaluator).
  • cost_per_unit property on the wrapper derives the rate on-the-fly as channel_spend / channel_data.
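
The historical path above can be illustrated with a minimal plain-Python sketch. It uses nested dictionaries as a stand-in for idata.constant_data and its xarray variables, and the helper names inject_channel_spend and derive_cost_per_unit are hypothetical; only get_channel_spend mirrors a name from this PR, and none of this is the actual implementation.

```python
# Hypothetical stand-in for idata.constant_data: channel -> values per date.
def inject_channel_spend(constant_data, cost_per_unit):
    """Store channel_spend = channel_data * cost_per_unit, elementwise per date."""
    constant_data["channel_spend"] = {
        channel: [units * rate for units, rate in zip(values, cost_per_unit[channel])]
        for channel, values in constant_data["channel_data"].items()
    }


def get_channel_spend(constant_data):
    """Prefer channel_spend; fall back to channel_data when it is absent."""
    return constant_data.get("channel_spend", constant_data["channel_data"])


def derive_cost_per_unit(constant_data):
    """Recover the rate on the fly as channel_spend / channel_data.

    This sketch does not handle periods where channel_data is zero.
    """
    spend = get_channel_spend(constant_data)
    return {
        channel: [s / u for s, u in zip(spend[channel], units)]
        for channel, units in constant_data["channel_data"].items()
    }


constant_data = {"channel_data": {"tv": [1000.0, 2000.0], "search": [50.0, 80.0]}}
legacy_spend = get_channel_spend(constant_data)  # no channel_spend yet: raw data

inject_channel_spend(
    constant_data,
    cost_per_unit={"tv": [0.5, 0.25], "search": [1.0, 1.0]},
)
spend = get_channel_spend(constant_data)
rates = derive_cost_per_unit(constant_data)
```

Here tv is measured in impressions and search is already in spend units (rate 1.0), so only tv changes when the spend is injected.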

2. Future cost-per-unit — for budget optimization

Uses the cost-per-unit for the optimization window to convert monetary budgets into the model's native units before optimization. This is completely independent of the historical cost-per-unit above.

  • cost_per_unit parameter added to optimize_budget() and BudgetOptimizer.
  • Conversion pipeline: monetary budget → time distribution → divide by cost_per_unit[t] → channel scaling → model.
  • Each time period uses its own cost-per-unit rate (no averaging).
  • Output optimal_budgets remain in monetary units for user convenience.
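
The order of the conversion steps can be sketched for a single channel as follows. The function name, the time_weights argument, and the choice to model channel scaling as division by a constant are assumptions made for illustration, not the code in budget_optimizer.py.

```python
def budget_to_model_units(budget, time_weights, cost_per_unit, channel_scale):
    """Convert one channel's monetary budget into scaled native units per period."""
    # 1. Time distribution: split the monetary budget across periods.
    spend_per_period = [budget * w for w in time_weights]
    # 2. Monetary -> native units, using each period's own rate (no averaging).
    units_per_period = [s / c for s, c in zip(spend_per_period, cost_per_unit)]
    # 3. Channel scaling (assumed here to be division by a constant).
    return [u / channel_scale for u in units_per_period]


model_input = budget_to_model_units(
    budget=1000.0,
    time_weights=[0.5, 0.5],
    cost_per_unit=[0.5, 0.25],
    channel_scale=100.0,
)
```

With these numbers each period receives 500 in spend, which buys 1000 units in the first period and 2000 in the second because the rate halves. Averaging the rate across periods would hide that difference.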

Shared: DataFrame parsing

Both use cases share MMM._parse_cost_per_unit_df(), which converts a wide-format DataFrame (rows = date × custom_dims, columns = channels) into an xr.DataArray. Missing channels default to 1.0 (already in spend units).
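
A rough sketch of the parsing rule, using a list of row dictionaries in place of a pandas DataFrame and a nested dictionary in place of an xr.DataArray. It ignores custom dims, the function name parse_cost_per_unit_rows is hypothetical, and the error raised for missing dates is an assumption rather than documented behavior.

```python
def parse_cost_per_unit_rows(rows, dates, channels):
    """Align wide-format rows (one per date) to the model's dates and channels.

    Channels without a column default to 1.0, i.e. already in spend units.
    """
    by_date = {row["date"]: row for row in rows}
    # Assumption: every model date must have a row; the real parser may differ.
    missing = [d for d in dates if d not in by_date]
    if missing:
        raise ValueError(f"cost_per_unit has no rows for dates: {missing}")
    return {
        channel: [float(by_date[d].get(channel, 1.0)) for d in dates]
        for channel in channels
    }


rows = [
    {"date": "2025-01-06", "tv": 0.5},
    {"date": "2025-01-13", "tv": 0.25},
]
parsed = parse_cost_per_unit_rows(
    rows, dates=["2025-01-06", "2025-01-13"], channels=["tv", "search"]
)
```

The search channel has no column in the input, so it receives a rate of 1.0 for every date.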

Why do we store channel_spend instead of cost_per_unit in idata?

  1. Our computations (ROAS, budget optimization, plots) need channel_spend, not the rate. There is no need to store cost_per_unit.
  2. aggregate_idata_time (used in incrementality and for interactive plots) sums every variable with a date dimension. If we stored the rate, time aggregation would produce a summed channel_data and a summed cost_per_unit, and total channel spend cannot be recovered from those two. channel_spend is additive, so summing it over time yields the correct total spend.
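
A small worked example of the additivity argument, with made-up numbers for two periods:

```python
channel_data = [1000.0, 3000.0]  # impressions per period
cost_per_unit = [0.5, 0.25]      # currency per impression

channel_spend = [u * c for u, c in zip(channel_data, cost_per_unit)]

# Summing channel_spend over time gives the true total spend.
total_spend = sum(channel_spend)

# Summing units and rates separately, then multiplying, does not.
naive_total = sum(channel_data) * sum(cost_per_unit)

# Neither does using the mean rate instead of the summed rate.
mean_rate_total = sum(channel_data) * (sum(cost_per_unit) / len(cost_per_unit))
```

The true total is 500 + 750 = 1250, while the summed rate gives 3000 and the mean rate gives 1500. Only the precomputed channel_spend survives aggregation intact.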

Related Issue

Screenshots / Examples

# Historical cost_per_unit — for ROAS
mmm = MMM(
    ...,
    cost_per_unit=cost_per_unit_df,  # wide-format: date, *dims, channel columns
)
idata = mmm.fit(X, y)
# idata.constant_data.channel_spend = channel_data * cost_per_unit

# Or set post-fit
mmm.set_cost_per_unit(cost_per_unit_df, overwrite=False)

# Future cost_per_unit — for budget optimization
result = optimizer.optimize_budget(
    budget=100_000,
    cost_per_unit=future_cost_per_unit_df,  # for optimization window
)

Changes Made

  • _parse_cost_per_unit_df() (static, in multidimensional.py): Shared parser that converts a wide-format DataFrame into an xr.DataArray aligned to model coordinates. Used by both historical and optimizer paths.
  • mmm_wrapper.py: get_channel_spend() prefers channel_spend over channel_data; add get_channel_data() for raw units; add cost_per_unit property (derived on-the-fly).
  • schema.py: Add channel_spend VariableSchema to constant_data (optional).
  • multidimensional.py: cost_per_unit init param; fit-time injection of channel_spend; set_cost_per_unit(); serialize/deserialize _cost_per_unit_input; _parse_cost_per_unit_for_optimizer() on the budget wrapper; cost_per_unit param on optimize_budget().
  • budget_optimizer.py: Add cost_per_unit field and _validate_and_process_cost_per_unit(); convert budgets from monetary to native units in _replace_channel_data_by_optimization_variable (after time distribution, before channel scaling).
  • incrementality.py: Use get_channel_data() instead of get_channel_spend() for baseline_array in evaluator (baseline must be in original units, not spend).

Breaking Changes

  • None. get_channel_spend() falls back to channel_data when channel_spend is absent (backward compatible).

@isofer isofer changed the base branch from main to isofer/roas-the-rest February 23, 2026 14:22
isofer commented Feb 23, 2026

@cursoragent review

@github-actions github-actions bot added the MMM label Feb 23, 2026
codecov bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 27.43363% with 82 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (isofer/roas-the-rest@6314ae3). Learn more about missing BASE report.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pymc_marketing/mmm/multidimensional.py | 20.25% | 63 Missing ⚠️ |
| pymc_marketing/mmm/budget_optimizer.py | 36.84% | 12 Missing ⚠️ |
| pymc_marketing/data/idata/mmm_wrapper.py | 46.15% | 7 Missing ⚠️ |

Additional details and impacted files
@@                   Coverage Diff                   @@
##             isofer/roas-the-rest    #2320   +/-   ##
=======================================================
  Coverage                        ?   92.55%           
=======================================================
  Files                           ?       79           
  Lines                           ?    12703           
  Branches                        ?        0           
=======================================================
  Hits                            ?    11757           
  Misses                          ?      946           
  Partials                        ?        0           

@pymc-labs pymc-labs deleted a comment from cursor bot Feb 23, 2026
isofer commented Feb 23, 2026

@cursoragent review

@github-actions github-actions bot added the tests label Feb 23, 2026
@pymc-labs pymc-labs deleted a comment from cursor bot Feb 23, 2026
isofer commented Feb 23, 2026

@cursoragent review

@pymc-labs pymc-labs deleted a comment from cursor bot Feb 26, 2026