-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathCBCV-Subscription-Based.qmd
288 lines (234 loc) · 13.6 KB
/
CBCV-Subscription-Based.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
---
title: Customer-Based Corporate Valuation for Subscription-Based Businesses
author: Abdullah Mahmood
date: last-modified
format:
html:
theme: cosmo
css: quarto-style/style.css
highlight-style: atom-one
mainfont: Palatino
fontcolor: black
monobackgroundcolor: white
monofont: Menlo, Lucida Console, Liberation Mono, DejaVu Sans Mono, Bitstream Vera Sans Mono, Courier New, monospace
fontsize: 13pt
linestretch: 1.4
number-sections: true
number-depth: 5
toc: true
toc-location: right
toc-depth: 5
code-fold: true
code-copy: true
cap-location: bottom
format-links: false
embed-resources: true
anchor-sections: true
code-links:
- text: GitHub Repo
icon: github
href: https://github.com/abdullahau/customer-analytics/
- text: Quarto Markdown
icon: file-code
href: https://github.com/abdullahau/customer-analytics/blob/main/buyer-behavior-summary-transaction-log.qmd
html-math-method:
method: mathjax
url: https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js
---
### Source
This model presented in this notebook is a direct implementation of *Daniel M. McCarthy, Peter S. Fader and Bruce G. S. Hardie*'s 2016 paper [*Valuing Subscription-Based Businesses Using Publicly Disclosed Customer Data*](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2701093).
I will also implement the models proposed by *Gupta, Sunil, Donald R. Lehmann, and Jennifer Ames Stuart* in their 2004 paper [*Valuing Customers*](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=459595), and *Schulze, Christian, Bernd Skiera, and Thorsten Wiesel* in their 2012 paper [*Linking Customer and Financial Metrics to Shareholder Value: The Leverage Effect in Customer-Based Valuation*](https://journals.sagepub.com/doi/abs/10.1509/jm.10.0280) as benchmarks.
### Imports
```{python}
import pandas as pd
import matplotlib.pyplot as plt
%config InlineBackend.figure_formats = ['svg']
```
## Introduction
The subscription-based business model—where customers pay recurring fees for ongoing access to a product or service—has seen significant growth beyond its traditional domains of media and telecom. Today, it underpins a wide range of consumer and B2B offerings, including SaaS (e.g., Microsoft 365), meal kits (e.g., Blue Apron), and consumer goods (e.g., Dollar Shave Club). A key driver of this model’s appeal is the predictability of revenue streams.
As subscription businesses scale, they increasingly disclose customer-related metrics such as **churn**, **acquisition cost** (CAC), **average revenue per user** (ARPU), and **customer lifetime value** (CLV). These metrics are critical for understanding a firm’s future cash flows—especially since, in subscription settings, customer behavior largely drives economic value. Equity analysts and investors now routinely incorporate this data into valuation models, with subscriber base growth often serving as a key signal.
In this notebook, I implement the framework (and its models) from [McCarthy, Fader, and Bruce (2016)](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2701093), which estimates firm value using only publicly disclosed customer data. Central to this framework are models of the firm's **acquisition** and **retention processes**, which accommodate factors such as *customer heterogeneity*, *duration dependence*, *seasonality*, and *changes in population size*.
We explicitly account for the fact that publicly reported data are typically *aggregated* (temporally and across customers) and suffer from *missingness* (i.e., the reported data are not available for all periods).
I begin by reviewing the principles of customer-based corporate valuation and the nature of the data available from subscription firms. I then introduce models for acquisition, retention, and spending behavior, and estimate their parameters using historical data from two publicly traded firms: Dish Network (DISH) and Sirius XM (SIRI).
Once validated, the model is used to generate forward-looking estimates of customer behavior, firm-level revenue, and ultimately, equity value. I also highlight additional strategic insights that emerge from the estimated parameters and model outputs.
## The Logic of Customer-Based Corporate Valuation
### Valuation 101
The framework summarized here is a discounted cash flow (DCF) model, which is the de-facto industry standard way in which operating assets are valued within the financial community. At the heart of any such valuation exercise is the estimation of period-by-period FCF, central to which are estimates of period-by-period revenue.
Denoting the value of the firm at time $T$ by $\text{SHV}_T$ (for shareholder value), we have
$$
\text{SHV}_T = \text{OA} + \text{NOA}_T - \text{ND}_T
$$
The value of a firm’s operating assets ($\text{OA}$) is equal to the sum of all *expected* future free cash flows (FCFs) the firm will generate, discounted at the weighted average cost of capital ($\text{WACC}$):
$$
\text{OA}_T = \sum_{t=0}^{\infty} \frac{\text{FCF}_{T+t}}{(1+\text{WACC})^{t}}
$$
FCF is equal to the net operating profit after taxes ($\text{NOPAT}$) minus the difference between capital expenditures ($\text{CAPEX}$) and depreciation and amortization ($\text{D\&A}$), minus the change in non-financial working capital ($\Delta\text{NFWC}$):
$$
\text{FCF}_t = \text{NOPAT}_t - (\text{CAPEX}_t - \text{D\&A}_t) - \Delta\text{NFWC}_t
$$
The most important ingredient of FCF is $\text{NOPAT}$, which is a measure of the underlying profitability of the operating assets of the firm. $\text{NOPAT}$ is equal to revenues ($\text{REV}$) times the contribution margin ratio ($1 - \text{VC}$) minus fixed operating costs ($\text{FC}$), after taxes (where $\text{TR}$ is the corporate tax rate for the firm):
$$
\text{NOPAT}_t = \left(\text{REV}_t \times (1 - \text{VC}) - \text{FC}_t \right) \times \left(1- \text{TR}_t\right)
$$
### A Data Structure for Subscription-Based Businesses
{fig-align="center" width="350"}
- $R(m)$ -> Monthly revenue
- $m = 1$ -> first month of commercial operations
- $C(m, m’)$ -> number of customers acquired by the firm in month $m$ who are still active in month $m’$
- $A(m) = C(m, m’)$ -> number of customers acquired each month by the firm
- $L(m)$ -> number of customers “lost” each month
- $L(m)$ = 0 if m = 0, else $C(., m-1) - [C(., m) - C(m,m)]$
- eg. $L(3) = C(., 2) - [C(., 3) - C(3,3)]$
$$
L(m) =
\begin{cases}
0 &\quad m =1 \\
C(., m-1) - [C(., m) - C(m,m)]&\quad m = 2,3,4, \ldots
\end{cases}
$$
- *Aggregate* churn rate = $L(m)/C(.,m-1)$
- $ARPU(m)$ -> Average revenue per subscriber in a given month
- $ARPU(m) = R(m) \big/ \frac{C(.,m-1) + C(.,m)}{2}$
- (The denominator is the average number of customers the firm has during month $m$.)
- $q$ quarter is a temporal aggregation of the true underlying process.
- $q = 1$ is equivalent to $m=\{1,2,3\}$.
- $\text{END}_q = C(., 3q)$
- $\text{ADD}_q = A(3q-2) + A(3q-1) + A(3q)$
- $\text{LOSS}_q = L(3q-2) + L(3q-1) + L(3q)$
- $REV_q = R(3q-2) + R(3q-1) + R(3q)$
- Objective: generate long-run projections of $R(m)$ and $A(M)$ using parameters estimated from $\text{ADD}$, $\text{LOSS}$, $\text{END}$ and $\text{REV}$ numbers
- Account for **missing data**:
- **Left-censoring** - Disclosure records of paying customers does not start from the beginning of commercial operations.
- Some measures are reported for some period of time (e.g., $\text{END}$) before being complemented by other measures (e.g., $\text{ADD}$ and $\text{LOSS}$).
- Account for **aggregation** (quarterly reports to monthly figures)
- Publicly disclosed customer data is typically reported quarterly with the associated unit of time being the quarter, whereas the firm is operating on a finer time interval (which, for the purposes of our analysis, we assume to be the month).
- [US Census Bureau’s Housing Vacancies and Homeownership CPS/HVS](https://www.census.gov/housing/hvs/data/histtabs.html)
- [Quarterly Estimates of the Housing Inventory](https://www.census.gov/housing/hvs/data/histtab8.xlsx):
Alternative Source: FRED (EOCCUSQ176N) Housing Inventory Estimate: Occupied Housing Units in the United States
- **Units**: Thousands of Units, Not Seasonally Adjusted
- **Frequency**: Quarterly
- **Notes**: A housing unit is occupied if a person or group of persons is living in it at the time of the interview or if the occupants are only temporarily absent, as for example, on vacation. The persons living in the unit must consider it their usual place of residence or have no usual place of residence elsewhere. The count of occupied housing units is the same as the count of households.
```{python}
file_path = "data/quarterly-estimates-housing-inventory-US.csv"
household_full = pd.read_csv(file_path)
household_full
```
```{python}
file_path = "data/revised-quarterly-estimates-housing-inventory-US.csv"
household_revised = pd.read_csv(file_path)
household_revised
```
```{python}
household_full_lf = (
pd.wide_to_long(household_full, stubnames="Q", i="Year", j="Quarter")
.sort_index()
.rename(columns={"Q": "Total Household"})
)
household_rev_lf = (
pd.wide_to_long(household_revised, stubnames="Q", i="Year", j="Quarter")
.sort_index()
.rename(columns={"Q": "Total Household"})
)
household_rev_lf.plot(figsize=(9, 4), linewidth=0.75, color="k");
```
```{python}
dish_op_res = pd.read_csv(
'data/DISH - ADD-LOSS-END.csv',
index_col='Date',
parse_dates=True,
dayfirst=True
).sort_index()
dish_op_res.tail()
```
- [U.S. Recession Bars](https://fredhelp.stlouisfed.org/fred/data/understanding-the-data/recession-bars/)
- [Business Cycle Dating](https://www.nber.org/research/business-cycle-dating)
```{python}
# Gross Customer Acquisition Every Quarter
dish_ADD = dish_op_res['Combined Additions'] * 1000
# Surviving Customers at the End of Quarter
dish_END = dish_op_res['Pay-TV Subscribers'] * 1000
# Customer Churn Every Quarter
dish_LOSS = (
dish_op_res["Pay-TV Subscribers"].shift(periods=1)
+ dish_op_res["Combined Additions"]
- dish_op_res["Pay-TV Subscribers"]
) * 1000
```
```{python}
def dish_plot_format(ax, data):
# Recession bar
ax.axvspan(pd.to_datetime("12/1/2007"), pd.to_datetime("06/1/2009"), color='gray', alpha=0.3)
# Padding
pad = pd.DateOffset(months=12)
ax.set_xlim(data.index.min() - pad, data.index.max() + pad)
# x-ticks
ax.set_xticks(data.index[::8])
ax.set_xticklabels([f"Q{((d.month-1)//3)+1} {d.year}" for d in data.index[::8]], rotation=45)
plt.tight_layout()
plt.show()
return
```
```{python}
# Gross Customer Acquisition Every Quarter
_, ax = plt.subplots(figsize=(9, 4))
dish_ADD.plot(ax=ax, color='k', linewidth=0.75, title='ADD', ylim=(-70, 1200))
dish_plot_format(ax, dish_ADD)
```
```{python}
# Customer Churn Every Quarter
_, ax = plt.subplots(figsize=(9, 4))
dish_LOSS.plot(ax=ax, color="k", linewidth=0.75, title="LOSS", ylim=(-70, 1000))
dish_plot_format(ax, dish_LOSS)
```
```{python}
# Surviving Customers at the End of Quarter
_, ax = plt.subplots(figsize=(9, 4))
dish_END.plot(ax=ax, color='k', linewidth=0.75, title='END', ylim=(-800,16000))
dish_plot_format(ax, dish_END)
```
Number of vehicles on the road data:
- [Highway Statistics - State motor-vehicle registrations - MV-1](https://www.fhwa.dot.gov/policyinformation/statistics.cfm)
- [Automobile Profile - Bureau of Transportation Statistics](https://www.bts.gov/content/automobile-profile)
Data description:
- **Units**: Number of Units
- **Frequency**: Annually
- **Notes**: Total number of registered private, commercial (incl. taxicabs), and public automobiles in the US every year (excluding buses, trucks, and motorcycles)
```{python}
reg_cars = pd.read_csv('data/us_registered_vehicles.csv', index_col='Year')
reg_cars.tail()
```
```{python}
reg_cars.plot(y='Registered Automobiles', figsize=(9, 4), color='k', linewidth=0.75);
```
Vehicle sales data (for macroeconomic covariate):
- [Light vehicle and total vehicle sales - Bureau of Economic Analysis/Wards Intelligence](https://www.bea.gov/data/consumer-spending/main)
- [Retail New Passenger Car Sales](https://www.bts.gov/content/retaila-new-passenger-car-sales-thousands-units)
- [Monthly Transportation Statistics](https://data.bts.gov/Research-and-Statistics/Monthly-Transportation-Statistics/crem-w557/about_data)
Data description:
- Autos--all passenger cars, including station wagons.
- Light trucks--trucks up to 14,000 pounds gross vehicle weight, including minivans and
- Sport utility vehicles. Prior to the 2003 Benchmark Revision light trucks were up to 10,000 pounds.
- Heavy trucks--trucks more than 14,000 pounds gross vehicle weight.
- **Columns**:
- Autos -- not seasonally adjusted (Thousands)
- Light Trucks -- not seasonally adjusted (Thousands)
- Light Total --not seasonally adjusted (Thousands)
- Total -- not seasonally adjusted (Thousands)
- Autos (SAAR) -- seasonally adjusted at annual rates (Millions)
- Light Trucks (SAAR) -- seasonally adjusted at annual rates (Millions)
- Light Total (SAAR) -- seasonally adjusted at annual rates (Millions)
- Total (SAAR) -- seasonally adjusted at annual rates (Millions)
```{python}
new_vehicle_sales = pd.read_csv(
'data/us_new_passenger_car_sales.csv',
index_col='Date',
parse_dates=True,
dayfirst=True)
new_vehicle_sales.tail()
```
```{python}
new_vehicle_sales.plot(y='Light Total', figsize=(9, 4), color='k', linewidth=0.75);
```
```{python}
new_vehicle_sales.plot(y='Light Total (SAAR)', figsize=(9, 4), color='k', linewidth=0.75);
```