In this project, I moved beyond traditional prediction models to estimate the causal effect of two key marketing actions: the number of times a client is contacted and the length of the last phone call. Using the Bank Marketing Dataset, which includes detailed records of phone calls made by a European bank, I focused on understanding how these two factors affect whether a client subscribes to a term deposit. The goal was not just to predict outcomes, but to answer questions like: What happens if a client is contacted more times? Does a longer call increase the chance of subscription?
The first question I studied was: Does contacting a client more often increase or decrease the chance of subscription? I used logistic regression to estimate this effect while controlling for other factors such as job, education, and month. The results showed that more contacts actually reduce the chance of subscription. To better understand this, I grouped the number of contacts into bins (1, 2, 3, 4–5, and so on) and found that the subscription rate dropped after just a few contacts. This suggests that contacting clients too many times may reduce their interest or annoy them.
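The binning step described above can be sketched roughly as follows. This is a minimal illustration on made-up toy records, not the bank data; the column names `campaign` (number of contacts) and `y` (1 if subscribed), and the bin edges beyond 4–5, are assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical toy records (NOT the real data):
# `campaign` = number of contacts, `y` = 1 if the client subscribed.
df = pd.DataFrame({
    "campaign": [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 4, 5, 6, 8, 12],
    "y":        [1, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
})

# Bin edges are right-inclusive: (0, 1], (1, 2], (2, 3], (3, 5], (5, 10], (10, inf).
# The "6-10" and "11+" bins are assumptions for illustration.
bins = [0, 1, 2, 3, 5, 10, float("inf")]
labels = ["1", "2", "3", "4-5", "6-10", "11+"]
df["contact_bin"] = pd.cut(df["campaign"], bins=bins, labels=labels)

# Number of clients and observed subscription rate in each bin.
rate_by_bin = (
    df.groupby("contact_bin", observed=False)["y"]
      .agg(clients="size", subscription_rate="mean")
)
print(rate_by_bin)
```

Reporting the client count next to each rate helps, because the high-contact bins usually contain few clients and their rates are noisy.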
I also checked whether this effect depends on how the client was contacted (by cellphone or by telephone). I added an interaction term between the number of contacts and the contact method. The results showed that while telephone calls are generally less effective, the negative impact of repeated contact is smaller when calls are made by telephone. This means that the contact method changes how clients react to being contacted multiple times.
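The interaction model described above can be written out as follows. The notation is an illustrative assumption rather than the exact specification used in the analysis: $c_i$ is the number of contacts, $m_i$ equals 1 for a telephone contact and 0 for a cellphone contact, and $x_i$ collects the control variables (job, education, month).

```latex
\[
\operatorname{logit} P(y_i = 1)
  = \beta_0 + \beta_1 c_i + \beta_2 m_i + \beta_3 \, c_i m_i + \gamma^\top x_i
\]
\[
\frac{\partial \operatorname{logit} P(y_i = 1)}{\partial c_i} =
\begin{cases}
\beta_1 & \text{cellphone } (m_i = 0) \\
\beta_1 + \beta_3 & \text{telephone } (m_i = 1)
\end{cases}
\]
```

In this notation, the findings roughly correspond to $\beta_1 < 0$ (repeated contact lowers the log-odds), $\beta_2 < 0$ (telephone is less effective overall), and $\beta_3 > 0$ (the per-contact penalty is smaller by telephone).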
Next, I focused on call duration and asked: Does spending more time on a call increase the chance of subscription? I treated duration as the treatment variable and again used logistic regression with the same control variables. The result was clear: each extra minute of a call multiplies the odds of subscription by about 1.29, a 29% increase in the odds (not in the probability), and this effect was highly statistically significant. This shows that longer conversations are much more effective in convincing clients.
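To make the 29% figure concrete, the sketch below converts the reported odds ratio into the implied logit coefficient and into a change in probability. Only the 1.29 odds ratio comes from the result above; the 10% baseline probability and the helper `shifted_probability` are hypothetical and used purely for illustration.

```python
import math

# Reported result: about +29% odds of subscription per extra minute of call.
odds_ratio_per_minute = 1.29
# Implied logistic-regression coefficient on duration (in minutes).
beta_per_minute = math.log(odds_ratio_per_minute)

def shifted_probability(p_baseline, extra_minutes, beta=beta_per_minute):
    """Subscription probability after `extra_minutes` more call time,
    holding all other covariates fixed (hypothetical helper)."""
    baseline_odds = p_baseline / (1 - p_baseline)
    new_odds = baseline_odds * math.exp(beta * extra_minutes)
    return new_odds / (1 + new_odds)

# Hypothetical client with a 10% baseline probability.
for extra in (1, 5):
    print(extra, round(shifted_probability(0.10, extra), 3))
```

Because the effect is multiplicative on the odds, the change in probability depends on the baseline: the same extra minute moves a client at 10% by a different amount than a client at 50%.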
To get more flexible estimates, I used machine learning models. I applied LightGBM as an S-learner, where the treatment enters the model as an ordinary feature, and also used Double Machine Learning (DML) to reduce bias when many control variables are involved. I computed marginal treatment effects (MTE) for each person, then used cumulative gain curves to compare how well the different models ranked individuals by predicted benefit. For the number of contacts, logistic regression gave the best results. For call duration, LightGBM performed better, identifying clients more likely to benefit from longer calls.
In summary, this project helped me understand how marketing decisions like number of contacts and call duration affect customer behavior. I used both traditional and modern methods to estimate these effects. The results show that calling too many times can hurt performance, while spending more time during each call is very effective. These insights can help banks improve their marketing strategies and use their time more efficiently.