website/seminars/seminar4/coding/Class04-InstrumentalVariables.Rmd at 7176e82092dbace3714e0e15e1fc14c530573487 · LSE-MY457/website · GitHub

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
---
title: "MY457/MY557: Causal Inference for Experimental and Observational Studies"
subtitle: "Class 4: Instrumental Variables"
author: ""
date: ''
output:
  html_document: default
  pdf_document: default
header-includes:
  - \usepackage{tikz}
  - \usepackage{pgfplots}
---

```{r, include = FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r,warning=F,message=F}
library(dplyr)
library(ggplot2)
library(AER)
```

# In-class exercise: Implementing intrumental variables estimation

In this exercise we try different ways of calculating the basic instrumental variables estimator. This is mainly for demonstration purposes. In practice, it is best to use the `ivreg` function, because it also implements the standard errors correctly. At first, let's start with creating a dataset.

```{r}
# SIMULATE DATA

# PARAMETERS
N <- 10000
U <- rnorm(N, mean = 5, sd = 3)
b0 <- 2
b1 <- 1.5

# POTENTIAL OUTCOMES
y0 <- b0 + b1 * U + rnorm(N)
y1 <- y0 + mean(y0) + rnorm(N)
y1[which(y0 < median(y0))] <- y1[which(y0 < median(y0))] / 2

#the mean of Y0 is the treatment effect

# CREATE DATAFRAME
df <- cbind(y0, y1, U) %>% as_tibble()

# GENERATE TYPES OF COMPLIANCE
type <- rep(NA, 10000)
type[which(y1 > median(y1))] <-
  sample(c(rep('Complier', 3500), rep('Always Taker', 1000), rep('Never Taker', 500)))
type[which(y1 < median(y1))] <-
  sample(c(rep('Complier', 1500), rep('Always Taker', 1500), rep('Never Taker', 2000)))
df$type <- type

#only three categories because defiers because of the monotonicity assumption

# CREATE INSTRUMENT
df$z <- sample(c(rep(0, 5000), rep(1, 5000)))

# CREATE TREATMENT ASSIGNMENT
df <- df %>%
  mutate(d = case_when(type == 'Always Taker' ~ 1,
                       type == 'Never Taker' ~ 0,
                       (type == 'Complier' & z == 0) ~ 0,
                       (type == 'Complier' & z == 1) ~ 1))

# REAL OUTCOMES
df <- df %>% mutate(y = case_when(d == 0 ~ y0, d == 1 ~ y1))
```


Before we estimate the treatment effect via the instrumental variable Z, let's use the potential outcomes to calculate the average treatment effects.
```{r}
# TRUE ATE
true_ate <- t.test(df$y1, df$y0, paired = TRUE)
true_ate
```

In contrast, the naive approach would be if we just regress the observed outcome on the treatment assignment indicator.
```{r}
# NAIVE "ATE" - naive comparison
naive_ate <- lm(y ~ d, data = df)

true_ate
summary(naive_ate)
```


Now, because we do not observe both potential outcomes, we need a different strategy. In the following, we will the instrumental variable approach to get a local average treatment effect (LATE). There are basically three different estimates we need to understand: (i) the effect of Z on Y (the intention-to-treat, ITT), (ii)  the effect of Z on D (estimated proportion of compliers), (iii) the Wald estimate.
```{r}
# 1. Effect of Z on Y
mean(df$y[df$z==1])-mean(df$y[df$z==0])
y.on.z <- lm(y ~ z, data = df)
summary(y.on.z)
itt_est <- coef(y.on.z)[2] # the intention to treat - they get encouraged but they are not forced to take the D

###

# 2. Effect of Z on D
mean(df$d[df$z==1])-mean(df$d[df$z==0])
d.on.z <- lm(d ~ z, data = df)
summary(d.on.z)
prop_compliers <- coef(d.on.z)[2] # proportion of compliers - important for [[Voting Paper]]

###

# 3. WALD ESTIMATE
itt_est/prop_compliers
```

However, you can also get the same number from the two-stage least squares (2SLS), (i) with the base-R lm function, or (ii) with the ivreg function.

```{r}
# 2SLS using lm
df$d_hat <- predict(d.on.z)
iv_2sls_2 <- lm(y ~ d_hat, data = df)
summary(iv_2sls_2)

# 2SLS using ivreg
iv_2sls <- ivreg(y ~ d | z, data = df)
summary(iv_2sls)
```

Looking at the data structure above, it is important to note that in a real-world application, one would only observe $Z$, $D$, and $Y$. We are only able to observe $Y_{0}$, $Y_{1}$, $U$, and $Type$ because this is a simulation exercise and we generated them ourselves.

Further, we can briefly examine the distribution of the true individual-level treatment effects according to type and treatment status.
```{r}
df <- df %>% mutate(y_diff = y1 - y0)
p1 <- ggplot(data = df, aes(x = y_diff, fill = type)) +
  geom_density(alpha = 0.2) + ggtitle("All Units")
p1
p2 <- ggplot(data = df[df$d == 1, ], aes(x = y_diff, fill = type)) +
  geom_density(alpha = 0.2) + ggtitle("Treated Units")
p2
p3 <- ggplot(data = df[df$d == 0, ], aes(x = y_diff, fill = type)) +
  geom_density(alpha = 0.2) + ggtitle("Control Units")
p3
```

The plots indicate that being a complier is associated with taking on relatively high values of the $Y_{1}$ potential outcome and therefore relatively high values of the true treatment effect. The bimodal distributions indicate that substantial proportions of each type take on both relatively low and relatively high values of the treatment effect, but clearly one side or the other dominates.