---
title: "01_CommonMistakes"
output: pdf_document
---
# (PART) Part I {-}
# Common Mistakes in SPC {#ch1_commonMistakes}
People are really, really good at finding patterns that aren't real, especially in noisy data.
Every metric has natural variation---*noise*---included as an inherent part of that process. True signals only emerge when you have properly characterized that variation. Statistical process control (SPC) charts---run charts and control charts---help characterize and identify non-random patterns that suggest the process has changed.
## Mistake #1
### Not using all the information that you have available to you {- #ch1_mistake1}
In essence, SPC tools help you evaluate the stability and predictability of a process or its outcomes. Statistical theory provides the basis to evaluate metric stability, and more confidently detect changes in the underlying process amongst the noise of natural variation.
Since it is impossible to account for every single variable that might influence a metric, we can use probability and statistics to evaluate how that metric naturally fluctuates over time (aka common cause variation), and construct guidelines around that fluctuation to help indicate when something in that process has changed (special cause variation).
Understanding natural, random variation in time series or sequential data is essential to quality assurance and to process and outcome improvement efforts. It's a rookie mistake to use SPC tools while focusing solely on the values themselves or their central tendency---instead, evaluate [*all* of the information](#guidelines) in a run chart or control chart to understand what it's telling you.
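As a concrete (if simplified) illustration, here is a minimal sketch of one such guideline: flagging an unusually long run of consecutive points on the same side of the median. The threshold `round(log2(n)) + 3` follows Anhøj's approximation and is an assumption here; see the linked guidelines for the rules this book actually uses.

```{r runs_sketch}
# A minimal sketch of one run-chart guideline: flag an unusually long run of
# consecutive points on the same side of the median. The threshold
# round(log2(n)) + 3 follows Anhøj's approximation (an assumption here).
longest_run <- function(y) {
  signs <- sign(y - median(y))
  signs <- signs[signs != 0]  # drop points that fall exactly on the median
  max(rle(signs)$lengths)     # length of the longest one-sided run
}

set.seed(1)
y <- rnorm(24)   # a stable process: no true signal present
longest_run(y)   # compare against round(log2(24)) + 3 = 8
```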
<br>
\vspace{12pt}
<br>
\vspace{12pt}
## Mistake #2
### Forgetting that SPC charts cannot make decisions for you, they can only help *you* to make a decision {- #ch1_mistake2}
The following example illustrates why it is important to think about your data rather than blindly following the guidelines.
Figure 1 shows a process created using random numbers based on a pre-defined normal distribution. The black points and line are the data itself, the overall mean (*y* = 18) is represented by the grey line, and the overall distribution is shown in a histogram to the right of the run chart.
<br>
\vspace{12pt}
```{r ggmarg, fig.height=3, fig.align='center', echo=FALSE}
library(ggplot2)
library(ggExtra)
# Simulate a stable process: 120 points drawn from a normal distribution
# centered at 18 with standard deviation 1
set.seed(250)
df <- data.frame(x = 1:120, y = 18 + rnorm(120))
nat_var_run_plot <- ggplot2::ggplot(df, aes(x, y)) +
  geom_point(size = 1) +
  geom_hline(aes(yintercept = 18), color = "gray", size = 1) +
  geom_line() +
  ylim(14.75, 21.25) +
  labs(x = "Subgroup", y = "Value", title = "A stable process created from random numbers") +
  theme_bw()
# Add a marginal histogram of the y values to the right of the run chart
ggExtra::ggMarginal(p = nat_var_run_plot, margins = "y", type = "histogram", binwidth = 0.5)
```
*Figure 1:* The axis labels are traditional SPC labels. The *value* (i.e., metric) is on the *y*-axis and the units of observation, traditionally called *subgroups*, are on the *x*-axis.
<br>
\vspace{12pt}
The term *subgroup* originated in settings where each observation point involved sampling from a mechanical process, e.g., measuring 5 widgets from a production run of 500. For simplicity's sake, many SPC examples keep this label regardless of what the *x*-axis actually measures---we follow this convention where appropriate.
<br>
\vspace{12pt}
Control limits are another feature of SPC control charts. The symbol $\sigma$ denotes the expected process standard deviation. Figure 2 adds control limits at $\pm$ 3$\sigma$ (a sketch of how such limits can be computed from the data follows the figure).
<br>
\vspace{12pt}
```{r ggmarg_cc, fig.height=3, fig.align='center', echo=FALSE}
# Rebuild the run chart with a center line, +/- 3 sigma control limits, and
# intermediate standard deviation indicators (the limit values are hardcoded
# from the simulated distribution above)
nat_var_cc_plot <- ggplot(df, aes(x, y)) +
  geom_segment(aes(x = 1, xend = 120, y = 18, yend = 18), color = "gray", size = 2) +
  geom_segment(aes(x = 1, xend = 120, y = 20.96, yend = 20.96), color = "red", size = 2) +
  geom_segment(aes(x = 1, xend = 120, y = 15.1, yend = 15.1), color = "red", size = 2) +
  geom_segment(aes(x = 1, xend = 120, y = 16.04, yend = 16.04), color = "grey") +
  geom_segment(aes(x = 1, xend = 120, y = 17.02, yend = 17.02), color = "grey") +
  geom_segment(aes(x = 1, xend = 120, y = 18.98, yend = 18.98), color = "grey") +
  geom_segment(aes(x = 1, xend = 120, y = 19.96, yend = 19.96), color = "grey") +
  annotate("text", x = 1, y = 21.25, label = "Upper Control Limit (UCL)", color = "red", hjust = 0, vjust = 0) +
  annotate("text", x = 1, y = 14.75, label = "Lower Control Limit (LCL)", color = "red", hjust = 0, vjust = 1) +
  annotate("text", x = 135, y = 18.98, label = as.character(expression("+1"~sigma)),
           color = "gray30", parse = TRUE, hjust = 1) +
  annotate("text", x = 135, y = 19.96, label = as.character(expression("+2"~sigma)),
           color = "gray30", parse = TRUE, hjust = 1) +
  annotate("text", x = 135, y = 20.96, label = as.character(expression("+3"~sigma)),
           color = "gray30", parse = TRUE, hjust = 1) +
  annotate("text", x = 135, y = 17.02, label = as.character(expression("-1"~sigma)),
           color = "gray30", parse = TRUE, hjust = 1) +
  annotate("text", x = 135, y = 16.04, label = as.character(expression("-2"~sigma)),
           color = "gray30", parse = TRUE, hjust = 1) +
  annotate("text", x = 135, y = 15.1, label = as.character(expression("-3"~sigma)),
           color = "gray30", parse = TRUE, hjust = 1) +
  annotate("text", x = 135, y = 18.05, label = "bar(x)", color = "gray30", parse = TRUE, hjust = 1) +
  geom_point(size = 1) +
  geom_line() +
  ylim(14.25, 21.75) +
  xlim(0, 135) +
  labs(x = "Subgroup", y = "Value", title = "A stable process created from random numbers",
       subtitle = "with control limits and standard deviation indicators") +
  theme_bw()
nat_var_cc_plot
```
<br>
\vspace{12pt}
*Figure 2*: The same plot as in Figure 1, with standard deviation indicators and control limits added.
<br>
\vspace{12pt}
<br>
\vspace{12pt}
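As a rough sketch (not the chart's actual construction---the limit values in the plotting code above are hardcoded), limits like those in Figure 2 can be derived from the data. Note that real individuals (I) charts usually estimate $\sigma$ from the average moving range rather than `sd()`; `sd()` is used here only for brevity.

```{r limits_sketch}
# Estimate the center line and +/- 3 sigma limits from the simulated data.
# sd() is a simplification; I charts typically use the average moving range.
x_bar     <- mean(df$y)
sigma_hat <- sd(df$y)
c(LCL = x_bar - 3 * sigma_hat, CL = x_bar, UCL = x_bar + 3 * sigma_hat)
```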
How to use these elements of an SPC chart to evaluate the statistical process behind a metric---and to decide whether to investigate for special cause variation---is detailed in the section [Guidelines for interpreting SPC charts](#guidelines) in the chapter Run Charts vs. Control Charts.
<br>
\vspace{12pt}
Applying the guidelines to the chart in Figure 2 reveals that some special cause variation has occurred in this data. Since this dataset was generated using random numbers from a known, stable, normal distribution, these are *False Positives*: the control chart suggests something has changed when in reality it hasn't. There is always a chance for *False Negatives*, as well, where something actually happened but the control chart didn't alert you to special cause variation.
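To see how easily this happens, the sketch below simulates 1,000 stable 120-point series and counts how often at least one point falls outside the $\pm$ 3$\sigma$ limits purely by chance. (This checks only the beyond-the-limits rule; run-based rules would add further false positives.)

```{r false_alarm_sim}
# Simulate 1,000 stable 120-point processes (mean 0, sd 1) and count how
# often at least one point lands outside the +/- 3 sigma limits by chance
set.seed(42)
alarms <- replicate(1000, any(abs(rnorm(120)) > 3))
mean(alarms)  # about 0.28, i.e., roughly 1 - (1 - 0.0027)^120
```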
<br>
\vspace{12pt}
Consider the matrix of possible outcomes for any given point in an SPC chart:
| | Reality: Something Happened | Reality: Nothing Happened |
| -------------- |:---------------:|:---------------:|
| **SPC: Alert** | True Positive | *False Positive* |
| **SPC: No alert** | *False Negative* | True Negative |
<br>
\vspace{12pt}
Using $\pm$ 3$\sigma$ control limits is standard, intended to balance the trade-offs between *False Negatives* and *False Positives*. If you prefer to err on the side of caution for a certain metric (such as in monitoring hospital acquired infections) and are willing to accept more *False Positives* to reduce *False Negatives*, you could use $\pm$ 2$\sigma$ control limits.
<br>
\vspace{12pt}
For other metrics where you prefer to be completely certain things are out of whack before taking action---and are willing to accept more *False Negatives* to reduce *False Positives*---you could use $\pm$ 4$\sigma$ control limits.
<br>
\vspace{12pt}
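Under a normal distribution, the per-point false-alarm probability for each choice of limit width is easy to compute:

```{r sigma_tradeoff}
# Chance that a single stable, normally distributed point falls outside
# +/- k sigma limits, for k = 2, 3, 4
k <- 2:4
setNames(2 * pnorm(-k), paste0(k, "_sigma"))
# ~0.0455 at 2 sigma, ~0.0027 at 3 sigma, ~0.000063 at 4 sigma
```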
**When in doubt, use $\pm$ 3$\sigma$ control limits.**
<br>
\vspace{12pt}
It's important to remember that SPC charts are, at heart, decision tools: they can help you tune the rate of false signals to your use case, but *they can never entirely eliminate false signals*. Thus, it's often useful to explore these trade-offs explicitly with stakeholders when deciding where and why to set control limits.
<br>
\vspace{12pt}
<br>
\vspace{12pt}
## Mistake #3
### Skipping to the end of the process {- #ch1_mistake3}
Run charts and control charts are the core tools of SPC analysis. Other basic statistical graphs---particularly line charts and histograms---are equally important to SPC work.
Line charts help you monitor any sort of metric, process, or time series data. Run charts and control charts are meant to help you identify departures from a **stable** process. Each uses a set of guidelines to help you make decisions on whether a process has changed or not.
In many cases, a run chart is all you need. In *all* cases, you should start with a line chart and histogram. If---and only if---the process is stable and you need to characterize the limits of natural variation, you can move on to using a control chart.
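For instance, the data from Figure 1 can be inspected first with nothing more than a line chart and a histogram (base R here for brevity):

```{r start_simple, fig.height=3, fig.align='center'}
# Before reaching for SPC tools, look at the raw series and its distribution
par(mfrow = c(1, 2))
plot(df$x, df$y, type = "b", cex = 0.5, xlab = "Subgroup", ylab = "Value",
     main = "Line chart")
hist(df$y, xlab = "Value", main = "Histogram")
par(mfrow = c(1, 1))
```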
In addition, *never* rely on a table or year-to-date (YTD) comparisons to evaluate process performance. These approaches obscure the foundational concept of process control: that natural, common cause variation is an essential part of the process. Tables or YTD values can supplement run charts or control charts, but should never be used without them.
Above all, remember that the decisions you make in constructing SPC charts and associated data points (such as YTD figures) *will* impact the interpretation of the results. Bad charts can make for bad decisions.
<br>
\vspace{12pt}
<br>
\vspace{12pt}
## Mistake #4
### Not understanding what it means to have a "stable" process {- #ch1_mistake4}
It's common for stakeholders to want key performance indicators (KPIs) displayed using a control chart. However, control charts are only applicable when the business goal is to keep that KPI stable. SPC tools are built on the fundamental assumption of a *stable* process, and as an analyst you need to be very clear about what stability means in the context of both the business goals and the statistical process of the metric itself. Because it takes time and resources to track KPIs (collecting the data, developing the dashboards, etc.), develop them carefully by first ensuring that SPC tools are, in fact, an appropriate means to monitor that KPI. Let's look at two examples to get a more concrete understanding:
<br>
\vspace{12pt}
**Example 1:** Some outpatient specialties are facing increasing numbers of referrals, but are not getting more staff to handle them. With increasing patient demand and constrained hospital capacity, we would not expect the wait times for appointments to be constant over time. So, a KPI such as "percent of new patients seen within 2 weeks" might be a goal we care about, but since we expect that value to decline, it is not stable and a control chart is not appropriate.
However, if we define the KPI as something like "percent of new patients seen within 2 weeks *relative* to what we would expect given increased demand and no expansion", we have now placed it into a stable context. Instead of asking if the metric itself is declining, we're asking whether the system is responding as it has in the past. By defining the KPI in terms of something we would want to remain stable, we can now use a control chart to track its performance.
<br>
\vspace{12pt}
**Example 2:** Complaints about phone wait times at a call center have led to an increase in full-time employees to support call demand. You would expect call center performance---perhaps measured as "percent of calls answered in under 2 minutes"---to improve, so a control chart is not appropriate. What, then, would a "stable" KPI look like?
* Maybe it could be the performance of the various teams within the call center becoming more similar (e.g., decreased variability across teams).
* Maybe it could be the frequency of catastrophic events (e.g., people waiting longer than *X* minutes, where *X* is very large) staying below some threshold---similar to a "downtime" KPI used to track the stability of computer systems.
* Maybe it could be the percent change in the previously defined KPI tracked against the percent change in full-time employees (though we know this relationship is non-linear).
<br>
\vspace{12pt}
In both examples, it would not be appropriate to use a control chart for the previously-defined performance metrics, because we do not expect them (or necessarily want them) to be stable. However, by focusing on the process itself, we can define alternate KPIs that conform to the assumptions of a control chart.
<br>
\vspace{12pt}
*Stability* means that the system is responding as we would expect to the changing environment and that the system is robust to adverse surprises from the environment. **KPIs meant to evaluate stable processes should be specifically designed to track whether the system is stable and robust**, rather than focusing strictly on the outcome as defined by existing or previous KPIs.
Make sure that metrics meant to measure stability are properly designed from the outset before you spend large amounts of resources to develop and track them.