Commit e489b6d

add covariates lecture
1 parent fb7b651 commit e489b6d

15 files changed

+799
-337
lines changed

Lectures/Week 4/lec_07_covariates.Rmd

Lines changed: 144 additions & 52 deletions
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,12 @@
11
---
22
title: "Covariates in Time Series Models"
33
author: "Eli Holmes"
4-
date: "18 Apr 2023"
4+
date: "22 Apr 2025"
55
output:
66
ioslides_presentation:
7+
mathjax: "https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-chtml-full.js"
8+
widescreen: true
79
css: lecture_slides.css
8-
smaller: true
910
beamer_presentation: default
1011
subtitle: FISH 550 – Applied Time Series Analysis
1112
---
@@ -23,23 +24,38 @@ knitr::opts_chunk$set(fig.height=5, fig.align="center")
2324

2425
* Why include covariates?
2526

26-
* Multivariate linear regression on time series data
27+
* Multivariate regression on time series data
28+
29+
- Regular regression
30+
- Regression with auto-correlated errors
2731

2832
* Covariates in MARSS models
2933

3034
* Seasonality in MARSS models
3135

3236
* Missing covariates
3337

38+
## Reading
39+
40+
[ATSA Lab Book, MARSS + Covariates](https://atsa-es.github.io/atsa-labs/chap-msscov.html), [Seasonal effects](https://atsa-es.github.io/atsa-labs/sec-msscov-season.html#sec-msscov-season-fourier), [Modeling changing seasonality](https://atsa-es.github.io/atsa-labs/chap-seasonal-dlm.html)
41+
42+
[Fishery Catch Forecasting](https://fish-forecast.github.io/Fish-Forecast-Bookdown/index.html): Replicates and discusses the work in Stergiou and Christou (1996) Modelling and forecasting annual fisheries catches: comparison of regression, univariate and multivariate time series methods. Fisheries Research 25: 105-136.
43+
44+
<center>
45+
![](images/fish-forecast.png){width=25%}
46+
</center>
47+
48+
49+
3450

3551
## Why include covariates in a model?
3652

37-
* You want to forecast something using covariates
38-
* We are often interested in knowing the cause of variation
39-
* Covariates can explain the process that generated the patterns
40-
* Covariates can help deal with problematic observation errors
41-
* You are using covariates to model a changing system
42-
* You want to get rid of trends or cycles
53+
* You want to forecast something using covariates...but you don't actually care about the covariates.
54+
* You want to forecast something using covariates...and you do care about the covariates.
55+
* You are trying to understand the **cause of variation**. You are interested in the covariate effects because they can explain the process that generated the patterns.
56+
* You are using covariates to model a changing system.
57+
* You are using covariates to help deal with problematic observation errors.
58+
* You want to get rid of trends or cycles.
4359

4460
## Lake WA plankton and covariates
4561

@@ -50,8 +66,33 @@ knitr::include_graphics(here::here("Lectures", "Week 4", "images", "msscov-plank
5066

5167
## Covariates in time series models
5268

53-
* Multivariate linear regression for time series data
54-
* Linear regression with ARMA errors
69+
### Multivariate regression for time series data
70+
71+
$$y_t = f(z_{1,t}, z_{2,t}, z_{3,t}, \dots) + \epsilon_t$$
72+
73+
* Classic models: linear models, GAMs, GLMs, etc. The errors $\epsilon_t$ are uncorrelated.
74+
* Regression with auto-correlated errors
75+
$$y_t = f(z_{1,t}, z_{2,t}, z_{3,t}, \dots) + \epsilon_t$$
76+
where $\epsilon$ is auto-correlated.
77+
- Linear regression with ARMA errors
78+
79+
## Covariates in time series models
80+
81+
### Observation errors driven by covariates
82+
83+
$$
84+
\begin{gathered}
85+
y_t = x_t + (d_t + v_t) \\
86+
v_t \sim N(0,\sigma_r)
87+
\end{gathered}
88+
$$
89+
90+
## Covariates in time series models
91+
92+
### Process errors driven by covariates
93+
94+
$$x_{t} = x_{t-1} + (c_t + e_t)$$
95+
5596
* ARMAX - process errors driven by covariates
5697
* MARSS models with covariates = process and observation errors affected by covariates
5798
* aka Vector Autoregressive Models with covariates and observation error
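
A minimal sketch of the difference between regular regression and regression with auto-correlated errors. The data are simulated and the parameter values (intercept 1, slope 2, AR coefficient 0.7) are assumptions chosen for illustration; only base R `lm()` and `arima()` with `xreg` are used.

```r
set.seed(1)
n <- 200
d <- rnorm(n)                                            # simulated covariate
e <- arima.sim(model = list(ar = 0.7), n = n, sd = 0.5)  # AR(1) errors
y <- 1 + 2 * d + as.numeric(e)                           # assumed true model

fit_lm <- lm(y ~ d)                               # treats errors as uncorrelated
fit_ar <- arima(y, order = c(1, 0, 0), xreg = d)  # regression with AR(1) errors

# Lag-1 auto-correlation of the residuals under each model
acf_lm <- acf(residuals(fit_lm), plot = FALSE)$acf[2]
acf_ar <- acf(residuals(fit_ar), plot = FALSE)$acf[2]
c(lm = acf_lm, arima = acf_ar)

coef(fit_ar)  # "d" is the estimated covariate effect
```

The `lm()` residuals stay strongly auto-correlated, while the `arima()` residuals should be close to white noise.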
@@ -133,8 +174,7 @@ autoplot(uschange[,1:2], facets=TRUE) +
133174

134175
```{r out.width="70%"}
135176
y <- uschange[,"Consumption"]; d <- uschange[,"Income"]
136-
fit <- lm(y~d)
137-
checkresiduals(fit)
177+
fit <- lm(y~d); checkresiduals(fit)
138178
```
139179

140180
## Let `auto.arima()` find best model
@@ -146,7 +186,9 @@ checkresiduals(fit)
146186

147187
## Collinearity
148188

149-
This a big issue. If you are thinking about stepwise variable selection, do a literature search on the issue. Read the chapter in [Holmes 2018: Chap 6](https://fish-forecast.github.io/Fish-Forecast-Bookdown/6-1-multivariate-linear-regression.html) on catch forecasting models using multivariate regression for a discussion of
189+
This is a big issue. Always do a simple `pairs()` plot (or similar) on your explanatory variables.
190+
191+
Read the chapter in [Holmes 2018: Chap 6](https://fish-forecast.github.io/Fish-Forecast-Bookdown/6-covariates.html) on catch forecasting models for a discussion of
150192

151193
* Stepwise variable regression in R
152194
* Cross-validation for regression models
@@ -156,6 +198,39 @@ This a big issue. If you are thinking about stepwise variable selection, do a l
156198
* Elastic Net
157199
* Diagnostics
158200

201+
## Simple diagnostics `pairs()`
202+
203+
```
204+
X = matrix (or data frame) with vars in columns
205+
pairs(X)
206+
```
207+
<center>
208+
![](images/pairs-fishery.png){width=50%}
209+
</center>
210+
211+
212+
## Simple diagnostics `pairs()`
213+
214+
```
215+
X = matrix (or data frame) with vars in columns
216+
pairs(X)
217+
```
218+
<center>
219+
![](images/pairs-env.png){width=50%}
220+
</center>
221+
222+
223+
## Simple diagnostics `corrplot()`
224+
225+
```
226+
library(corrplot)
227+
corrplot::corrplot(cor(X))
228+
```
229+
<center>
230+
![](images/corrplot.png){width=50%}
231+
</center>
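
A numerical companion to the plots, as a rough sketch: list the covariate pairs whose correlation exceeds a cutoff. The variables are simulated and the 0.7 cutoff is an arbitrary assumption, not a recommended rule.

```r
set.seed(42)
n <- 100
temp <- rnorm(n)
sst  <- temp + rnorm(n, sd = 0.2)  # nearly a copy of temp, so collinear
wind <- rnorm(n)                   # unrelated covariate
X <- data.frame(temp, sst, wind)

R <- cor(X)
# Look only at the upper triangle so each pair is reported once
high <- which(abs(R) > 0.7 & upper.tri(R), arr.ind = TRUE)
flagged <- data.frame(
  var1 = rownames(R)[high[, "row"]],
  var2 = colnames(R)[high[, "col"]],
  r    = R[high]
)
flagged
```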
232+
233+
159234
## ARMAX
160235

161236
ARMAX models are different. In this case, the covariates affect the amount the auto-regressive process changes each time step.
@@ -173,7 +248,7 @@ $$x_t = b x_{t-1}+ \underbrace{\boxed{\mathbf{C} \mathbf{c}_t}}_{\text{drift "u"
173248

174249
## Covariates in MARSS models
175250

176-
This is a state-space model that allows you to have the covariate affects process error and affect observation errors.
251+
This is a state-space model that lets covariates affect both the process error and the observation error.
177252

178253
$$\mathbf{x}_t = \mathbf{B} \mathbf{x}_{t-1} + \mathbf{u} +\mathbf{C} \mathbf{c}_t + \mathbf{w}_t$$
179254
$$\mathbf{y}_t = \mathbf{Z} \mathbf{x}_{t} + \mathbf{a} + \mathbf{D} \mathbf{d}_t + \mathbf{v}_t$$
@@ -189,12 +264,14 @@ Now we can model how covariates affect the hidden process.
189264

190265
## Example - univariate state-space models
191266

192-
$$x_t = x_{t-1} + u + \boxed{\mathbf{C} \mathbf{c}_t} + w_t$$
267+
$$x_t = x_{t-1} + \boxed{u + \mathbf{C} \mathbf{c}_t} + w_t$$
193268
$$y_t = x_t + v_t$$
194269

195-
Random walk with drift. How does covariate affect the drift term?
270+
Random walk with drift. How do the covariates affect the drift term?
271+
272+
Example. You have tag data on movement of animals in the ocean. How do water temperature and current affect the speed of the movement?
196273

197-
Example. You have tag data on movement of animals in the ocean. How does water temperature affect the speed (jump length) of the movement.
274+
$$x_t = x_{t-1} + u + \begin{bmatrix}C_a & C_b\end{bmatrix}\begin{bmatrix}temp \\ curr\end{bmatrix}_t + w_t$$
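
To make the equation concrete, here is a small simulation sketch. All values ($u = 0.1$, $C_a = 0.5$, $C_b = -0.3$, the error standard deviations) are assumptions for illustration. The regression on differenced data at the end is only a rough check that the covariate effects are recoverable when observation error is small; it is not a state-space fit.

```r
set.seed(123)
TT <- 100
temp <- rnorm(TT)
curr <- rnorm(TT)
cc <- rbind(temp, curr)            # 2 x TT covariate matrix, column t is c_t
C  <- matrix(c(0.5, -0.3), 1, 2)   # [C_a C_b], assumed values
u  <- 0.1                          # assumed baseline drift

drift <- u + as.numeric(C %*% cc)  # u + C c_t for each t
w <- rnorm(TT, sd = 0.2)           # process error
x <- cumsum(drift + w)             # random walk with drift, x_0 = 0
y <- x + rnorm(TT, sd = 0.1)       # observations with small error

# Rough check: diff(y) is approximately u + C c_t + noise
fit <- lm(diff(y) ~ temp[-1] + curr[-1])
est <- unname(coef(fit))
est  # compare with 0.1, 0.5, -0.3
```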
198275

199276
## Example - univariate state-space models
200277

@@ -204,7 +281,9 @@ $$y_t = x_t + \boxed{\mathbf{D} \mathbf{d}_t + v_t}$$
204281

205282
How does the covariate affect the observation error relative to our stochastic trend?
206283

207-
Example. You are tracking population size using stream surveys. Turbidity affects your observation error.
284+
Example. You are tracking population size using stream surveys. Turbidity and temperature affect your observation error.
285+
286+
$$y_t = x_t + \begin{bmatrix}D_a & D_b\end{bmatrix}\begin{bmatrix}temp \\ turb\end{bmatrix}_t + v_t$$
208287

209288

210289
## Multivariate Example - Covariates in state process
@@ -236,8 +315,6 @@ The structure of $\mathbf{C}$ can model different effect structures
236315

237316
$$\begin{bmatrix}C & C \\ C & C\end{bmatrix}\begin{bmatrix}temp \\ TP\end{bmatrix}_t$$
238317

239-
##
240-
241318
**Effect of temperature and TP is different but the same across sites, species, whatever the row in $\mathbf{x}$ is**
242319

243320
$$\begin{bmatrix}C_a & C_b \\ C_a & C_b\end{bmatrix}\begin{bmatrix}temp \\ TP\end{bmatrix}_t$$
@@ -248,8 +325,6 @@ $$\begin{bmatrix}C_a & C_b \\ C_a & C_b\end{bmatrix}\begin{bmatrix}temp \\ TP\en
248325

249326
$$\begin{bmatrix}C_{a1} & C_{b1} \\ C_{a2} & C_{b2}\end{bmatrix}\begin{bmatrix}temp \\ TP\end{bmatrix}_t$$
250327

251-
##
252-
253328
**Effect of temperature is the same across sites but TP is not**
254329

255330
$$\begin{bmatrix}C_{a} & C_{b1} \\ C_{a} & C_{b2}\end{bmatrix}\begin{bmatrix}temp \\ TP\end{bmatrix}_t$$
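
These structures can be written as character matrices, following the `matrix(c("m", "m2", "m3"), 2, 3, byrow = TRUE)` form used for $\mathbf{C}$ later in this lecture. The sketch assumes that repeated names denote one shared parameter; the object names (`C_same`, `C_shared`, and so on) are made up for illustration.

```r
# Rows = sites (or species); columns = temp, TP
C_same   <- matrix("C", 2, 2)                      # one effect for everything
C_shared <- matrix(c("Ca", "Cb",
                     "Ca", "Cb"), 2, 2, byrow = TRUE)    # differ by covariate only
C_unique <- matrix(c("Ca1", "Cb1",
                     "Ca2", "Cb2"), 2, 2, byrow = TRUE)  # differ by covariate and site
C_mixed  <- matrix(c("Ca", "Cb1",
                     "Ca", "Cb2"), 2, 2, byrow = TRUE)   # temp shared, TP by site

# Number of distinct parameters implied by each structure
n_par <- sapply(
  list(same = C_same, shared = C_shared, unique = C_unique, mixed = C_mixed),
  function(m) length(unique(as.vector(m)))
)
n_par
```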
@@ -295,25 +370,7 @@ $$\begin{bmatrix}D_a & D_b \\ D_a & D_b \\ D_c & 0\end{bmatrix}\begin{bmatrix}te
295370
* We want to remove seasonality or cycles
296371

297372

298-
## Why include covariates in a observation?
299-
300-
**Auto-correlated observation errors**
301-
302-
* Model your $v_t$ as a AR-1 process. hard numerically with a large multivariate state-space model
303-
304-
* If know what is causing the auto-correlation, include that as a covariate. Easier.
305-
306-
**Correlated observation errors across sites or species (y rows)**
307-
308-
* Use a $\mathbf{R}$ matrix with off-diagonal terms. really hard numerically
309-
310-
* If you know or suspect what is causing the correlation, include that as a covariate. Easier.
311-
312-
**We want to remove seasonality or cycles**
313-
314-
"hard numerically" = you need a lot of data
315-
316-
## Let's work through an example
373+
## A worked example
317374

318375
`lec_07_covariates.R` in the Fish550 repo
319376

@@ -322,7 +379,7 @@ Follows [Chapter 8](https://atsa-es.github.io/atsa-labs/chap-msscov.html) in the
322379

323380
## Seasonality
324381

325-
```{r chinookplot, echo=FALSE}
382+
```{r chinookplot, echo=FALSE, message=FALSE, warning=FALSE}
326383
library(atsalibrary)
327384
data(chinook, package="atsalibrary")
328385
par(mfrow=c(1,2))
@@ -406,7 +463,7 @@ $$
406463
##
407464

408465
```{r}
409-
TT <- nrow(chinook.month)/2
466+
TT <- 50 # the number of time points in your data
410467
covariate <- matrix(0, 12, TT)
411468
monrow <- match(chinook.month$Month, month.abb)[1:TT]
412469
covariate[cbind(monrow,1:TT)] <- 1
@@ -489,8 +546,9 @@ $$
489546
##
490547

491548
```{r}
492-
TT <- nrow(chinook.month)/2
493-
monrow <- match(chinook.month$Month, month.abb)[1:TT]
549+
TT <- 50 # whatever the number of time points in your data
550+
# a number 1 to 12 to match the month of time t
551+
monrow <- rep(1:12, TT)[1:TT]
494552
covariate <- rbind(monrow, monrow^2, monrow^3)
495553
rownames(covariate) <- c("m", "m2", "m3")
496554
covariate[,1:14]
@@ -513,6 +571,20 @@ C <- matrix(c("m", "m2", "m3"), 2, 3, byrow = TRUE)
513571
C
514572
```
515573

574+
## Orthogonal polynomials
575+
576+
577+
In practice, use `poly(x, 3)` to generate orthogonal polynomials.
578+
579+
```{r echo=FALSE}
580+
library(corrplot)
581+
```
582+
583+
```{r}
584+
x <- rep(1:12); par(mfrow = c(1, 2))
585+
corrplot(cor(data.frame(x, x^2, x^3))); corrplot(cor(poly(x, 3)))
586+
```
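
The same comparison as numbers rather than plots, as a small sketch. The helper `max_offdiag()` is a made-up name for illustration; it returns the largest absolute correlation between any two columns.

```r
x <- 1:12
raw  <- cbind(x, x^2, x^3)  # raw polynomial terms
orth <- poly(x, 3)          # orthogonal polynomial terms

# Largest absolute off-diagonal correlation among the columns of m
max_offdiag <- function(m) {
  r <- cor(m)
  max(abs(r[upper.tri(r)]))
}

raw_max  <- max_offdiag(raw)   # raw terms are highly correlated
orth_max <- max_offdiag(orth)  # orthogonal terms are uncorrelated up to rounding
c(raw = raw_max, orthogonal = orth_max)
```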
587+
516588
## Season as a Fourier series
517589

518590
* Fourier series are paired sets of sine and cosine waves
@@ -531,7 +603,7 @@ where $p$ is 12 (for monthly).
531603
##
532604

533605
```{r echo=FALSE}
534-
TT <- nrow(chinook.month)/2
606+
TT <- 50 # whatever the length of your data
535607
covariate <- rbind(sin(2*pi*(1:TT)/12), cos(2*pi*(1:TT)/12))
536608
plot(covariate[1,1:50], type="l", ylab="covariate", xlab="t", ylim = c(-2, 2))
537609
lines(covariate[2,1:50], col="red")
@@ -636,13 +708,19 @@ C
636708

637709
**[Seasonality of Lake WA plankton](https://atsa-es.github.io/atsa-labs/sec-msscov-season.html)**
638710

639-
![](images/msscov-mon-effects-1.png){width=75%}
711+
<center>
712+
![](images/msscov-mon-effects-1.png){width=60%}
713+
</center>
714+
640715

641716
## Cyclic salmon
642717

643718
**[Chapter 16](https://atsa-es.github.io/atsa-labs/chap-cyclic-sockeye.html)**
644719

645-
![](https://atsa-es.github.io/atsa-labs/Applied_Time_Series_Analysis_files/figure-html/unnamed-chunk-59-1.png)
720+
<center>
721+
![](https://atsa-es.github.io/atsa-labs/Applied_Time_Series_Analysis_files/figure-html/unnamed-chunk-59-1.png){width=60%}
722+
</center>
723+
646724

647725

648726
## Missing covariates
@@ -651,16 +729,30 @@ C
651729

652730
https://atsa-es.github.io/atsa-labs/example-snotel-data.html
653731

654-
![](images/snotelsites.png){width=70%}
732+
<center>
733+
![](images/snotelsites.png){width=60%}
734+
</center>
735+
655736

656737
## Snow Water Equivalent (snowpack)
657738

658739
February snowpack estimates
659740

660-
![](images/snoteldata.png){width=75%}
741+
<center>
742+
![](images/snoteldata.png){width=60%}
743+
</center>
744+
661745

662746
## Use MARSS models Chapter 11
663747

664748
* Accounts for correlation across sites and local variability
665749

666-
![](images/snotelfits.png){width=75%}
750+
<center>
751+
![](images/snotelfits.png){width=60%}
752+
</center>
753+
754+
755+
756+
757+
758+
