
Commit f7d7153

Fix refs (#59)
* authorship
* fixing \ref and \eqref
* fixing sections in ipynb
* fixing pagerefs
1 parent d51c56b commit f7d7153

24 files changed: +2280 −1967 lines changed

Ch02-statlearn-lab.ipynb

Lines changed: 1295 additions & 1291 deletions
Large diffs are not rendered by default.

Ch03-linreg-lab.Rmd

Lines changed: 2 additions & 2 deletions
@@ -343,7 +343,7 @@ As mentioned above, there is an existing function to add a line to a plot --- `a
 
 
 Next we examine some diagnostic plots, several of which were discussed
-in Section~\ref{Ch3:problems.sec}.
+in Section 3.3.3.
 We can find the fitted values and residuals
 of the fit as attributes of the `results` object.
 Various influence measures describing the regression model
@@ -440,7 +440,7 @@ We can access the individual components of `results` by name
 and
 `np.sqrt(results.scale)` gives us the RSE.
 
-Variance inflation factors (section~\ref{Ch3:problems.sec}) are sometimes useful
+Variance inflation factors (section 3.3.3) are sometimes useful
 to assess the effect of collinearity in the model matrix of a regression model.
 We will compute the VIFs in our multiple regression fit, and use the opportunity to introduce the idea of *list comprehension*.
 
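For context on the passage above: a minimal sketch of computing VIFs via a list comprehension, using `statsmodels`' `variance_inflation_factor`. The helper name `vif_table` and the shape of `X` (a model matrix with an intercept in column 0) are assumptions, not the lab's exact code.

```python
# Sketch: VIFs via list comprehension; X is assumed to be a pandas
# model matrix whose first column is the intercept.
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor as VIF

def vif_table(X):
    # one VIF per non-intercept column of the model matrix
    vals = [VIF(np.asarray(X), i) for i in range(1, X.shape[1])]
    return pd.DataFrame({'vif': vals}, index=X.columns[1:])
```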

Ch03-linreg-lab.ipynb

Lines changed: 2 additions & 2 deletions
@@ -1533,7 +1533,7 @@
 "metadata": {},
 "source": [
 "Next we examine some diagnostic plots, several of which were discussed\n",
-"in Section~\\ref{Ch3:problems.sec}.\n",
+"in Section 3.3.3.\n",
 "We can find the fitted values and residuals\n",
 "of the fit as attributes of the `results` object.\n",
 "Various influence measures describing the regression model\n",
@@ -2142,7 +2142,7 @@
 "and\n",
 "`np.sqrt(results.scale)` gives us the RSE.\n",
 "\n",
-"Variance inflation factors (section~\\ref{Ch3:problems.sec}) are sometimes useful\n",
+"Variance inflation factors (section 3.3.3) are sometimes useful\n",
 "to assess the effect of collinearity in the model matrix of a regression model.\n",
 "We will compute the VIFs in our multiple regression fit, and use the opportunity to introduce the idea of *list comprehension*.\n",
 "\n",

Ch04-classification-lab.Rmd

Lines changed: 12 additions & 12 deletions
@@ -405,7 +405,7 @@ lda.fit(X_train, L_train)
 
 ```
 Here we have used the list comprehensions introduced
-in Section~\ref{Ch3-linreg-lab:multivariate-goodness-of-fit}. Looking at our first line above, we see that the right-hand side is a list
+in Section 3.6.4. Looking at our first line above, we see that the right-hand side is a list
 of length two. This is because the code `for M in [X_train, X_test]` iterates over a list
 of length two. While here we loop over a list,
 the list comprehension method works when looping over any iterable object.
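The pattern the hunk refers to can be shown with stand-in data (a sketch, not the lab's verbatim line):

```python
import numpy as np

X_train = [[1.0, 2.0], [3.0, 4.0]]   # stand-ins for the lab's dataframes
X_test = [[5.0, 6.0]]
# The right-hand side is a list of length two because [X_train, X_test]
# is a list of length two; unpacking assigns one element to each name.
X_train, X_test = [np.asarray(M) for M in [X_train, X_test]]
```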
@@ -454,7 +454,7 @@ lda.scalings_
 
 ```
 
-These values provide the linear combination of `Lag1` and `Lag2` that are used to form the LDA decision rule. In other words, these are the multipliers of the elements of $X=x$ in (\ref{Ch4:bayes.multi}).
+These values provide the linear combination of `Lag1` and `Lag2` that are used to form the LDA decision rule. In other words, these are the multipliers of the elements of $X=x$ in (4.24).
 If $-0.64\times `Lag1` - 0.51 \times `Lag2` $ is large, then the LDA classifier will predict a market increase, and if it is small, then the LDA classifier will predict a market decline.
 
 ```{python}
@@ -463,7 +463,7 @@ lda_pred = lda.predict(X_test)
 ```
 
 As we observed in our comparison of classification methods
-(Section~\ref{Ch4:comparison.sec}), the LDA and logistic
+(Section 4.5), the LDA and logistic
 regression predictions are almost identical.
 
 ```{python}
@@ -522,7 +522,7 @@ The LDA classifier above is the first classifier from the
 `sklearn` library. We will use several other objects
 from this library. The objects
 follow a common structure that simplifies tasks such as cross-validation,
-which we will see in Chapter~\ref{Ch5:resample}. Specifically,
+which we will see in Chapter 5. Specifically,
 the methods first create a generic classifier without
 referring to any data. This classifier is then fit
 to data with the `fit()` method and predictions are
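A minimal sketch of the generic `sklearn` pattern described above, with made-up data in place of the lab's variables:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(100, 2)), rng.integers(0, 2, 100)
X_test = rng.normal(size=(10, 2))

lda = LDA()                   # 1. create a generic classifier, no data yet
lda.fit(X_train, y_train)     # 2. fit it to training data with fit()
labels = lda.predict(X_test)  # 3. produce predictions with predict()
```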
@@ -808,7 +808,7 @@ feature_std.std()
 
 ```
 
-Notice that the standard deviations are not quite $1$ here; this is again due to some procedures using the $1/n$ convention for variances (in this case `scaler()`), while others use $1/(n-1)$ (the `std()` method). See the footnote on page~\pageref{Ch4-varformula}.
+Notice that the standard deviations are not quite $1$ here; this is again due to some procedures using the $1/n$ convention for variances (in this case `scaler()`), while others use $1/(n-1)$ (the `std()` method). See the footnote on page 183.
 In this case it does not matter, as long as the variables are all on the same scale.
 
 Using the function `train_test_split()` we now split the observations into a test set,
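The $1/n$ versus $1/(n-1)$ point is easy to see directly:

```python
import numpy as np
import pandas as pd

x = pd.Series([2.0, 4.0, 6.0, 8.0])
print(np.std(x))  # numpy default ddof=0: the 1/n convention
print(x.std())    # pandas default ddof=1: the 1/(n-1) convention
```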
@@ -875,7 +875,7 @@ This is double the rate that one would obtain from random guessing.
 The number of neighbors in KNN is referred to as a *tuning parameter*, also referred to as a *hyperparameter*.
 We do not know *a priori* what value to use. It is therefore of interest
 to see how the classifier performs on test data as we vary these
-parameters. This can be achieved with a `for` loop, described in Section~\ref{Ch2-statlearn-lab:for-loops}.
+parameters. This can be achieved with a `for` loop, described in Section 2.3.8.
 Here we use a for loop to look at the accuracy of our classifier in the group predicted to purchase
 insurance as we vary the number of neighbors from 1 to 5:
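Such a loop might look roughly like this (a hedged sketch; the variable names and the "Yes" label are assumed from the lab's context, not its verbatim code):

```python
from sklearn.neighbors import KNeighborsClassifier

# X_train, y_train, X_test, y_test assumed already defined as in the lab
for K in range(1, 6):
    knn = KNeighborsClassifier(n_neighbors=K).fit(X_train, y_train)
    pred = knn.predict(X_test)
    mask = pred == "Yes"                       # predicted purchasers
    accuracy = (y_test[mask] == "Yes").mean()  # hit rate in that group
    print(K, accuracy)
```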

@@ -902,7 +902,7 @@ As a comparison, we can also fit a logistic regression model to the
 data. This can also be done
 with `sklearn`, though by default it fits
 something like the *ridge regression* version
-of logistic regression, which we introduce in Chapter~\ref{Ch6:varselect}. This can
+of logistic regression, which we introduce in Chapter 6. This can
 be modified by appropriately setting the argument `C` below. Its default
 value is 1 but by setting it to a very large number, the algorithm converges to the same solution as the usual (unregularized)
 logistic regression estimator discussed above.
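A sketch of the `C` trick described above (training data assumed from the lab; the solver choice is an assumption):

```python
from sklearn.linear_model import LogisticRegression

# very large C makes the ridge penalty negligible, so the fit
# converges to essentially unregularized logistic regression
logit = LogisticRegression(C=1e10, solver='liblinear')
logit.fit(X_train, y_train)
logit_labels = logit.predict(X_test)
```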
@@ -946,7 +946,7 @@ confusion_table(logit_labels, y_test)
 
 ```
 ## Linear and Poisson Regression on the Bikeshare Data
-Here we fit linear and Poisson regression models to the `Bikeshare` data, as described in Section~\ref{Ch4:sec:pois}.
+Here we fit linear and Poisson regression models to the `Bikeshare` data, as described in Section 4.6.
 The response `bikers` measures the number of bike rentals per hour
 in Washington, DC in the period 2010--2012.
 
@@ -987,7 +987,7 @@ variables constant, there are on average about 7 more riders in
 February than in January. Similarly there are about 16.5 more riders
 in March than in January.
 
-The results seen in Section~\ref{sec:bikeshare.linear}
+The results seen in Section 4.6.1
 used a slightly different coding of the variables `hr` and `mnth`, as follows:
 
 ```{python}
@@ -1041,7 +1041,7 @@ np.allclose(M_lm.fittedvalues, M2_lm.fittedvalues)
 ```
 
 
-To reproduce the left-hand side of Figure~\ref{Ch4:bikeshare}
+To reproduce the left-hand side of Figure 4.13
 we must first obtain the coefficient estimates associated with
 `mnth`. The coefficients for January through November can be obtained
 directly from the `M2_lm` object. The coefficient for December
@@ -1081,7 +1081,7 @@ ax_month.set_ylabel('Coefficient', fontsize=20);
 
 ```
 
-Reproducing the right-hand plot in Figure~\ref{Ch4:bikeshare} follows a similar process.
+Reproducing the right-hand plot in Figure 4.13 follows a similar process.
 
 ```{python}
 coef_hr = S2[S2.index.str.contains('hr')]['coef']
@@ -1116,7 +1116,7 @@ M_pois = sm.GLM(Y, X2, family=sm.families.Poisson()).fit()
 
 ```
 
-We can plot the coefficients associated with `mnth` and `hr`, in order to reproduce Figure~\ref{Ch4:bikeshare.pois}. We first complete these coefficients as before.
+We can plot the coefficients associated with `mnth` and `hr`, in order to reproduce Figure 4.15. We first complete these coefficients as before.
 
 ```{python}
 S_pois = summarize(M_pois)

Ch04-classification-lab.ipynb

Lines changed: 12 additions & 12 deletions
@@ -2007,7 +2007,7 @@
 "metadata": {},
 "source": [
 "Here we have used the list comprehensions introduced\n",
-"in Section~\\ref{Ch3-linreg-lab:multivariate-goodness-of-fit}. Looking at our first line above, we see that the right-hand side is a list\n",
+"in Section 3.6.4. Looking at our first line above, we see that the right-hand side is a list\n",
 "of length two. This is because the code `for M in [X_train, X_test]` iterates over a list\n",
 "of length two. While here we loop over a list,\n",
 "the list comprehension method works when looping over any iterable object.\n",
@@ -2173,7 +2173,7 @@
 "id": "f0a4abaf",
 "metadata": {},
 "source": [
-"These values provide the linear combination of `Lag1` and `Lag2` that are used to form the LDA decision rule. In other words, these are the multipliers of the elements of $X=x$ in (\\ref{Ch4:bayes.multi}).\n",
+"These values provide the linear combination of `Lag1` and `Lag2` that are used to form the LDA decision rule. In other words, these are the multipliers of the elements of $X=x$ in (4.24).\n",
 " If $-0.64\\times `Lag1` - 0.51 \\times `Lag2` $ is large, then the LDA classifier will predict a market increase, and if it is small, then the LDA classifier will predict a market decline."
 ]
 },
@@ -2200,7 +2200,7 @@
 "metadata": {},
 "source": [
 "As we observed in our comparison of classification methods\n",
-" (Section~\\ref{Ch4:comparison.sec}), the LDA and logistic\n",
+" (Section 4.5), the LDA and logistic\n",
 "regression predictions are almost identical."
 ]
 },
@@ -2421,7 +2421,7 @@
 "`sklearn` library. We will use several other objects\n",
 "from this library. The objects\n",
 "follow a common structure that simplifies tasks such as cross-validation,\n",
-"which we will see in Chapter~\\ref{Ch5:resample}. Specifically,\n",
+"which we will see in Chapter 5. Specifically,\n",
 "the methods first create a generic classifier without\n",
 "referring to any data. This classifier is then fit\n",
 "to data with the `fit()` method and predictions are\n",
@@ -4349,7 +4349,7 @@
 "id": "c225f2b2",
 "metadata": {},
 "source": [
-"Notice that the standard deviations are not quite $1$ here; this is again due to some procedures using the $1/n$ convention for variances (in this case `scaler()`), while others use $1/(n-1)$ (the `std()` method). See the footnote on page~\\pageref{Ch4-varformula}.\n",
+"Notice that the standard deviations are not quite $1$ here; this is again due to some procedures using the $1/n$ convention for variances (in this case `scaler()`), while others use $1/(n-1)$ (the `std()` method). See the footnote on page 183.\n",
 "In this case it does not matter, as long as the variables are all on the same scale.\n",
 "\n",
 "Using the function `train_test_split()` we now split the observations into a test set,\n",
@@ -4570,7 +4570,7 @@
 "The number of neighbors in KNN is referred to as a *tuning parameter*, also referred to as a *hyperparameter*.\n",
 "We do not know *a priori* what value to use. It is therefore of interest\n",
 "to see how the classifier performs on test data as we vary these\n",
-"parameters. This can be achieved with a `for` loop, described in Section~\\ref{Ch2-statlearn-lab:for-loops}.\n",
+"parameters. This can be achieved with a `for` loop, described in Section 2.3.8.\n",
 "Here we use a for loop to look at the accuracy of our classifier in the group predicted to purchase\n",
 "insurance as we vary the number of neighbors from 1 to 5:"
 ]
@@ -4629,7 +4629,7 @@
 "data. This can also be done\n",
 "with `sklearn`, though by default it fits\n",
 "something like the *ridge regression* version\n",
-"of logistic regression, which we introduce in Chapter~\\ref{Ch6:varselect}. This can\n",
+"of logistic regression, which we introduce in Chapter 6. This can\n",
 "be modified by appropriately setting the argument `C` below. Its default\n",
 "value is 1 but by setting it to a very large number, the algorithm converges to the same solution as the usual (unregularized)\n",
 "logistic regression estimator discussed above.\n",
@@ -4849,7 +4849,7 @@
 "metadata": {},
 "source": [
 "## Linear and Poisson Regression on the Bikeshare Data\n",
-"Here we fit linear and Poisson regression models to the `Bikeshare` data, as described in Section~\\ref{Ch4:sec:pois}.\n",
+"Here we fit linear and Poisson regression models to the `Bikeshare` data, as described in Section 4.6.\n",
 "The response `bikers` measures the number of bike rentals per hour\n",
 "in Washington, DC in the period 2010--2012."
 ]
@@ -5322,7 +5322,7 @@
 "February than in January. Similarly there are about 16.5 more riders\n",
 "in March than in January.\n",
 "\n",
-"The results seen in Section~\\ref{sec:bikeshare.linear}\n",
+"The results seen in Section 4.6.1\n",
 "used a slightly different coding of the variables `hr` and `mnth`, as follows:"
 ]
 },
@@ -5834,7 +5834,7 @@
 "id": "41fb2787",
 "metadata": {},
 "source": [
-"To reproduce the left-hand side of Figure~\\ref{Ch4:bikeshare}\n",
+"To reproduce the left-hand side of Figure 4.13\n",
 "we must first obtain the coefficient estimates associated with\n",
 "`mnth`. The coefficients for January through November can be obtained\n",
 "directly from the `M2_lm` object. The coefficient for December\n",
@@ -5988,7 +5988,7 @@
 "id": "6c68761a",
 "metadata": {},
 "source": [
-"Reproducing the right-hand plot in Figure~\\ref{Ch4:bikeshare} follows a similar process."
+"Reproducing the right-hand plot in Figure 4.13 follows a similar process."
 ]
 },
 {
@@ -6088,7 +6088,7 @@
 "id": "8552fb8b",
 "metadata": {},
 "source": [
-"We can plot the coefficients associated with `mnth` and `hr`, in order to reproduce Figure~\\ref{Ch4:bikeshare.pois}. We first complete these coefficients as before."
+"We can plot the coefficients associated with `mnth` and `hr`, in order to reproduce Figure 4.15. We first complete these coefficients as before."
 ]
 },
 {

Ch05-resample-lab.Rmd

Lines changed: 12 additions & 12 deletions
@@ -237,7 +237,7 @@ for i, d in enumerate(range(1,6)):
 cv_error
 
 ```
-As in Figure~\ref{Ch5:cvplot}, we see a sharp drop in the estimated test MSE between the linear and
+As in Figure 5.4, we see a sharp drop in the estimated test MSE between the linear and
 quadratic fits, but then no clear improvement from using higher-degree polynomials.
 
 Above we introduced the `outer()` method of the `np.power()`
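The `outer()` method mentioned here is a `numpy` ufunc method; a self-contained sketch:

```python
import numpy as np

# every element of the first array raised to every power in the second
H = np.power.outer(np.array([2.0, 3.0]), np.arange(3))
# H == [[1., 2., 4.],
#       [1., 3., 9.]]
```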
@@ -278,7 +278,7 @@ cv_error
 Notice that the computation time is much shorter than that of LOOCV.
 (In principle, the computation time for LOOCV for a least squares
 linear model should be faster than for $k$-fold CV, due to the
-availability of the formula~(\ref{Ch5:eq:LOOCVform}) for LOOCV;
+availability of the formula~(5.2) for LOOCV;
 however, the generic `cross_validate()` function does not make
 use of this formula.) We still see little evidence that using cubic
 or higher-degree polynomial terms leads to a lower test error than simply
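For reference, the LOOCV shortcut for least squares that formula (5.2) refers to is, in the book's notation,

$$\mathrm{CV}_{(n)} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_i - \hat{y}_i}{1-h_i}\right)^2,$$

where $\hat{y}_i$ is the $i$th fitted value from the full least squares fit and $h_i$ is the leverage of observation $i$.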
@@ -325,7 +325,7 @@ incurred by picking different random folds.
 
 ## The Bootstrap
 We illustrate the use of the bootstrap in the simple example
-{of Section~\ref{Ch5:sec:bootstrap},} as well as on an example involving
+{of Section 5.2,} as well as on an example involving
 estimating the accuracy of the linear regression model on the `Auto`
 data set.
 ### Estimating the Accuracy of a Statistic of Interest
@@ -340,8 +340,8 @@ in a dataframe.
 To illustrate the bootstrap, we
 start with a simple example.
 The `Portfolio` data set in the `ISLP` package is described
-in Section~\ref{Ch5:sec:bootstrap}. The goal is to estimate the
-sampling variance of the parameter $\alpha$ given in formula~(\ref{Ch5:min.var}). We will
+in Section 5.2. The goal is to estimate the
+sampling variance of the parameter $\alpha$ given in formula~(5.7). We will
 create a function
 `alpha_func()`, which takes as input a dataframe `D` assumed
 to have columns `X` and `Y`, as well as a
@@ -360,7 +360,7 @@ def alpha_func(D, idx):
 ```
 This function returns an estimate for $\alpha$
 based on applying the minimum
-variance formula (\ref{Ch5:min.var}) to the observations indexed by
+variance formula (5.7) to the observations indexed by
 the argument `idx`. For instance, the following command
 estimates $\alpha$ using all 100 observations.
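The minimum-variance weight in question has the form $\alpha = (\sigma_Y^2 - \sigma_{XY})/(\sigma_X^2 + \sigma_Y^2 - 2\sigma_{XY})$; a sketch of an `alpha_func()` of the shape the text describes (the implementation details are assumptions, not the lab's verbatim code):

```python
import numpy as np

def alpha_func(D, idx):
    # estimate alpha from the rows of D indexed by idx;
    # D is assumed to have columns 'X' and 'Y'
    cov_ = np.cov(D[['X', 'Y']].loc[idx], rowvar=False)
    return (cov_[1, 1] - cov_[0, 1]) / (cov_[0, 0] + cov_[1, 1] - 2 * cov_[0, 1])
```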

@@ -430,7 +430,7 @@ intercept and slope terms for the linear regression model that uses
 `horsepower` to predict `mpg` in the `Auto` data set. We
 will compare the estimates obtained using the bootstrap to those
 obtained using the formulas for ${\rm SE}(\hat{\beta}_0)$ and
-${\rm SE}(\hat{\beta}_1)$ described in Section~\ref{Ch3:secoefsec}.
+${\rm SE}(\hat{\beta}_1)$ described in Section 3.1.2.
 
 To use our `boot_SE()` function, we must write a function (its
 first argument)
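A generic bootstrap-SE helper of the kind `boot_SE()` names might be sketched as follows (the lab's actual signature and implementation are assumptions here):

```python
import numpy as np

def boot_SE(func, D, n=None, B=1000, seed=0):
    # resample rows of D with replacement B times, apply func,
    # and return the standard deviation of the resulting statistic
    rng = np.random.default_rng(seed)
    n = n or D.shape[0]
    first_, second_ = 0, 0
    for _ in range(B):
        idx = rng.choice(D.index, n, replace=True)
        value = func(D, idx)
        first_ += value
        second_ += value**2
    return np.sqrt(second_ / B - (first_ / B)**2)
```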
@@ -499,7 +499,7 @@ This indicates that the bootstrap estimate for ${\rm SE}(\hat{\beta}_0)$ is
 0.85, and that the bootstrap
 estimate for ${\rm SE}(\hat{\beta}_1)$ is
 0.0074. As discussed in
-Section~\ref{Ch3:secoefsec}, standard formulas can be used to compute
+Section 3.1.2, standard formulas can be used to compute
 the standard errors for the regression coefficients in a linear
 model. These can be obtained using the `summarize()` function
 from `ISLP.sm`.
@@ -513,21 +513,21 @@ model_se
 
 
 The standard error estimates for $\hat{\beta}_0$ and $\hat{\beta}_1$
-obtained using the formulas from Section~\ref{Ch3:secoefsec} are
+obtained using the formulas from Section 3.1.2 are
 0.717 for the
 intercept and
 0.006 for the
 slope. Interestingly, these are somewhat different from the estimates
 obtained using the bootstrap. Does this indicate a problem with the
 bootstrap? In fact, it suggests the opposite. Recall that the
 standard formulas given in
-{Equation~\ref{Ch3:se.eqn} on page~\pageref{Ch3:se.eqn}}
+{Equation 3.8 on page 75}
 rely on certain assumptions. For example,
 they depend on the unknown parameter $\sigma^2$, the noise
 variance. We then estimate $\sigma^2$ using the RSS. Now although the
 formulas for the standard errors do not rely on the linear model being
 correct, the estimate for $\sigma^2$ does. We see
-{in Figure~\ref{Ch3:polyplot} on page~\pageref{Ch3:polyplot}} that there is
+{in Figure 3.8 on page 99} that there is
 a non-linear relationship in the data, and so the residuals from a
 linear fit will be inflated, and so will $\hat{\sigma}^2$. Secondly,
 the standard formulas assume (somewhat unrealistically) that the $x_i$
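For reference, the standard-error formulas that Equation 3.8 refers to are, in the book's notation,

$${\rm SE}(\hat{\beta}_0)^2 = \sigma^2\left[\frac{1}{n} + \frac{\bar{x}^2}{\sum_{i=1}^n (x_i - \bar{x})^2}\right], \qquad {\rm SE}(\hat{\beta}_1)^2 = \frac{\sigma^2}{\sum_{i=1}^n (x_i - \bar{x})^2}.$$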
@@ -540,7 +540,7 @@ the results from `sm.OLS`.
 Below we compute the bootstrap standard error estimates and the
 standard linear regression estimates that result from fitting the
 quadratic model to the data. Since this model provides a good fit to
-the data (Figure~\ref{Ch3:polyplot}), there is now a better
+the data (Figure 3.8), there is now a better
 correspondence between the bootstrap estimates and the standard
 estimates of ${\rm SE}(\hat{\beta}_0)$, ${\rm SE}(\hat{\beta}_1)$ and
 ${\rm SE}(\hat{\beta}_2)$.
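A hedged sketch of that quadratic-model bootstrap, reusing the hypothetical `boot_SE()` above (`Auto` is assumed loaded as a dataframe with `mpg` and `horsepower` columns; this is not the lab's verbatim code):

```python
import numpy as np
import statsmodels.api as sm

def quad_func(D, idx):
    # refit mpg ~ horsepower + horsepower^2 on the bootstrap sample
    d = D.loc[idx]
    X = sm.add_constant(np.column_stack([d['horsepower'], d['horsepower']**2]))
    return sm.OLS(np.asarray(d['mpg']), X).fit().params

quad_se = boot_SE(quad_func, Auto, B=1000)
```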
