forked from UofTChem-Teaching/excel4chem
-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathexcel4chem.tex
More file actions
541 lines (434 loc) · 34.1 KB
/
excel4chem.tex
File metadata and controls
541 lines (434 loc) · 34.1 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
% Options for packages loaded elsewhere
\PassOptionsToPackage{unicode}{hyperref}
\PassOptionsToPackage{hyphens}{url}
%
\documentclass[
]{book}
\usepackage{amsmath,amssymb}
\usepackage{iftex}
\ifPDFTeX
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{textcomp} % provide euro and other symbols
\else % if luatex or xetex
\usepackage{unicode-math} % this also loads fontspec
\defaultfontfeatures{Scale=MatchLowercase}
\defaultfontfeatures[\rmfamily]{Ligatures=TeX,Scale=1}
\fi
\usepackage{lmodern}
\ifPDFTeX\else
% xetex/luatex font selection
\fi
% Use upquote if available, for straight quotes in verbatim environments
\IfFileExists{upquote.sty}{\usepackage{upquote}}{}
\IfFileExists{microtype.sty}{% use microtype if available
\usepackage[]{microtype}
\UseMicrotypeSet[protrusion]{basicmath} % disable protrusion for tt fonts
}{}
\makeatletter
\@ifundefined{KOMAClassName}{% if non-KOMA class
\IfFileExists{parskip.sty}{%
\usepackage{parskip}
}{% else
\setlength{\parindent}{0pt}
\setlength{\parskip}{6pt plus 2pt minus 1pt}}
}{% if KOMA class
\KOMAoptions{parskip=half}}
\makeatother
\usepackage{xcolor}
\usepackage{color}
\usepackage{fancyvrb}
\newcommand{\VerbBar}{|}
\newcommand{\VERB}{\Verb[commandchars=\\\{\}]}
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{commandchars=\\\{\}}
% Add ',fontsize=\small' for more characters per line
\usepackage{framed}
\definecolor{shadecolor}{RGB}{248,248,248}
\newenvironment{Shaded}{\begin{snugshade}}{\end{snugshade}}
\newcommand{\AlertTok}[1]{\textcolor[rgb]{0.94,0.16,0.16}{#1}}
\newcommand{\AnnotationTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
\newcommand{\AttributeTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{#1}}
\newcommand{\BaseNTok}[1]{\textcolor[rgb]{0.00,0.00,0.81}{#1}}
\newcommand{\BuiltInTok}[1]{#1}
\newcommand{\CharTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
\newcommand{\CommentTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textit{#1}}}
\newcommand{\CommentVarTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
\newcommand{\ConstantTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{#1}}
\newcommand{\ControlFlowTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{\textbf{#1}}}
\newcommand{\DataTypeTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{#1}}
\newcommand{\DecValTok}[1]{\textcolor[rgb]{0.00,0.00,0.81}{#1}}
\newcommand{\DocumentationTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
\newcommand{\ErrorTok}[1]{\textcolor[rgb]{0.64,0.00,0.00}{\textbf{#1}}}
\newcommand{\ExtensionTok}[1]{#1}
\newcommand{\FloatTok}[1]{\textcolor[rgb]{0.00,0.00,0.81}{#1}}
\newcommand{\FunctionTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{\textbf{#1}}}
\newcommand{\ImportTok}[1]{#1}
\newcommand{\InformationTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
\newcommand{\KeywordTok}[1]{\textcolor[rgb]{0.13,0.29,0.53}{\textbf{#1}}}
\newcommand{\NormalTok}[1]{#1}
\newcommand{\OperatorTok}[1]{\textcolor[rgb]{0.81,0.36,0.00}{\textbf{#1}}}
\newcommand{\OtherTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{#1}}
\newcommand{\PreprocessorTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textit{#1}}}
\newcommand{\RegionMarkerTok}[1]{#1}
\newcommand{\SpecialCharTok}[1]{\textcolor[rgb]{0.81,0.36,0.00}{\textbf{#1}}}
\newcommand{\SpecialStringTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
\newcommand{\StringTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
\newcommand{\VariableTok}[1]{\textcolor[rgb]{0.00,0.00,0.00}{#1}}
\newcommand{\VerbatimStringTok}[1]{\textcolor[rgb]{0.31,0.60,0.02}{#1}}
\newcommand{\WarningTok}[1]{\textcolor[rgb]{0.56,0.35,0.01}{\textbf{\textit{#1}}}}
\usepackage{longtable,booktabs,array}
\usepackage{calc} % for calculating minipage widths
% Correct order of tables after \paragraph or \subparagraph
\usepackage{etoolbox}
\makeatletter
\patchcmd\longtable{\par}{\if@noskipsec\mbox{}\fi\par}{}{}
\makeatother
% Allow footnotes in longtable head/foot
\IfFileExists{footnotehyper.sty}{\usepackage{footnotehyper}}{\usepackage{footnote}}
\makesavenoteenv{longtable}
\usepackage{graphicx}
\makeatletter
\def\maxwidth{\ifdim\Gin@nat@width>\linewidth\linewidth\else\Gin@nat@width\fi}
\def\maxheight{\ifdim\Gin@nat@height>\textheight\textheight\else\Gin@nat@height\fi}
\makeatother
% Scale images if necessary, so that they will not overflow the page
% margins by default, and it is still possible to overwrite the defaults
% using explicit options in \includegraphics[width, height, ...]{}
\setkeys{Gin}{width=\maxwidth,height=\maxheight,keepaspectratio}
% Set default figure placement to htbp
\makeatletter
\def\fps@figure{htbp}
\makeatother
\setlength{\emergencystretch}{3em} % prevent overfull lines
\providecommand{\tightlist}{%
\setlength{\itemsep}{0pt}\setlength{\parskip}{0pt}}
\setcounter{secnumdepth}{5}
\usepackage{booktabs}
\usepackage{amsthm}
\makeatletter
\def\thm@space@setup{%
\thm@preskip=8pt plus 2pt minus 4pt
\thm@postskip=\thm@preskip
}
\makeatother
\ifLuaTeX
\usepackage{selnolig} % disable illegal ligatures
\fi
\usepackage[]{natbib}
\bibliographystyle{plainnat}
\IfFileExists{bookmark.sty}{\usepackage{bookmark}}{\usepackage{hyperref}}
\IfFileExists{xurl.sty}{\usepackage{xurl}}{} % add URL line breaks if available
\urlstyle{same}
\hypersetup{
pdftitle={Excel for General Chemistry},
pdfauthor={David Hall, David Liu, and Jessica D'eon},
hidelinks,
pdfcreator={LaTeX via pandoc}}
\title{Excel for General Chemistry}
\author{David Hall, David Liu, and Jessica D'eon}
\date{Book last built on 2023-06-23}
\begin{document}
\maketitle
{
\setcounter{tocdepth}{1}
\tableofcontents
}
\hypertarget{welcome}{%
\chapter*{Welcome!}\label{welcome}}
\addcontentsline{toc}{chapter}{Welcome!}
Placeholder
\hypertarget{providing-feedback}{%
\section*{Providing Feedback}\label{providing-feedback}}
\addcontentsline{toc}{section}{Providing Feedback}
\hypertarget{intro}{%
\chapter{Introduction}\label{intro}}
Placeholder
\hypertarget{ozone-is-not-our-friend}{%
\section{Ozone is Not Our Friend}\label{ozone-is-not-our-friend}}
\hypertarget{atmospheric-concentration-units}{%
\section{Atmospheric Concentration Units}\label{atmospheric-concentration-units}}
\hypertarget{capabilities}{%
\chapter{Getting Setup for Success}\label{capabilities}}
Placeholder
\hypertarget{accessing-the-microsoft-office-suite-of-software}{%
\section{Accessing the Microsoft Office Suite of Software}\label{accessing-the-microsoft-office-suite-of-software}}
\hypertarget{managing-your-files}{%
\section{Managing your Files}\label{managing-your-files}}
\hypertarget{creating-professional-documents}{%
\section{Creating Professional Documents}\label{creating-professional-documents}}
\hypertarget{adding-an-excel-graph-to-a-word-document}{%
\subsection{Adding an Excel Graph to a Word Document}\label{adding-an-excel-graph-to-a-word-document}}
\hypertarget{adding-images-to-a-word-document}{%
\subsection{Adding Images to a Word Document}\label{adding-images-to-a-word-document}}
\hypertarget{creating-a-pdf-from-a-word-document}{%
\subsection{Creating a PDF from a Word Document}\label{creating-a-pdf-from-a-word-document}}
\hypertarget{data-wrangling}{%
\chapter{Data Wrangling}\label{data-wrangling}}
Placeholder
\hypertarget{opening-a-csv-file}{%
\section{Opening a CSV file}\label{opening-a-csv-file}}
\hypertarget{practice-csv}{%
\subsection{Practice CSV}\label{practice-csv}}
\hypertarget{data-discovery-and-cell-formatting}{%
\section{Data Discovery and Cell Formatting}\label{data-discovery-and-cell-formatting}}
\hypertarget{data-cleaning}{%
\section{Data Cleaning}\label{data-cleaning}}
\hypertarget{math-stats-and-programming}{%
\chapter{Math, Stats, and Programming}\label{math-stats-and-programming}}
Physical chemistry is a quantitative science, meaning we use mathematical operations to explain and explore chemical phenomena. This math can be done by hand, but once you have more than one or two data points this gets very tiring and a spreadsheet can be a lifesaver! In this section we will talk about how to program mathematical equations into Excel, how to use common mathematical and statistical functions, and how to reference other values within the spreadsheet. We will also dip our toes into some simple computer programming that can be very useful.
The first two subsection \protect\hyperlink{mathematical-operations}{Mathematical Operations} and \protect\hyperlink{cell-referencing}{Cell Referencing} use the air quality datasets from the NAPS network as examples. If you received a dataset through your CHM135 lab section feel free to use it to follow along. If not, you can visit the \protect\hyperlink{intro}{Introduction} page to download the example dataset used throughout this resource (note that you cannot use the example dataset for your analysis in CHM135).
\hypertarget{mathematical-operations}{%
\section{Mathematical Operations}\label{mathematical-operations}}
As described in the \protect\hyperlink{intro}{Introduction}, atmospheric chemists have defined the term ``odd oxygen'' or O\textsubscript{X} as the sum of the concentrations of NO\textsubscript{2} and O\textsubscript{3}. \textbf{To add O\textsubscript{X} to your analysis you will need to calculate the concentration O\textsubscript{X} at each timepoint}. To do this, simply add the cells containing O\textsubscript{3} and NO\textsubscript{2} for the first timepoint (both concentrations are in ppb so they can be added directly) then copy this formula down the column. When you add a formula into Excel remember to start with an equals sign (``=''), this will let Excel know you are writing a formula. If you are referencing a cell you can either write it in (e.g.~``C2'') or you can click on the cell. Once your formula is complete, hit ENTER and Excel will perform the calculation.
Excel can perform many mathematical operations and the table below provides a list of some of the most common with their corresponding formula.
\begin{longtable}[]{@{}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.1908}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.4733}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.1603}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.1527}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.0229}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\raggedright
Input
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
Operator
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
Example\ldots{}
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\ldots{} solves to
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
= & Equal sign indicates a mathematical operation & & & \\
+ & Addition & =2+2 & 4 & \\
- & Subtraction & =3--2 & 1 & \\
* & Multiplication & =3*2 & 6 & \\
/ & Division & =4/2 & 2 & \\
\^{} & Exponent & =4\^{}2 & 16 & \\
( ) & Brackets to indicate order of operations & =(4+2)*2 & 12 & \\
LOG(number,base) & Logarithm (10 is the default if no base is indicated) & =LOG(100,10) & 2 & \\
LN(number) & Natural Logarithm & =LN(2) & 0.693\ldots{} & \\
\end{longtable}
To copy the formula from one cell down the list of data points, you can use the copy and paste function in the edit menu or the keyboard shortcuts CONTROL+C to copy and CONTROL+V to paste in Windows or COMMAND+C and COMMAND+V on a Mac. You can also use a shortcut specific to Excel. Click on the cell with your formula and notice the little square in the lower righthand corner. Click on this square and drag the formula down the column, or, even better, double click on the square and Excel will automatically paste the formula down the column until the end of the series (note that it will stop at the first blank cell so if you are missing values you may need to copy and paste the formula the rest of the way). \textbf{When you add a new value, make sure you remember to label your column.}
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/OxCalc.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/OxCalc.gif}
If you are learning about mathematical operations in Excel for the first time we strongly suggest reading the next section on cell referencing.
\hypertarget{cell-referencing}{%
\section{Cell Referencing}\label{cell-referencing}}
One of Excel's strengths is its ability to easily perform mathematical operations in larger datasets. To take advantage of this feature it is important to appreciate how Excel references values in the spreadsheet. Simply referencing a cell in Excel will add a \textbf{relative reference}, which means that the reference will move as the equation is copied and pasted to other cells. This is the type of reference created in the O\textsubscript{X} calculation in the section above. If you are unsure of the cells being referenced in an equation Excel will visualize the reference when you click on the cell. To exit the function hit escape (ESC) on your keyboard.
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/RelRef.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/RelRef.gif}
A relative reference is very useful when calculating a series of numbers, however there are times when you might wish to make an \textbf{absolute reference}, such that it does not move as it is copied and pasted. To do this you add a \$ in front of the letter and/or the number in the cell reference. It is possible to add a \$ to only the letter (column reference) or the number (row reference) and only lock one aspect of the referencing or to lock both.
The gif below, which is a spreadsheet setup to calculate the energy of a photon in J/molecule from its wavelength in nm (if interested, download the example dataset \href{./datasets/Relative_Referencing_Exercise.csv}{here}), is an example that includes both absolute and relative references. The equation to calculate energy includes absolute references to the relevant constants and a relative reference to the wavelength of the photon such that it moves as the equation is copied down the column. As shown in the gif, if the absolute reference is not properly applied the calculated values don't make any sense. A more complicated example of cell referencing is discussed in the \protect\hyperlink{if-statements}{If Statements} section below.
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/EnergyCalc.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/EnergyCalc.gif}
\hypertarget{spreadsheet-organization}{%
\section{Spreadsheet Organization}\label{spreadsheet-organization}}
Before moving on to more complicated functions in Excel \textbf{it is useful to discuss ways of organizing a spreadsheet}, particularly when planning to use cell references. In the example above, the constants were added to the top of the sheet so they could be easily incorporated into the calculation. A limitation of this organization is that it's not possible to select entire columns using the letter column headers. \textbf{If you decide to move your constants around, using the CUT and PASTE functions will preserve the cell reference}. Cells can be CUT in Excel by selecting CUT from the dropdown menu that appears when you right click or using the keyboard shortcuts, CONTROL+X in Windows and COMMAND+X on a Mac. You can also add rows or columns to your sheet by selecting INSERT from the dropdown menu when you right click. The number of rows or columns added correspond to the number highlighted and the rows will be added above your selection and the columns to the left of your selection.
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/CutPaste.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/CutPaste.gif}
There is no one right way to organize a spreadsheet, but when using values that need to be referenced or values that are calculated it is useful to plan their position in the sheet and to be able to move them if your original position isn't optimal.
\hypertarget{simple-statistical-functions}{%
\section{Simple Statistical Functions}\label{simple-statistical-functions}}
When we have several measurements it is often useful to know how they compare to one another. To get a sense of that we can employ some simple statistical functions. One such function is a mean or average value, given the symbol \(\overline{x}\). Another typical function is a standard deviation, given the symbol \(\sigma\). We typically understand what a mean value represents, but what is a standard deviation? The mathematical formula for the standard deviation of a set of values is shown in equation (\ref{eq8}):
\[
\begin{align}
\sigma = \sqrt{ \frac{\sum (x_i-\overline{x})^{2}}{N-1}} \label{eq8}\tag{8}
\end{align}
\]
where \(x_i\) is the individual data points within the dataset; \(\overline{x}\) is the mean value of the dataset; and N is the number of values in the dataset. Looking at this formula it is clear that standard deviation is related to the difference between the individual values in the dataset and the mean value of the dataset. Standard deviation is a useful metric for appreciating the spread in the data. The greater the spread, the greater the standard deviation. More specifically, as shown in the figure below, one standard deviation away from the mean on either side will include 68\% of the data points and two standard deviations will include 95\% of the data.
\includegraphics{images/StandardDeviation.png}
Standard deviation is a useful metric. The magnitude of the spread in the data is often related to the overall magnitude of the values. For example, the standard deviation of 0.5 is large for a dataset with a mean value of 1, but small for a dataset with a mean value of 50. To provide this context the relative standard deviation (RSD) can be calculated by dividing the standard deviation by the mean value and then multiplying this ratio by 100 (equation (\ref{eq9}).
\[
\begin{align}
RSD = \frac{\sigma}{\overline{x}} \times 100 \label{eq9}\tag{9}
\end{align}
\]
The RSD of a dataset expresses the magnitude of the spread in the data using a percentage to normalize for differences in magnitude (e.g.~a dataset with a mean value of 1 and a standard deviation of 0.5 has an RSD of 50\%, and that with a mean value of 50 and a standard deviation of 0.5 has an RSD of 1\%).
The table below presents formula and descriptions to calculate the mean, median, standard deviation, as well as the minimum and maximum values for a set of numbers.
\begin{longtable}[]{@{}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.2523}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.2890}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.4312}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.0138}}
>{\raggedright\arraybackslash}p{(\columnwidth - 8\tabcolsep) * \real{0.0138}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\raggedright
Calculation
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
Formula
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
Description
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
Mean & =AVERAGE(number1, {[}number2{]}, \ldots) & Returns the arithmetic mean of the inputs. & & \\
Median & =MEDIAN(number1, {[}number2{]}, \ldots) & Returns the middle number in a group of supplied number. & & \\
Standard Deviation & =STDEV(number1, {[}number2{]}, \ldots) & Returns the standard deviation of the samples provided. & & \\
Minimum & =MIN(number1, {[}number2{]}, \ldots) & Returns the minimum value in a set of numbers. & & \\
Maximum & =MAX(number1, {[}number2{]}, \ldots) & Returns the maximum value in a set of numbers. & & \\
\end{longtable}
The figure below provides some context on how to use these formulas with an example calculating the mean, median, standard deviation, relative standard deviation, minimum and maximum values of a list of outdoor summer temperatures in Toronto (if interested, download the example dataset \href{./datasets/Temperature.csv}{here}). The formula for each calculation is provided as text next to the value in question (note the quotation marks, which indicate to Excel that the contents of the cell should be treated as text). Keep in mind that Excel will help you with these calculations, once you put an equal sign into the equation bar and start typing it will give you suggestions of possible functions, don't be afraid to use them. Once you select the function and the brackets are open you can also select one cell, or a series of cells, using the cursor as opposed to writing them in.
\includegraphics{images/SimpleStats.png}
\hypertarget{sort-function}{%
\section{Sort Function}\label{sort-function}}
One very useful aspect of Excel is its ability to sort tables and list according to defined characteristics, typically alphabetical order or in ascending or descending numerical order. To sort data in a table first highlight all of the relevant rows and columns, then select ``Sort'' from the ``Data'' menu or select Data \textgreater{} Sort from the top menu. A dialog box will pop up, the first thing to do is to tell Excel whether the highlighted cells include your column headers, in the example below (which uses data from CHM135 Lab 2) the table does include headers and so the ``My List has Headers'' box on the top right was clicked. If your dataset doesn't include headers then you will choose the data in a column as defined by its letter in the Excel sheet. Whether you have headers or not, the next step is to select the column you will use to sort your data, in this case ``Unknown Sample Letter'' and then select ``A to Z'' in the ``Order'' input box. The sort function then organizes the entire table such that the column selected is in alphabetical order.
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/sort.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/sort.gif}
\hypertarget{if-statements}{%
\section{If Statements}\label{if-statements}}
Now that you have used some simple statistical functions in Excel, let's explore some other functions and simple computer programming. The first function we will explore is the use of ``if'' statements. ``if'' is a conditional statement that signals to Excel that a decision will need to be made. As with the functions described in the \protect\hyperlink{simple-statistical-functions}{Simple Statistical Functions} section, once you start the formula by writing ``=i'' into the formula bar Excel will provide you with many possible functions. Choose ``IF''.
\includegraphics[width=0.4\textwidth,height=\textheight]{images/ifstatement1.png}
Once you choose the IF function, Excel will again help you by explaining what the function requires to operate, in this case a logical test and a result for when the test is true and for when it is false. As is typical in more sophisticated Excel programming, each of these requirements is separated by commas.
\includegraphics[width=0.5\textwidth,height=\textheight]{images/ifstatement2.png}
To see how the IF function can be applied, let's imagine a scenario where a small cafe has four employees and needs to sort out the hours they have worked (if interested, download the example dataset \href{./datasets/If_Statements.csv}{here}). In this case the timekeeping software exports a ledger that indicates the employee, day, time, and number of hours they worked.
\includegraphics{images/employees.png}
The cafe employer would like to sort this information so that they have a clear picture of the number of hours worked by each employee over the course of the week. To do this they create a column header that includes the employee's names (Ayaan, Charlotte, Melena and Zambala) and then write IF statements for each that only add the number of hours into the column if the employee's name is found in that row. As described above the IF always includes:
\begin{itemize}
\item
\textbf{Logical Test:} in this case, the logical test is asking whether the name in a given row matches the name at the top of the column (note the use of a relative reference for the name in the row and an absolute reference for the column header, details on referencing can be found in the \protect\hyperlink{cell-referencing}{Cell Referencing} section).
\item
\textbf{Value If True:} here the formula selects the number of hours for the row being investigated as this is what we would like to display if that employee was the one to work these hours.
\item
\textbf{Value If False:} here the formula indicates the conditional statement should report a empty cell (indicated by the empty brackets ``\,``) if the employee mentioned in this row does not match the name at the top of the column.
\end{itemize}
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/ifstatements.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/ifstatements.gif}
To copy this first IF statement to the remaining employees we could try to simply copy and paste the cell itself to the next column, but (as shown in the gif below, or if you try it yourself) this would move all the relative references and cause a mess. To address this we need to rethink the relative and absolute references used in the gif above. \textbf{Before you go through the changes made below, challenge yourself to write out the changes you think would work by hand or work through it using the dataset provided.}
Here are the changes that need to be implemented so that the function can be copied across both columns and rows:
\begin{itemize}
\item
\textbf{Logical Test:} The reference to the employee name in column A needs an absolute reference on the column, so it doesn't more as you copy the formula to the other columns, but a relative reference to the row so it moves as the formula is copy down the rows. Conversely the reference to the names in the column headers needs a relative reference on the column but an absolute reference on the row.
\item
\textbf{Value If True:} Similar to employee name in column A, hours worked in column E need an absolute reference on the column but a relative reference on the row.
\item
\textbf{Value If False:} No cell references here.
\end{itemize}
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/ifstatements2.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/ifstatements2.gif}
\hypertarget{sum-and-count-functions}{%
\section{SUM and COUNT Functions}\label{sum-and-count-functions}}
Excel includes many useful functions, far too many to outline here. A couple of very useful functions are the the SUM and COUNT functions. As the name suggests, the SUM function returns the sum of a range of numbers. The COUNT function returns the number of values in the outlined range. Both of these functions can be combined with the IF function using the SUMIF and COUNTIF functions where values are summed or counted only if certain criteria are met. The AVERAGE function introduced in the \protect\hyperlink{simple-statistical-functions}{Simple Statistical Functions} section can also be included with an IF statement using the AVERAGEIF function.
\begin{longtable}[]{@{}
>{\raggedright\arraybackslash}p{(\columnwidth - 4\tabcolsep) * \real{0.1189}}
>{\raggedright\arraybackslash}p{(\columnwidth - 4\tabcolsep) * \real{0.3217}}
>{\raggedright\arraybackslash}p{(\columnwidth - 4\tabcolsep) * \real{0.5594}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\raggedright
\textbf{Calculation}
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\textbf{Formula}
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\textbf{Description}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
Sum & =SUM(number1, {[}number2{]}, \ldots) & Returns the sum of the inputs \\
Sum If & =SUMIF(range, criteria, {[}sum\_range{]}) & Returns the sum of the inputs based on the outlined criteria \\
Count & =COUNT(value1, {[}value2{]}, \ldots) & Returns the number of values in the given range \\
Count If & =COUNTIF(range, criteria) & Returns the number of values in the given range based on the outlined criteria \\
Mean If & =AVERAGEIF(range, criteria, {[}average\_range{]}) & Returns the arithmetic mean of the inputs based on the outlined criteria \\
\end{longtable}
To explore these functions let's take another look at the book-keeping exercise from the \protect\hyperlink{if-statements}{If Statements} section. Imagine the hours for each employee need to be summed over the week. This can be accomplished using the SUM function. The number of shifts worked can also be determined using the COUNT function, and the average length of the shifts using the AVERAGE function.
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/SumCount.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/SumCount.gif}
The work flow outlined above is a two-step process that first uses an IF statement and then the SUM, COUNT and AVERAGE functions. This works perfectly well. The two functions can also be combined using the SUMIF, COUNTIF and AVERAGEIF functions. To accomplish this, select the range of values that will be used to judge the criteria, the criteria itself, and then the range to be summed, counted or averaged.
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/SumCountif.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/SumCountif.gif}
\hypertarget{slope-intercept-and-r2-functions}{%
\section{\texorpdfstring{SLOPE, INTERCEPT and R\textsuperscript{2} Functions}{SLOPE, INTERCEPT and R2 Functions}}\label{slope-intercept-and-r2-functions}}
The relationship between many variables in chemistry can be explained using a linear regression line with equation y = mx + b. Excel can add linear trendlines with equations and R\textsuperscript{2} values to scatterplots. Details on how to do, as well as a discussion of what a linear regression entails, are provided in the \protect\hyperlink{adding-a-linear-regression-line}{Adding a Linear Regression Line} subsection of the \protect\hyperlink{data-visualization}{Data visualization} section of this resource. \textbf{If the equation of the linear trendline is required for subsequent calculations it is useful to calculate the slope (m) and intercept (b) in the body of the spreadsheet} as opposed to simply reading them off of the graph. The functions used to do perform these calculations are provided in the table below.
\begin{longtable}[]{@{}
>{\raggedright\arraybackslash}p{(\columnwidth - 4\tabcolsep) * \real{0.1189}}
>{\raggedright\arraybackslash}p{(\columnwidth - 4\tabcolsep) * \real{0.3217}}
>{\raggedright\arraybackslash}p{(\columnwidth - 4\tabcolsep) * \real{0.5594}}@{}}
\toprule\noalign{}
\begin{minipage}[b]{\linewidth}\raggedright
\textbf{Calculation}
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\textbf{Formula}
\end{minipage} & \begin{minipage}[b]{\linewidth}\raggedright
\textbf{Description}
\end{minipage} \\
\midrule\noalign{}
\endhead
\bottomrule\noalign{}
\endlastfoot
Slope & =SLOPE(known\_ys, known\_xs) & Returns the slope of the linear trendline \\
Intercept & =INTERCEPT(known\_ys, known\_xs) & Returns the intercept of the linear trendline \\
R\textsuperscript{2} & =RSQ(known\_ys, known\_xs) & Returns the R\textsuperscript{2} of the linear regression, which is a measure of how well the line predicts the variation observed in the data \\
\end{longtable}
The GIF below provides an example executing these functions using the dataset introduced in the \protect\hyperlink{adding-a-linear-regression-line}{Adding a Linear Regression Line} subsection that compares the number of computer science PhDs graduating in the US with US arcade revenues each year. If we assume the number of US computer science PhDs graduating in a given year is a good predictor of US arcade revenues (which we certainly do not know to be the case, but just humor me) then we could use the equation of the trendline to predict US arcade revenues in a year when the US graduated 1300 computer science PhDs.
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{knitr}\SpecialCharTok{::}\FunctionTok{include\_graphics}\NormalTok{(}\AttributeTok{path=}\StringTok{"./gifs/LinearRegression.gif"}\NormalTok{)}
\end{Highlighting}
\end{Shaded}
\includegraphics{./gifs/LinearRegression.gif}
Although there is no clear utility to the data used in this example, it is clear that having the slope and intercept of a linear equation available in the body of the Excel sheet can facilitate subsequent calculations.
\hypertarget{data-visualization}{%
\chapter{Data Visualization}\label{data-visualization}}
Placeholder
\hypertarget{creating-a-scatterplot}{%
\section{Creating a Scatterplot}\label{creating-a-scatterplot}}
\hypertarget{changing-chart-type-or-design}{%
\section{Changing Chart Type or Design}\label{changing-chart-type-or-design}}
\hypertarget{adding-axis-labels-and-a-chart-title}{%
\section{Adding Axis Labels and a Chart Title}\label{adding-axis-labels-and-a-chart-title}}
\hypertarget{notes-on-chart-titles}{%
\subsection{Notes on Chart Titles}\label{notes-on-chart-titles}}
\hypertarget{formatting-axis-labels}{%
\section{Formatting Axis Labels}\label{formatting-axis-labels}}
\hypertarget{adding-a-linear-regression-line}{%
\section{Adding a Linear Regression Line}\label{adding-a-linear-regression-line}}
\hypertarget{adding-a-dataseries-to-an-existing-plot}{%
\section{Adding a Dataseries to an Existing Plot}\label{adding-a-dataseries-to-an-existing-plot}}
\hypertarget{annotating-a-plot}{%
\section{Annotating a Plot}\label{annotating-a-plot}}
\bibliography{book.bib,packages.bib}
\end{document}