Commit baf8fe2

markdown source builds

Auto-generated via `{sandpaper}`
Source: cd7dbe7
Branch: main
Author: Naupaka Zimmerman <[email protected]>
Time: 2025-01-07 18:05:01 +0000
Message: Merge pull request #908 from caseyyoungflesh/04-fix_feline-data_v2.csv

Add data in place of `feline-data_v2.csv`, closes #717

1 parent 812c5aa commit baf8fe2

3 files changed

+40
-54
lines changed

04-data-structures-part1.md

Lines changed: 38 additions & 52 deletions
@@ -237,30 +237,38 @@ No matter how
 complicated our analyses become, all data in R is interpreted as one of these
 basic data types. This strictness has some really important consequences.
 
-A user has added details of another cat. This information is in the file
-`data/feline-data_v2.csv`.
+A user has provided details of another cat. We can add an additional row to our cats table using `rbind()`.
 
 
 ``` r
-file.show("data/feline-data_v2.csv")
+additional_cat <- data.frame(coat = "tabby", weight = "2.3 or 2.4", likes_catnip = 1)
+additional_cat
 ```
 
+``` output
+   coat     weight likes_catnip
+1 tabby 2.3 or 2.4            1
+```
 
 ``` r
-coat,weight,likes_catnip
-calico,2.1,1
-black,5.0,0
-tabby,3.2,1
-tabby,2.3 or 2.4,1
+cats2 <- rbind(cats, additional_cat)
+cats2
+```
+
+``` output
+    coat     weight likes_catnip
+1 calico        2.1            1
+2  black          5            0
+3  tabby        3.2            1
+4  tabby 2.3 or 2.4            1
 ```
 
-Load the new cats data like before, and check what type of data we find in the
-`weight` column:
+Let's check what type of data we find in the
+`weight` column of our new `cats2` object:
 
 
 ``` r
-cats <- read.csv(file="data/feline-data_v2.csv")
-typeof(cats$weight)
+typeof(cats2$weight)
 ```
 
 ``` output
@@ -272,18 +280,18 @@ we did on them before, we run into trouble:
 
 
 ``` r
-cats$weight + 2
+cats2$weight + 2
 ```
 
 ``` error
-Error in cats$weight + 2: non-numeric argument to binary operator
+Error in cats2$weight + 2: non-numeric argument to binary operator
 ```
 
 What happened?
-The `cats` data we are working with is something called a *data frame*. Data frames
+The `cats` (and `cats2`) data we are working with is something called a *data frame*. Data frames
 are one of the most common and versatile types of *data structures* we will work with in R.
 A given column in a data frame cannot be composed of different data types.
-In this case, R does not read everything in the data frame column `weight` as a *double*, therefore the entire
+In this case, R cannot store everything in the data frame column `weight` as a *double* anymore once we add the row for the additional cat (because its weight is `2.3 or 2.4`), therefore the entire
 column data type changes to something that is suitable for everything in the column.
 
 When R reads a csv file, it reads it in as a *data frame*. Thus, when we loaded the `cats`
@@ -292,42 +300,22 @@ is written by the `str()` function:
 
 
 ``` r
-str(cats)
+str(cats2)
 ```
 
 ``` output
 'data.frame':	4 obs. of  3 variables:
  $ coat        : chr  "calico" "black" "tabby" "tabby"
  $ weight      : chr  "2.1" "5" "3.2" "2.3 or 2.4"
- $ likes_string: int  1 0 1 1
+ $ likes_catnip: num  1 0 1 1
 ```
 
 *Data frames* are composed of rows and columns, where each column has the
 same number of rows. Different columns in a data frame can be made up of different
 data types (this is what makes them so versatile), but everything in a given
 column needs to be the same type (e.g., vector, factor, or list).
 
-Let's explore more about different data structures and how they behave.
-For now, let's remove that extra line from our cats data and reload it,
-while we investigate this behavior further:
-
-feline-data.csv:
-
-```
-coat,weight,likes_catnip
-calico,2.1,1
-black,5.0,0
-tabby,3.2,1
-```
-
-And back in RStudio:
-
-
-``` r
-cats <- read.csv(file="data/feline-data.csv")
-```
-
-
+Let's explore more about different data structures and how they behave. For now, we will focus on our original data frame `cats` (and we can forget about `cats2` for the rest of this episode).
 
 ### Vectors and Type Coercion
 
@@ -555,8 +543,7 @@ Create a new script in RStudio and copy and paste the following code. Then
 move on to the tasks below, which help you to fill in the gaps (\_\_\_\_\_\_).
 
 ```
-# Read data
-cats <- read.csv("data/feline-data_v2.csv")
+Using the object `cats2`:
 
 # 1. Print the data
 _____
@@ -568,15 +555,15 @@ _____(cats)
 # The correct data type is: ____________.
 
 # 4. Correct the 4th weight data point with the mean of the two given values
-cats$weight[4] <- 2.35
+cats2$weight[4] <- 2.35
 # print the data again to see the effect
 cats
 
 # 5. Convert the weight to the right data type
-cats$weight <- ______________(cats$weight)
+cats2$weight <- ______________(cats2$weight)
 
 # Calculate the mean to test yourself
-mean(cats$weight)
+mean(cats2$weight)
 
 # If you see the correct mean value (and not NA), you did the exercise
 # correctly!
@@ -586,7 +573,7 @@ mean(cats$weight)
 
 #### 1\. Print the data
 
-Execute the first statement (`read.csv(...)`). Then print the data to the
+Print the data to the
 console
 
 ::::::::::::::: solution
@@ -601,8 +588,8 @@ Show the content of any variable by typing its name.
 Two correct solutions:
 
 ```
-cats
-print(cats)
+cats2
+print(cats2)
 ```
 
 :::::::::::::::::::::::::
@@ -611,7 +598,7 @@ print(cats)
 
 The data type of your data is as important as the data itself. Use a
 function we saw earlier to print out the data types of all columns of the
-`cats` table.
+`cats2` `data.frame`.
 
 ::::::::::::::: solution
 
@@ -628,15 +615,14 @@ here.
 > ### Solution to Challenge 1.2
 >
 > ```
-> str(cats)
+> str(cats2)
 > ```
 
 #### 3\. Which data type do we need?
 
 The shown data type is not the right one for this data (weight of
 a cat). Which data type do we need?
 
-- Why did the `read.csv()` function not choose the correct data type?
 - Fill in the gap in the comment with the correct data type for cat weight!
 
 ::::::::::::::: solution
@@ -715,8 +701,8 @@ auto-complete function: Type "`as.`" and then press the TAB key.
 > There are two functions that are synonymous for historic reasons:
 >
 > ```
-> cats$weight <- as.double(cats$weight)
-> cats$weight <- as.numeric(cats$weight)
+> cats2$weight <- as.double(cats2$weight)
+> cats2$weight <- as.numeric(cats2$weight)
 > ```
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
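
The behavior this change relies on can be checked in a fresh R session. The following is only a minimal sketch, not part of the commit: it assumes the three-row `cats` data frame matches the contents of `data/feline-data.csv` and builds it inline instead of reading the file. It shows `rbind()` coercing the `weight` column to character, then the correction and conversion steps from the exercise.

```r
# Assumption: `cats` mirrors data/feline-data.csv; it is built inline here
# so the sketch runs without the lesson's data directory.
cats <- data.frame(
  coat = c("calico", "black", "tabby"),
  weight = c(2.1, 5.0, 3.2),
  likes_catnip = c(1L, 0L, 1L),
  stringsAsFactors = FALSE
)

# The additional cat from the lesson, with a non-numeric weight.
additional_cat <- data.frame(
  coat = "tabby",
  weight = "2.3 or 2.4",
  likes_catnip = 1,
  stringsAsFactors = FALSE
)

cats2 <- rbind(cats, additional_cat)

typeof(cats$weight)   # "double"
typeof(cats2$weight)  # "character": the whole column was coerced

# Exercise steps 4 and 5: fix the fourth value, then convert the column back.
cats2$weight[4] <- 2.35
cats2$weight <- as.numeric(cats2$weight)
mean(cats2$weight)    # 3.1625
```

The original `cats` object is left untouched by `rbind()`, which is why the lesson can return to it afterwards without reloading any file.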

data/feline-data.csv

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-"coat","weight","likes_string"
+"coat","weight","likes_catnip"
 "calico",2.1,1
 "black",5,0
 "tabby",3.2,1
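
To see how R would type the columns of the corrected file, the CSV text can be parsed directly. This is a hedged sketch: it passes the file contents shown above through the `text` argument of `read.csv()` instead of reading `data/feline-data.csv` from disk, and the variable name `csv_text` is introduced only for illustration.

```r
# Assumption: this string reproduces data/feline-data.csv after the header fix.
csv_text <- '"coat","weight","likes_catnip"
"calico",2.1,1
"black",5,0
"tabby",3.2,1'

cats <- read.csv(text = csv_text, stringsAsFactors = FALSE)

names(cats)           # "coat" "weight" "likes_catnip"
sapply(cats, typeof)  # coat: "character", weight: "double", likes_catnip: "integer"

# Combining integer values with a double yields a double vector.
typeof(c(cats$likes_catnip, 1))  # "double"
```

The last line is consistent with the `str(cats2)` output in the lesson diff, where `likes_catnip` appears as `num` after a row created with `likes_catnip = 1` is appended.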

md5sum.txt

Lines changed: 1 addition & 1 deletion
@@ -6,7 +6,7 @@
 "episodes/01-rstudio-intro.Rmd" "04f6b758558750cef962768d78dd63b0" "site/built/01-rstudio-intro.md" "2024-12-03"
 "episodes/02-project-intro.Rmd" "cd60cc3116d4f6be92f03f5cc51bcc3b" "site/built/02-project-intro.md" "2024-12-03"
 "episodes/03-seeking-help.Rmd" "d24c310b8f36930e70379458f3c93461" "site/built/03-seeking-help.md" "2024-12-03"
-"episodes/04-data-structures-part1.Rmd" "afc6c3ced3677ab088457152f8d84b54" "site/built/04-data-structures-part1.md" "2024-12-03"
+"episodes/04-data-structures-part1.Rmd" "5e680e381a7d16228ee1ee2c9ec8a151" "site/built/04-data-structures-part1.md" "2025-01-07"
 "episodes/05-data-structures-part2.Rmd" "95c5dd30b8288090ce89ecbf2d3072bd" "site/built/05-data-structures-part2.md" "2024-12-03"
 "episodes/06-data-subsetting.Rmd" "5d4ce8731ab37ddea81874d63ae1ce86" "site/built/06-data-subsetting.md" "2024-12-03"
 "episodes/07-control-flow.Rmd" "6a8691c8668737e4202f49b52aeb8ac6" "site/built/07-control-flow.md" "2024-12-03"