markdown source builds

actions-user · actions-user · commit baf8fe20085d · 2025-01-07T18:06:14.000Z
Auto-generated via `{sandpaper}` Source : cd7dbe7 Branch : main Author : Naupaka Zimmerman <naupaka@gmail.com> Time : 2025-01-07 18:05:01 +0000 Message : Merge pull request #908 from caseyyoungflesh/04-fix_feline-data_v2.csv Add data in place of `feline-data_v2.csv`, closes #717
diff --git a/04-data-structures-part1.md b/04-data-structures-part1.md
@@ -237,30 +237,38 @@ No matter how
 complicated our analyses become, all data in R is interpreted as one of these
 basic data types. This strictness has some really important consequences.
 
-A user has added details of another cat. This information is in the file
-`data/feline-data_v2.csv`.
+A user has provided details of another cat. We can add an additional row to our cats table using `rbind()`.
 
 
 ``` r
-file.show("data/feline-data_v2.csv")
+additional_cat <- data.frame(coat = "tabby", weight = "2.3 or 2.4", likes_catnip = 1)
+additional_cat
 ```
 
+``` output
+   coat     weight likes_catnip
+1 tabby 2.3 or 2.4            1
+```
 
 ``` r
-coat,weight,likes_catnip
-calico,2.1,1
-black,5.0,0
-tabby,3.2,1
-tabby,2.3 or 2.4,1
+cats2 <- rbind(cats, additional_cat)
+cats2
+```
+
+``` output
+    coat     weight likes_catnip
+1 calico        2.1            1
+2  black          5            0
+3  tabby        3.2            1
+4  tabby 2.3 or 2.4            1
 ```
 
-Load the new cats data like before, and check what type of data we find in the
-`weight` column:
+Let's check what type of data we find in the
+`weight` column of our new `cats2` object:
 
 
 ``` r
-cats <- read.csv(file="data/feline-data_v2.csv")
-typeof(cats$weight)
+typeof(cats2$weight)
 ```
 
 ``` output
@@ -272,18 +280,18 @@ we did on them before, we run into trouble:
 
 
 ``` r
-cats$weight + 2
+cats2$weight + 2
 ```
 
 ``` error
-Error in cats$weight + 2: non-numeric argument to binary operator
+Error in cats2$weight + 2: non-numeric argument to binary operator
 ```
 
 What happened?
-The `cats` data we are working with is something called a *data frame*. Data frames
+The `cats` (and `cats2`) data we are working with is something called a *data frame*. Data frames
 are one of the most common and versatile types of *data structures* we will work with in R.
 A given column in a data frame cannot be composed of different data types.
-In this case, R does not read everything in the data frame column `weight` as a *double*, therefore the entire
+In this case, R cannot store everything in the data frame column `weight` as a *double* anymore once we add the row for the additional cat (because its weight is `2.3 or 2.4`), therefore the entire
 column data type changes to something that is suitable for everything in the column.
 
 When R reads a csv file, it reads it in as a *data frame*. Thus, when we loaded the `cats`
@@ -292,42 +300,22 @@ is written by the `str()` function:
 
 
 ``` r
-str(cats)
+str(cats2)
 ```
 
 ``` output
 'data.frame':	4 obs. of  3 variables:
  $ coat        : chr  "calico" "black" "tabby" "tabby"
  $ weight      : chr  "2.1" "5" "3.2" "2.3 or 2.4"
- $ likes_string: int  1 0 1 1
+ $ likes_catnip: num  1 0 1 1
 ```
 
 *Data frames* are composed of rows and columns, where each column has the
 same number of rows. Different columns in a data frame can be made up of different
 data types (this is what makes them so versatile), but everything in a given
 column needs to be the same type (e.g., vector, factor, or list).
 
-Let's explore more about different data structures and how they behave.
-For now, let's remove that extra line from our cats data and reload it,
-while we investigate this behavior further:
-
-feline-data.csv:
-
-```
-coat,weight,likes_catnip
-calico,2.1,1
-black,5.0,0
-tabby,3.2,1
-```
-
-And back in RStudio:
-
-
-``` r
-cats <- read.csv(file="data/feline-data.csv")
-```
-
-
+Let's explore more about different data structures and how they behave. For now, we will focus on our original data frame `cats` (and we can forget about `cats2` for the rest of this episode).
 
 ### Vectors and Type Coercion
 
@@ -555,8 +543,7 @@ Create a new script in RStudio and copy and paste the following code. Then
 move on to the tasks below, which help you to fill in the gaps (\_\_\_\_\_\_).
 
 ```
-# Read data
-cats <- read.csv("data/feline-data_v2.csv")
+Using the object `cats2`:
 
 # 1. Print the data
 _____
@@ -568,15 +555,15 @@ _____(cats)
 #    The correct data type is: ____________.
 
 # 4. Correct the 4th weight data point with the mean of the two given values
-cats$weight[4] <- 2.35
+cats2$weight[4] <- 2.35
 #    print the data again to see the effect
 cats
 
 # 5. Convert the weight to the right data type
-cats$weight <- ______________(cats$weight)
+cats2$weight <- ______________(cats2$weight)
 
 #    Calculate the mean to test yourself
-mean(cats$weight)
+mean(cats2$weight)
 
 # If you see the correct mean value (and not NA), you did the exercise
 # correctly!
@@ -586,7 +573,7 @@ mean(cats$weight)
 
 #### 1\. Print the data
 
-Execute the first statement (`read.csv(...)`). Then print the data to the
+Print the data to the
 console
 
 :::::::::::::::  solution
@@ -601,8 +588,8 @@ Show the content of any variable by typing its name.
 Two correct solutions:
 
 ```
-cats
-print(cats)
+cats2
+print(cats2)
 ```
 
 :::::::::::::::::::::::::
@@ -611,7 +598,7 @@ print(cats)
 
 The data type of your data is as important as the data itself. Use a
 function we saw earlier to print out the data types of all columns of the
-`cats` table.
+`cats2` `data.frame`.
 
 :::::::::::::::  solution
 
@@ -628,15 +615,14 @@ here.
 > ### Solution to Challenge 1.2
 > 
 > ```
-> str(cats)
+> str(cats2)
 > ```
 
 #### 3\. Which data type do we need?
 
 The shown data type is not the right one for this data (weight of
 a cat). Which data type do we need?
 
-- Why did the `read.csv()` function not choose the correct data type?
 - Fill in the gap in the comment with the correct data type for cat weight!
 
 :::::::::::::::  solution
@@ -715,8 +701,8 @@ auto-complete function: Type "`as.`" and then press the TAB key.
 > There are two functions that are synonymous for historic reasons:
 > 
 > ```
-> cats$weight <- as.double(cats$weight)
-> cats$weight <- as.numeric(cats$weight)
+> cats2$weight <- as.double(cats2$weight)
+> cats2$weight <- as.numeric(cats2$weight)
 > ```
 
 ::::::::::::::::::::::::::::::::::::::::::::::::::
diff --git a/data/feline-data.csv b/data/feline-data.csv
@@ -1,4 +1,4 @@
-"coat","weight","likes_string"
+"coat","weight","likes_catnip"
 "calico",2.1,1
 "black",5,0
 "tabby",3.2,1
diff --git a/md5sum.txt b/md5sum.txt
@@ -6,7 +6,7 @@
 "episodes/01-rstudio-intro.Rmd" "04f6b758558750cef962768d78dd63b0" "site/built/01-rstudio-intro.md" "2024-12-03"
 "episodes/02-project-intro.Rmd" "cd60cc3116d4f6be92f03f5cc51bcc3b" "site/built/02-project-intro.md" "2024-12-03"
 "episodes/03-seeking-help.Rmd" "d24c310b8f36930e70379458f3c93461" "site/built/03-seeking-help.md" "2024-12-03"
-"episodes/04-data-structures-part1.Rmd" "afc6c3ced3677ab088457152f8d84b54" "site/built/04-data-structures-part1.md" "2024-12-03"
+"episodes/04-data-structures-part1.Rmd" "5e680e381a7d16228ee1ee2c9ec8a151" "site/built/04-data-structures-part1.md" "2025-01-07"
 "episodes/05-data-structures-part2.Rmd" "95c5dd30b8288090ce89ecbf2d3072bd" "site/built/05-data-structures-part2.md" "2024-12-03"
 "episodes/06-data-subsetting.Rmd" "5d4ce8731ab37ddea81874d63ae1ce86" "site/built/06-data-subsetting.md" "2024-12-03"
 "episodes/07-control-flow.Rmd" "6a8691c8668737e4202f49b52aeb8ac6" "site/built/07-control-flow.md" "2024-12-03"