Spencer Nystrom
- Understand basic ggplot syntax
- Know how ggplot views factors and how to relevel them for plotting
Install the gapminder dataset
install.packages("gapminder")
Next we will load the sample data and subset it. We will cover how to do
this in detail next week, but for now run the following code to generate
a data.frame tracking life expectancy, population, and GDP over time in
the United States.
library(gapminder)
# We'll cover dplyr next week!
usa <- dplyr::filter(gapminder, country == "United States")
Don’t forget to load the ggplot2 library.
library(ggplot2)
## Registered S3 methods overwritten by 'ggplot2':
##   method         from
##   [.quosures     rlang
##   c.quosures     rlang
##   print.quosures rlang
- Recall last class’s lesson on ggplot. Try to recreate the following scatterplot of life expectancy over time:
ggplot adds things in layers. The order of addition matters! Let’s add a line to our scatterplot. Observe that the line is plotted over the points.
I’ve increased the line thickness with the size argument so it renders
well on a projector.
ggplot(data = usa, mapping = aes(x = year, y = lifeExp)) +
  geom_point(shape = 18, size = 5) +
  geom_line(color = "red", size = 1)
- Set the point size to scale with the population (pop column).
- What happens if you set the color to scale with the population?
- What happens if you set the color to distinguish the continent?
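As a rough sketch of the two ideas above (layer order, and mapping a column to an aesthetic), the snippet below draws the line first and the points second, and maps point size to the pop column. The base-R subsetting and the linewidth argument are assumptions of this sketch rather than part of the lesson code: linewidth needs ggplot2 3.4.0 or later, and older versions use size for lines instead.

```r
library(gapminder)
library(ggplot2)

# Same subset as the lesson's `usa`, written in base R so dplyr is not needed here
usa <- gapminder[gapminder$country == "United States", ]

# Line first, points second: the points are now drawn on top of the line.
# size = pop sits inside aes(), so point size is mapped to the data.
# linewidth = 1 sits outside aes(), so it is a fixed setting
# (assumes ggplot2 >= 3.4.0; use size = 1 on older versions).
p <- ggplot(data = usa, mapping = aes(x = year, y = lifeExp)) +
  geom_line(color = "red", linewidth = 1) +
  geom_point(aes(size = pop), shape = 18)
p
```

Swapping the two geom_ lines back restores the original stacking, with the line on top of the points.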

Recall the following example:
ggplot(iris, aes(Sepal.Width, Petal.Width)) +
  geom_point(aes(color = Species))
ggplot interprets categorical variables as factors in order to assign
their order in the legend, and the color on the plot.
By default the levels are in alphabetical order.
Note how in the below example the ‘levels’ of myFactors are arranged
alphabetically.
myFactors <- factor(c("One", "Two", "Two", "Three"))
myFactors
## [1] One Two Two Three
## Levels: One Three Two
In the iris dataset, Species is already a factor:
iris$Species
## [1] setosa setosa setosa setosa setosa setosa
## [7] setosa setosa setosa setosa setosa setosa
## [13] setosa setosa setosa setosa setosa setosa
## [19] setosa setosa setosa setosa setosa setosa
## [25] setosa setosa setosa setosa setosa setosa
## [31] setosa setosa setosa setosa setosa setosa
## [37] setosa setosa setosa setosa setosa setosa
## [43] setosa setosa setosa setosa setosa setosa
## [49] setosa setosa versicolor versicolor versicolor versicolor
## [55] versicolor versicolor versicolor versicolor versicolor versicolor
## [61] versicolor versicolor versicolor versicolor versicolor versicolor
## [67] versicolor versicolor versicolor versicolor versicolor versicolor
## [73] versicolor versicolor versicolor versicolor versicolor versicolor
## [79] versicolor versicolor versicolor versicolor versicolor versicolor
## [85] versicolor versicolor versicolor versicolor versicolor versicolor
## [91] versicolor versicolor versicolor versicolor versicolor versicolor
## [97] versicolor versicolor versicolor versicolor virginica virginica
## [103] virginica virginica virginica virginica virginica virginica
## [109] virginica virginica virginica virginica virginica virginica
## [115] virginica virginica virginica virginica virginica virginica
## [121] virginica virginica virginica virginica virginica virginica
## [127] virginica virginica virginica virginica virginica virginica
## [133] virginica virginica virginica virginica virginica virginica
## [139] virginica virginica virginica virginica virginica virginica
## [145] virginica virginica virginica virginica virginica virginica
## Levels: setosa versicolor virginica
The level order can be set explicitly with the levels argument of factor():
myFactors2 <- factor(c("One", "Two", "Two", "Three"), levels = c("One", "Two", "Three"))
myFactors2
## [1] One Two Two Three
## Levels: One Two Three
Use the forcats library! It’s built for dealing with factors!
library(forcats)
myFactors
## [1] One Two Two Three
## Levels: One Three Two
All forcats functions start with fct_.
fct_relevel lets you reorder factors
fct_relevel(myFactors, c("One", "Two", "Three"))
## [1] One Two Two Three
## Levels: One Two Three
fct_rev reverses the order of levels
myFactors2
## [1] One Two Two Three
## Levels: One Two Three
fct_rev(myFactors2)
## [1] One Two Two Three
## Levels: Three Two One
ggplot(iris, aes(Sepal.Width, Petal.Width)) +
  geom_point(aes(color = Species))
The order of the factors in the legend (and thus the default color assignment) can be changed by reordering the factor levels!
ggplot(iris, aes(Sepal.Width, Petal.Width)) +
  geom_point(aes(color = fct_rev(Species)))
For these examples I’m going to use the whole gapminder dataset. To save
typing I will save a scatterplot of year vs lifeExp to myPlot.
myPlot <- ggplot(data = gapminder, mapping = aes(x = year, y = lifeExp)) +
  geom_point()
myPlot
Titles can be added with ggtitle()
myPlot +
  ggtitle("Year vs Life Expectancy")
Axis titles can be customized with xlab() and ylab()
myPlot +
  ggtitle("Year vs Life Expectancy") +
  xlab("Year") +
  ylab("Life Expectancy")
Global aesthetics can be added to the whole plot
myPlot +
  aes(color = continent)
To check your understanding of ggplot and the topics covered in this
lesson, replicate the plot below of data from the Gapminder dataset.
Make use of the resources in this document, the ggplot2
documentation, R for Data Science, and Google.
Hints:
- this plot uses the following subset of the Gapminder data
- this plot uses facets to show the different years
- a linear trendline
- log10 x-axis
- altered transparency of the points
- renamed legend titles
- axis text is resized to 12
- axis title text is resized to 14
- facet titles are resized to 14
finalExampleData <- dplyr::filter(gapminder, year %in% c(1952, 2007))
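The hints above map onto individual ggplot2 building blocks, and one possible way to combine them is sketched below. The x, y, color, and size mappings and the legend titles are assumptions, since the target plot itself is not reproduced here, so treat this as a starting point to compare against rather than the intended answer.

```r
library(gapminder)
library(ggplot2)

# Base-R equivalent of the dplyr::filter() call above
finalExampleData <- gapminder[gapminder$year %in% c(1952, 2007), ]

# ASSUMPTION: the aesthetic mappings and legend titles are guesses;
# the plot you are asked to replicate may use different ones.
finalPlot <- ggplot(finalExampleData, aes(x = gdpPercap, y = lifeExp)) +
  geom_point(aes(color = continent, size = pop), alpha = 0.5) +  # altered transparency
  geom_smooth(method = "lm", color = "black") +                  # linear trendline
  scale_x_log10() +                                              # log10 x-axis
  facet_wrap(~ year) +                                           # one panel per year
  labs(color = "Continent", size = "Population") +               # renamed legend titles
  theme(
    axis.text  = element_text(size = 12),  # axis text
    axis.title = element_text(size = 14),  # axis titles
    strip.text = element_text(size = 14)   # facet titles
  )
finalPlot
```

Each hint corresponds to one added layer, scale, facet, or theme call, so the plot can be built up and checked one line at a time.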











