The Datasaurus Dozen
Thomas Lin Pedersen edited this page Sep 4, 2018 · 4 revisions
submitted by Tom Westlake
The Datasaurus Dozen is a playful twist on Anscombe's Quartet: a group of twelve datasets with nearly identical summary statistics that nonetheless look distinctly different when plotted.
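To make the "nearly identical summary statistics" claim concrete, the sketch below computes the mean, standard deviation and correlation of x and y for each dataset in `datasaurus_dozen`. It uses only base R alongside datasauRus; the name `summary_stats` is just an illustrative choice, and this is one possible way to tabulate the numbers rather than anything prescribed by the package.

```r
library(datasauRus)

# Split the data by dataset and compute the same statistics for each group
summary_stats <- do.call(rbind, lapply(
  split(datasaurus_dozen, datasaurus_dozen$dataset),
  function(d) {
    data.frame(
      dataset = d$dataset[1],
      mean_x  = mean(d$x),
      mean_y  = mean(d$y),
      sd_x    = sd(d$x),
      sd_y    = sd(d$y),
      cor_xy  = cor(d$x, d$y)
    )
  }
))

# Round for display; the columns should agree closely across datasets
print(round(summary_stats[, -1], 2))
```

Each row of the printed table corresponds to one dataset, so scanning down a column shows how little the statistic varies even though the scatterplots differ.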
The animation below, built with the datasauRus, ggplot2 and gganimate packages, highlights the danger of relying solely on summary statistics without looking at the whole distribution.
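library(datasauRus)
library(ggplot2)
library(gganimate)

# Each dataset is one animation state; points tween between states
ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point() +
  theme_minimal() +
  transition_states(dataset, transition_length = 3, state_length = 1) +
  ease_aes('cubic-in-out')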
Install gganimate using devtools::install_github('thomasp85/gganimate')
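If you want to keep the result as a file, one option is to store the plot in a variable, render it explicitly with `animate()`, and write it out with `anim_save()`. The sketch below assumes the gifski package is installed for GIF rendering; the variable names, the frame settings and the output file name `datasaurus.gif` are illustrative choices, not part of the original example.

```r
library(datasauRus)
library(ggplot2)
library(gganimate)

# Build the animation specification without rendering it yet
anim <- ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point() +
  theme_minimal() +
  transition_states(dataset, transition_length = 3, state_length = 1) +
  ease_aes('cubic-in-out')

# Render with an explicit frame count and frame rate (assumed values)
rendered <- animate(anim, nframes = 60, fps = 10, renderer = gifski_renderer())

# Save the rendered animation to a GIF (hypothetical file name)
anim_save("datasaurus.gif", animation = rendered)
```

Raising `nframes` gives smoother transitions at the cost of a longer render.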