README.md
+38-3
@@ -34,6 +34,7 @@ It requires R, RStudio, and the rmarkdown package.
34
34
11.[Friends Don't Let Friends Make Concentric Donuts](https://github.com/cxli233/FriendsDontLetFriends#11-friends-dont-let-friends-make-concentric-donuts)
35
35
12.[Friends Don't Let Friends Use Red/green and Rainbow for Color Scales](https://github.com/cxli233/FriendsDontLetFriends#12-friends-dont-let-friends-use-redgreen-and-rainbow-color-scales)
36
36
13.[Friends Don't Let Friends Forget to Reorder Stacked Bar Plot](https://github.com/cxli233/FriendsDontLetFriends/tree/main#13-friends-dont-let-friends-forget-to-reorder-stacked-bar-plot)
37
+
14.[Friends Don't Let Friends Mix Stacked Bars and Mean separation](https://github.com/cxli233/FriendsDontLetFriends/tree/main#14-friends-dont-let-friends-mix-stacked-bars-and-mean-separation)
37
38
38
39
# 1. Friends Don't Let Friends Make Bar Plots for Means Separation
39
40
@@ -48,7 +49,7 @@ In this example, two groups have similar means and standard deviations, but quit
48
49
Just don't use bar plots for means separation, or at least check a couple of things before settling on a bar plot.
49
50
50
51
It's worth mentioning that I was inspired by many researchers who have tweeted on the limitation of bar graphs.
51
-
Here is a pulication: [Weissgerber et al., 2015, PLOS Biology](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128).
52
+
Here is a publication: [Weissgerber et al., 2015, PLOS Biology](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128).
52
53
53
54
# 2. Friends Don't Let Friends Make Violin Plots for Small Sample Sizes
54
55
@@ -71,7 +72,7 @@ I can understand why this error is common, because it appears that many of us ha
71
72
Color scales are pretty, but we have to be extra careful.
72
73
When color scales (or color gradients) are used to represent numerical data, the darkest and lightest colors should have special meanings.
73
74
You can decide what those special meanings are: e.g., max, min, mean, zero. But they should represent something meaningful.
74
-
A data visualization sin for heat maps/color gradients is when the lightest or darkers colors are some arbitrary numbers.
75
+
A data visualization sin for heat maps/color gradients is when the lightest or darkest colors are some arbitrary numbers.
75
76
*This is as bad as the longest bar in a bar chart not being the largest value.* Can you imagine that?
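As a rough sketch of that idea in base R (the `values` vector and the blue-white-red palette are assumptions made up for illustration, not taken from this repository), symmetric limits around zero give the middle color and the two end colors a definite meaning:

```r
# Hypothetical values (e.g. z-scores), invented for illustration only
values <- c(-1.2, -0.4, 0, 0.8, 2.5)

# Symmetric limits: the middle color means zero,
# and the two end colors mean -limit and +limit
limit <- max(abs(values))
breaks <- seq(-limit, limit, length.out = 12)  # 12 breaks give 11 color bins

# Assumed diverging palette; any palette with a clear midpoint would do
palette <- colorRampPalette(c("blue", "white", "red"))(11)

# Assign each value to a bin, then look up its color
bin <- cut(values, breaks = breaks, include.lowest = TRUE, labels = FALSE)
colors <- palette[bin]
```

With this setup, zero falls in the middle bin and the largest value falls in the last bin, so neither extreme of the scale is an arbitrary number.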
76
77
77
78
# 4. Friends Don't Let Friends Make Bar Plot Meadow
@@ -143,7 +144,7 @@ This is because the concentration of compound 1 has a much narrower range than t
143
144
# 8. Friends Don't Let Friends Make Network Graphs without Trying Different Layouts
144
145
145
146
Network graphs are common in scientific publications. They are super useful in presenting relationship data.
146
-
However, the apparence (not the topology) of the network can make a huge difference in determing if a network graph is effective.
147
+
However, the appearance (not the topology) of the network can make a huge difference in determining if a network graph is effective.
147
148
148
149

149
150
@@ -243,6 +244,40 @@ Due to the number of samples and classes, it is very hard to discern anything fr
243
244
After reordering the bars, __wow__, that really made a difference, don't you think?
244
245
For a tutorial on how to optimize a stacked bar plot, see [this script](https://github.com/cxli233/FriendsDontLetFriends/blob/main/Scripts/stacked_bars_optimization.Rmd).
245
246
247
+
# 14. Friends Don't Let Friends Mix Stacked Bars and Mean separation
248
+
Sometimes a visualization gets confusing and ineffective when it tries to do too many things at once.
249
+
One such example is mixing stacked bar plots and mean separation plots.
250
+
One displays proportional data adding up to 100%, the other displays the difference in means and dispersion around means.
251
+
These are very distinct tasks in data visualization.
252
+
253
+
In this hypothetical experiment, we had blueberry plants assigned to two groups.
254
+
One group was the control; the other was treated with a chemical to make fruit development faster.
255
+
Each group had 5 plants. The response to the treatment was divided into 3 categories:
256
+
light green fruits, light blue fruits, and dark blue fruits.
257
+
100 fruits from each plant were examined and the number of fruits in each category was counted.
258
+
The percentage of fruits in each category was calculated and reported.
259
+
The question of the study is: did the chemical treatment work?
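The design described above can be mocked up in base R. This is only a sketch: the category probabilities in `probs`, the seed, and the helper `simulate_group()` are hypothetical choices for illustration, not the data or code behind the figures in this README.

```r
# Simulated data, invented for illustration only (not the author's data)
set.seed(14)
categories <- c("light green", "light blue", "dark blue")

# Assumed probability of each category in each group
probs <- list(
  control   = c(0.5, 0.3, 0.2),
  treatment = c(0.2, 0.3, 0.5)
)

# Hypothetical helper: draw counts for every plant in one group
simulate_group <- function(group, n_plants = 5, n_fruits = 100) {
  # rmultinom() returns a categories x plants matrix;
  # each column (plant) sums to n_fruits
  counts <- rmultinom(n_plants, size = n_fruits, prob = probs[[group]])
  data.frame(
    group    = group,
    plant    = rep(seq_len(n_plants), each = length(categories)),
    category = factor(rep(categories, times = n_plants), levels = categories),
    count    = as.vector(counts)
  )
}

fruits <- do.call(rbind, lapply(names(probs), simulate_group))

# Percentage of fruits in each category, computed within each plant
fruits$percent <- 100 * fruits$count /
  ave(fruits$count, fruits$group, fruits$plant, FUN = sum)

# Mean percentage per group and category
mean_percent <- aggregate(percent ~ group + category, data = fruits, FUN = mean)
```

Each plant contributes one percentage per category, so every group ends up with 5 observations per category. Those per-plant values are what a mean separation plot would show.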
260
+
261
+

262
+
263
+
The first stacked bar plot is fine as the standard way to visualize proportion data.
264
+
It is clear that all categories add up to 100%,
265
+
and the chemical treatment strongly shifted the color profile towards the most developed stage (dark blue).
266
+
267
+
The middle stacked bar plot is problematic,
268
+
mainly because it is trying to do two distinct data visualization tasks at once.
269
+
When error bars and dots are overlaid onto the stacked bars,
270
+
it becomes unclear which error bars and dots are being compared.
271
+
Due to the nature of stacked bars, the error bars and dots of the upper stacks have to be shifted upwards,
272
+
and thus interpreting the y-axis for error bars and dots is no longer straightforward.
273
+
274
+
Finally, if the main point of the visualization is mean separation and dispersion around the mean,
275
+
the third graph is the better choice.
276
+
There is no ambiguity about which comparisons are being made.
277
+
As shown in the first stacked bar plot,
278
+
the chemical treatment strongly increases the proportion of dark blue fruits,
279
+
at the expense of lighter color fruits.
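One way to keep the two tasks apart is to draw them as separate plots. The base R sketch below assumes made-up numbers (`mean_percent`, `dark_blue`) and made-up fill colors; it is not the script used for the figures in this README.

```r
# Hypothetical mean percentages, invented for illustration only
mean_percent <- matrix(
  c(50, 30, 20,   # control
    20, 30, 50),  # treatment
  nrow = 3,
  dimnames = list(
    category = c("light green", "light blue", "dark blue"),
    group    = c("control", "treatment")
  )
)
fill <- c("darkseagreen2", "lightblue", "navy")

# Task 1: proportions. Each stacked bar must add up to 100%.
stopifnot(all(colSums(mean_percent) == 100))
barplot(mean_percent, col = fill, ylab = "% of fruits", legend.text = TRUE)

# Task 2: mean separation for one category, one dot per plant
# (hypothetical per-plant percentages of dark blue fruits)
dark_blue <- list(
  control   = c(18, 22, 20, 25, 15),
  treatment = c(48, 52, 50, 55, 45)
)
stripchart(dark_blue, vertical = TRUE, method = "jitter", pch = 16,
           ylab = "% dark blue fruits")

# Mark the group means with short horizontal lines
means <- sapply(dark_blue, mean)
segments(x0 = 1:2 - 0.2, y0 = means, x1 = 1:2 + 0.2, y1 = means, lwd = 2)
```

In the second plot every dot is read directly against the y-axis, because nothing is shifted upwards by a stack underneath it.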
280
+
246
281
# Conclusion (?)
247
282
248
283
That's it for now. I will update this when I have the time (and inspiration) to produce more examples.