Commit 0474461

authored
Update README.md
Added section 14: don't mix stacked bar plots and mean separation.
1 parent a75454e commit 0474461

1 file changed

+38
-3
lines changed

README.md

+38-3
@@ -34,6 +34,7 @@ It requires R, RStudio, and the rmarkdown package.
3434
11. [Friends Don't Let Friends Make Concentric Donuts](https://github.com/cxli233/FriendsDontLetFriends#11-friends-dont-let-friends-make-concentric-donuts)
3535
12. [Friends Don't Let Friends Use Red/green and Rainbow for Color Scales](https://github.com/cxli233/FriendsDontLetFriends#12-friends-dont-let-friends-use-redgreen-and-rainbow-color-scales)
3636
13. [Friends Don't Let Friends Forget to Reorder Stacked Bar Plot](https://github.com/cxli233/FriendsDontLetFriends/tree/main#13-friends-dont-let-friends-forget-to-reorder-stacked-bar-plot)
37+
14. [Friends Don't Let Friends Mix Stacked Bars and Mean Separation](https://github.com/cxli233/FriendsDontLetFriends/tree/main#14-friends-dont-let-friends-mix-stacked-bars-and-mean-separation)
3738

3839
# 1. Friends Don't Let Friends Make Bar Plots for Means Separation
3940

@@ -48,7 +49,7 @@ In this example, two groups have similar means and standard deviations, but quit
4849
Just don't use bar plot for means separation, or at least check a couple things before settling down on a bar plot.
4950

5051
It's worth mentioning that I was inspired by many researchers who have tweeted on the limitation of bar graphs.
51-
Here is a pulication: [Weissgerber et al., 2015, PLOS Biology](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128).
52+
Here is a publication: [Weissgerber et al., 2015, PLOS Biology](https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002128).
5253

5354
# 2. Friends Don't Let Friends Make Violin Plots for Small Sample Sizes
5455

@@ -71,7 +72,7 @@ I can understand why this error is common, because it appears that many of us ha
7172
Color scales are pretty, but we have to be extra careful.
7273
When color scales (or color gradients) are used to represent numerical data, the darkest and lightest colors should have special meanings.
7374
You can decide what those special meanings are: e.g., max, min, mean, zero. But they should represent something meaningful.
74-
A data visualization sin for heat maps/color gradients is when the lightest or darkers colors are some arbitrary numbers.
75+
A data visualization sin for heat maps/color gradients is when the lightest or darkest colors are some arbitrary numbers.
7576
*This is as bad as the longest bar in a bar chart not being the largest value.* Can you imagine that?
7677

7778
# 4. Friends Don't Let Friends Make Bar Plot Meadow
@@ -143,7 +144,7 @@ This is because the concentration of compound 1 has a much narrower range than t
143144
# 8. Friends Don't Let Friends Make Network Graphs without Trying Different Layouts
144145

145146
Network graphs are common in scientific publications. They are super useful in presenting relationship data.
146-
However, the apparence (not the topology) of the network can make a huge difference in determing if a network graph is effective.
147+
However, the appearance (not the topology) of the network can make a huge difference in determining if a network graph is effective.
147148

148149
![Try different network layouts](https://github.com/cxli233/FriendsDontLetFriends/blob/main/Results/TryDifferentLayouts.svg)
149150

@@ -243,6 +244,40 @@ Due to the number of samples and classes, it is very hard to discern anything fr
243244
After reordering the bars, __wow__, that really made a difference, don't you think?
244245
For a tutorial on how to optimize a stack bar plot, see [this script](https://github.com/cxli233/FriendsDontLetFriends/blob/main/Scripts/stacked_bars_optimization.Rmd).
245246

247+
# 14. Friends Don't Let Friends Mix Stacked Bars and Mean Separation
248+
Sometimes a visualization gets confusing and ineffective when it tries to do too many things at once.
249+
One such example is mixing stacked bar plots and mean separation plots.
250+
The former displays proportional data that add up to 100%; the latter displays differences in means and the dispersion around those means.
251+
These are very distinct tasks in data visualization.
252+
253+
In this hypothetical experiment, we had blueberry plants assigned to two groups.
254+
One group was the control; the other was treated with a chemical to speed up fruit development.
255+
Each group had 5 plants. The response to the treatment was divided into 3 categories:
256+
light green fruits, light blue fruits, and dark blue fruits.
257+
100 fruits from each plant were examined and the number of fruits in each category was counted.
258+
The percentage of fruits in each category was calculated and reported.
259+
The question of the study is: did the chemical treatment work?
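
The percentage step described above can be sketched in a few lines. This is only an illustration in Python, not the script used to make the figure: the counts are made up for a single hypothetical plant, and `to_percentages` is a helper name I introduce here.

```python
# Hypothetical counts for ONE plant: 100 fruits sorted into the 3 categories.
# These numbers are invented for illustration; they are not the data behind the figure.
counts = {"light green": 20, "light blue": 30, "dark blue": 50}


def to_percentages(counts):
    """Convert per-category fruit counts into percentages of all fruits examined."""
    total = sum(counts.values())
    return {category: 100 * n / total for category, n in counts.items()}


percentages = to_percentages(counts)
for category, pct in percentages.items():
    print(f"{category}: {pct:.1f}%")
```

Because every fruit falls into exactly one category, the percentages for a plant always add up to 100%. That is what makes a stacked bar a natural fit for this kind of data.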
260+
261+
![Don't mix stacked bar plots with mean separation plots](https://github.com/cxli233/FriendsDontLetFriends/blob/main/Results/stacked_bar_vs_jitter.png)
262+
263+
The first stacked bar plot is fine: it is the standard way to visualize proportion data.
264+
It is clear that all categories add up to 100%,
265+
and the chemical treatment strongly shifted the color profile towards the most developed stage (dark blue).
266+
267+
The middle stacked bar plot is problematic,
268+
mainly because it is trying to do two distinct data visualization tasks at once.
269+
When error bars and dots are overlaid onto the stacked bars,
270+
it becomes unclear which error bars and dots are being compared.
271+
Due to the nature of stacked bars, the error bars and dots of the upper stacks have to be shifted upwards,
272+
and thus the y-axis can no longer be read directly for those error bars and dots.
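
To see why the upper stacks get shifted, here is a rough sketch in Python of where each mean ends up on the y-axis. Everything in it is an assumption for illustration: the per-plant percentages are invented, the bottom-to-top stack order is my guess, and `stacked_positions` is a hypothetical helper, not part of the script behind the figure.

```python
from statistics import mean, stdev

# Hypothetical per-plant percentages for one group (5 plants).
# Categories are listed from the bottom stack to the top stack (assumed order).
group = {
    "dark blue": [50, 55, 45, 60, 40],
    "light blue": [30, 25, 35, 25, 35],
    "light green": [20, 20, 20, 15, 25],
}


def stacked_positions(group):
    """For each category, return (name, mean, sd, y position of the mean on the stacked bar)."""
    rows = []
    offset = 0.0  # running height of all stacks below the current one
    for category, values in group.items():
        m = mean(values)
        rows.append((category, m, stdev(values), offset + m))
        offset += m
    return rows


rows = stacked_positions(group)
for category, m, sd, y in rows:
    print(f"{category}: mean={m:.1f}%, sd={sd:.1f}, drawn at y={y:.1f}")
```

With these made-up numbers, the light blue category has a mean of 30% but its dot and error bar sit at y = 80, because the 50% dark blue stack lies underneath. Only the bottom stack can be read directly off the y-axis.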
273+
274+
Finally, if the main point of the visualization is mean separation and dispersion around the mean,
275+
the third graph is the better choice.
276+
There is no ambiguity about which comparisons are being made.
277+
As shown in the first stacked bar plot,
278+
the chemical treatment strongly increases the proportion of dark blue fruits,
279+
at the expense of lighter-colored fruits.
280+
246281
# Conclusion (?)
247282

248283
That's it for now. I will update this when I have the time (and inspirations) to produce more examples.
