I initially posted this at ggpubr (kassambara/ggpubr#503), but the function there is just a shameless wrapper for geom_signif(), so maybe it's better suited here!?
Using geom_signif() to make multiple comparisons works fine until one of the comparisons cannot be performed (e.g. because of too many missing values). In that case, all the comparisons that could still be computed disappear as well!
Please see this example:
library(tidyverse)  # tibble(), filter(), ggplot2
library(ggsignif)   # geom_signif()
# A dataset with some NAs
dataset = tibble(
"Sample" = rep(c("Sample1", "Sample2"), each = 15),
"Cond" = rep(c("A", "B", "C",
"A", "B", "C"), each = 5),
"Rep" = rep(1:5, 6),
"Value" = c(runif(5, 10, 12), #A1
runif(5, 11, 14), #B1
runif(5, 10, 13), #C1
runif(5, 10, 12), #A2
runif(5, 11, 14), #B2
c(runif(2, 10, 13), NA, NA, NA) #C2
))
# With at least 2 data points per condition, all the requested comparisons are shown
ggplot(dataset,
aes(x = Cond, y = Value,
colour = as.factor(Cond),
fill = as.factor(Cond) )) +
geom_jitter(size = 5, width = 0.2, alpha = 0.3, stroke = 1.5,
shape = 21) +
stat_summary(fun.min = mean, fun.max = mean, size = 1.5,
geom='errorbar') +
facet_wrap( ~ Sample) +
theme_bw() + scale_y_continuous(limits = c(0, 16)) +
geom_signif(comparisons = list(c("A", "B"),
c("B", "C")),
step_increase = 0.2,
colour = "black") +
theme(legend.position = "none")
With only NAs in one of the conditions (C), the other comparison (A-B) disappears as well!
ggplot(data = filter(dataset, Rep >= 3),
aes(x = Cond, y = Value,
colour = as.factor(Cond),
fill = as.factor(Cond) )) +
geom_jitter(size = 5, width = 0.2, alpha = 0.3, stroke = 1.5,
shape = 21) +
stat_summary(fun.min = mean, fun.max = mean, size = 1,
geom='errorbar') +
facet_wrap( ~ Sample) +
theme_bw() + scale_y_continuous(limits = c(0, 16)) +
geom_signif(comparisons = list(c("A", "B"),
c("B", "C")),
step_increase = 0.2,
colour = "black") +
theme(legend.position = "none")
Here the A-B comparison is lost as well, even though it can still be calculated.
One could adapt the comparisons (after seeing that one of them does not work), but in general I want a consistent picture across multiple (usually more than just 2) samples.
It throws a warning, so at least the function already knows that something failed:
1: Removed 3 rows containing non-finite values (stat_summary).
2: Removed 3 rows containing non-finite values (stat_signif).
3: Computation failed in stat_signif():
not enough 'y' observations.
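The error message looks like it comes straight from wilcox.test(), which as far as I know is the default test in geom_signif(). A minimal sketch to reproduce it outside of ggplot (the values in x are made up for illustration):

```r
# Two finite values in x, only NAs in y
# (like condition C in Sample2 after filtering on Rep >= 3)
x <- c(10.4, 11.2)
y <- c(NA_real_, NA_real_)

# Catch the error instead of stopping, and keep its message
msg <- tryCatch(
  {
    wilcox.test(x, y)
    "test succeeded"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

So my guess is that a single failing test aborts the computation for the whole panel, which would explain why the A-B bracket vanishes too.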
Would it be possible to:
- check (e.g. after the warning) which of the comparisons cannot be made,
- remove the impossible ones,
- still show the working ones, and either just drop the failed comparison
- or (better) label it with something like "n.d.", thereby maintaining the original structure?
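To illustrate what I have in mind, here is a rough sketch of the steps above in base R. The helper signif_table() and its arguments y_start and y_step are hypothetical names of mine, not part of ggsignif, and the column names (Sample, Cond, Value) are simply those of my example dataset. It runs each test per panel, catches the failing ones, and labels them "n.d." instead of dropping everything:

```r
# Hypothetical helper (not part of ggsignif): returns one row per panel and
# comparison, with "n.d." as the label when the test cannot be computed.
signif_table <- function(data, comparisons, y_start = 14.5, y_step = 0.8) {
  rows <- list()
  for (s in unique(data$Sample)) {
    panel <- data[data$Sample == s, ]
    for (i in seq_along(comparisons)) {
      pair <- comparisons[[i]]
      x <- panel$Value[panel$Cond == pair[1]]
      y <- panel$Value[panel$Cond == pair[2]]
      # wilcox.test() stops with an error when a group has no finite values;
      # turn that error into NA so the other comparisons survive.
      p <- tryCatch(
        suppressWarnings(wilcox.test(x, y)$p.value),
        error = function(e) NA_real_
      )
      rows[[length(rows) + 1]] <- data.frame(
        Sample = s,
        start = pair[1],
        end = pair[2],
        y = y_start + (i - 1) * y_step,  # stack the brackets vertically
        label = if (is.na(p)) "n.d." else format.pval(p, digits = 2),
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}
```

If I read the ggsignif README correctly, such a table could then be passed to geom_signif() in manual mode, along the lines of geom_signif(data = tbl, aes(xmin = start, xmax = end, annotations = label, y_position = y), manual = TRUE), with facet_wrap() matching on the Sample column. I have not checked how that interacts with the colour/fill aesthetics inherited from my plot, so those may need adjusting.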
This would be awesome!
PS: I was just reading the proDA paper, then filed this issue here and noticed that the author has the identical name!

