geom_signif - all comparisons disappear when one comparison has missing values #126

@MPietzke

I initially posted this at ggpubr (kassambara/ggpubr#503); however, the function there is just a shameless wrapper for geom_signif(), so maybe it's better suited here?

Using geom_signif() for multiple comparisons works fine until one of the comparisons cannot be performed (e.g. because of too many missing values). In that case, all the comparisons that could still be computed disappear as well!
Please see this example:

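library(tibble)    # tibble()
library(dplyr)     # filter()
library(ggplot2)
library(ggsignif)  # geom_signif()

# A dataset with some NAs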
dataset = tibble(
  "Sample" = rep(c("Sample1", "Sample2"), each = 15),
  "Cond"   = rep(c("A", "B", "C",
                   "A", "B", "C"), each = 5),
  "Rep"    = rep(1:5, 6),
  "Value"  = c(runif(5, 10, 12),  #A1
               runif(5, 11, 14),  #B1
               runif(5, 10, 13),  #C1
               runif(5, 10, 12),  #A2
               runif(5, 11, 14),  #B2
               c(runif(2, 10, 13), NA, NA, NA) #C2
  ))

# With at least 2 data points per group, all the requested comparisons are shown
ggplot(dataset, 
       aes(x = Cond, y = Value, 
           colour = as.factor(Cond),
           fill = as.factor(Cond) )) + 
  geom_jitter(size = 5, width = 0.2, alpha = 0.3, stroke = 1.5,
              shape = 21) + 
  stat_summary(fun.min = mean, fun.max = mean, size = 1.5,                
               geom='errorbar') + 
  facet_wrap( ~ Sample) +
  theme_bw()  + scale_y_continuous(limits = c(0, 16)) +
  geom_signif(comparisons = list(c("A", "B"),
                                 c("B", "C")),
              step_increase = 0.2,
              colour = "black") + 
  theme(legend.position = "none")

image

With only NAs left in one of the conditions (C in Sample2), the other comparison (A-B) disappears as well!

ggplot(data = dplyr::filter(dataset, Rep >= 3), 
       aes(x = Cond, y = Value, 
           colour = as.factor(Cond),
           fill = as.factor(Cond) )) + 
  geom_jitter(size = 5, width = 0.2, alpha = 0.3, stroke = 1.5,
              shape = 21) + 
  stat_summary(fun.min = mean, fun.max = mean, size = 1,                
               geom='errorbar') + 
  facet_wrap( ~ Sample) +
  theme_bw()  + scale_y_continuous(limits = c(0, 16)) +
  geom_signif(comparisons = list(c("A", "B"),
                                 c("B", "C")),
              step_increase = 0.2,
              colour = "black") + 
  theme(legend.position = "none")

image

Here the A-B comparison is lost too, even though it can still be calculated.
One could adapt the list of comparisons after seeing that one of them fails, but in general I want a consistent picture across multiple Samples (usually more than just 2).

It throws a warning, so the function at least already knows that something fails:
1: Removed 3 rows containing non-finite values (stat_summary).
2: Removed 3 rows containing non-finite values (stat_signif).
3: Computation failed in stat_signif():
not enough 'y' observations.
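
The last message can be reproduced outside of ggplot2. This is a minimal sketch, assuming the layer uses `wilcox.test` (the default `test` in ggsignif, as far as I can tell) and that group C is empty once the NAs are removed; the numbers are made up for illustration:

```r
# Values for a group that still has data (made-up numbers)
group_b <- c(11.3, 12.8, 13.1)
# Group C after the non-finite rows were removed: nothing left
group_c <- numeric(0)

# Capture the error message instead of letting the call abort
msg <- tryCatch(
  wilcox.test(group_b, group_c)$p.value,
  error = function(e) conditionMessage(e)
)
print(msg)
```

So a single comparison that errors seems to be enough to abort the computation for the whole panel, which would explain why A-B vanishes too.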

Would it be possible to:

  • check (e.g. after the warning) which of the comparisons cannot be made,
  • still show the comparisons that do work,
  • and either just drop the failed comparison
  • or (better) mark it with something like "n.d.", thereby maintaining the original structure?

This would be awesome!
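
To make the request more concrete, here is a rough sketch of the behaviour I have in mind, written in base R outside of ggsignif. `safe_label()` and `annotate_comparisons()` are hypothetical helper names (not part of any package), and the column names `Sample`, `Cond` and `Value` are taken from the example dataset above. Each comparison is tested on its own per Sample, and a failing test yields "n.d." instead of an error:

```r
# Hypothetical helper: p-value label for one comparison, or "n.d." if the
# test cannot be computed (e.g. one group has no finite values).
safe_label <- function(x, y, test = stats::wilcox.test) {
  x <- x[is.finite(x)]
  y <- y[is.finite(y)]
  p <- tryCatch(test(x, y)$p.value, error = function(e) NA_real_)
  if (is.na(p)) "n.d." else format.pval(p, digits = 2)
}

# Hypothetical helper: one row per Sample and comparison, so that a failing
# comparison never affects the others.
annotate_comparisons <- function(data, comparisons) {
  rows <- list()
  for (s in unique(data$Sample)) {
    d <- data[data$Sample == s, ]
    for (cmp in comparisons) {
      rows[[length(rows) + 1]] <- data.frame(
        Sample = s,
        start  = cmp[1],
        end    = cmp[2],
        label  = safe_label(d$Value[d$Cond == cmp[1]],
                            d$Value[d$Cond == cmp[2]]),
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}
```

Called as `annotate_comparisons(dataset, list(c("A", "B"), c("B", "C")))`, this returns one row per Sample and comparison. As far as I understand the ggsignif README, such a table (plus a column for the bracket height) could be drawn per facet with `geom_signif(data = ..., manual = TRUE)`, but that is only a workaround and I have not tested it with the plot above.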

PS: Just reading the proDA paper - then adding the issue here and noticing the identical name of the author!
