Fix wrong p value ggcuminc#261
Conversation
|
thanks for submitting this PR and looking into the p-value issue @samrickman ! just looking through it on my end as well |
There was a problem hiding this comment.
hi @samrickman
I tried running a few examples and tests with your proposed fix, and I believe the current fix works for the first outcome but completely fails for the 2nd and 3rd outcomes in a 3+ competing events scenario - add_pvalue() gets ignored entirely for these cases. Please let me know if I'm mistaken!
I used this example as a starting point for 3+ competing events:
# Reproducible example showing p-value bug with 3+ competing events
library(ggsurvfit)
#> Loading required package: ggplot2
library(tidycmprsk)
library(dplyr)
# Create test data with 3 competing events
set.seed(123)
dat <- tidycmprsk::trial %>%
mutate(
death_cr_3way = dplyr::case_when(
death_cr == "censor" ~ "censor",
death_cr == "death other causes" ~ "death other causes",
death_cr == "death from cancer" & dplyr::row_number() %% 2 == 1 ~ "cancer type A",
death_cr == "death from cancer" & dplyr::row_number() %% 2 == 0 ~ "cancer type B",
TRUE ~ "censor"
),
death_cr_3way = factor(death_cr_3way,
levels = c("censor", "cancer type A", "cancer type B", "death other causes"))
)
# Fit competing risks model
cif <- tidycmprsk::cuminc(Surv(ttdeath, death_cr_3way) ~ trt, data = dat)
# Check what p-values should be shown
tidycmprsk::glance(cif)
#> # A tibble: 1 × 12
#> outcome_1 statistic_1 df_1 p.value_1 outcome_2 statistic_2 df_2 p.value_2
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <dbl>
#> 1 cancer type… 0.988 1 0.320 cancer t… 0.504 1 0.478
# Create plots for each event
plot1 <- cif %>% ggcuminc(outcome = "cancer type A") + add_pvalue()
plot2 <- cif %>% ggcuminc(outcome = "cancer type B") + add_pvalue()
#> ! `add_pvalue()` works with objects created with `survfit2()` or `tidycmprsk::cuminc()`.
#> ℹ `add_pvalue()` has been ignored.
plot3 <- cif %>% ggcuminc(outcome = "death other causes") + add_pvalue()
#> ! `add_pvalue()` works with objects created with `survfit2()` or `tidycmprsk::cuminc()`.
#> ℹ `add_pvalue()` has been ignored.
print(paste("Plot 1:", plot1$labels$caption))
#> [1] "Plot 1: p=0.3"
print(paste("Plot 2:", plot2$labels$caption))
#> [1] "Plot 2: "
print(paste("Plot 3:", plot3$labels$caption))
#> [1] "Plot 3: "There was a problem hiding this comment.
@ShreyaSreeram27 I'm not going to have a chance to look at this for a while but I'm sure you're correct. I thought that ggcuminc() could only handle competing risks models (i.e. two states excluding censoring: death, death from another cause). If it can handle models with multiple states then what I wrote will not work. The only thing I'm confused about is that tidycmprsk::glance(cif) also seems to be assuming that it's a two state model (i.e. there are only two outcomes and two p-values). I'm pretty sure that tidycmprsk is just a nice API for cmprsk which implements Fine & Gray models. As far as I'm aware, these can only have two states (excluding censoring). So is it supposed to work with more than this? If it is then it's back to the drawing board for this PR I'm afraid.
There was a problem hiding this comment.
Thank you both! @samrickman @ShreyaSreeram27 !
@ShreyaSreeram27 can you confirm whether cmprsk supports 2+ competing events?
- if it does, can you then confirm tidycmprsk (and the pvalue calcualtion therein) also supports 2+ competing events
- if it doesn't, we'll need to update the messaging to users when this is passed.
There was a problem hiding this comment.
hi @ddsjoberg i have confirmed that both cmprsk and tidycmprsk support 2+ competing events. I believe the following code example helps us see this!
library(tidycmprsk)
library(dplyr)
#create test data with 3 competing events
set.seed(123)
n <- 200
test_data <- data.frame(
time = rexp(n, rate = 0.1),
status = sample(c("censor", "event1", "event2", "event3"),
n, replace = TRUE,
prob = c(0.4, 0.2, 0.2, 0.2)),
group = sample(c("A", "B"), n, replace = TRUE)
)
test_data$status <- factor(test_data$status,
levels = c("censor", "event1", "event2", "event3"))
print("Test data structure:")
print(table(test_data$status))
print(table(test_data$group, test_data$status))
# tidycmprsk with 3 events
print("\n===tidycmprsk::cuminc with 3 competing events ===")
tryCatch({
cif_3events <- tidycmprsk::cuminc(Surv(time, status) ~ group, data = test_data)
print("SUCCESS")
print(cif_3events)
}, error = function(e) {
print("FAILED")
print(paste("Error:", e$message))
})
There was a problem hiding this comment.
@ShreyaSreeram27 OK back to stats class for me then! Clearly this PR shouldn't be merged. If I get the chance I will have a think about how to generalise this to >2 events.
There was a problem hiding this comment.
thank you @ShreyaSreeram27 and @samrickman !
@ShreyaSreeram27 perhaps a first step to support is better messaging, e.g. can we check if more than one outcome is being presented or the single outcome being presented is not the first? If either of those conditions, print a warning to the users?
There was a problem hiding this comment.
thanks @samrickman , happy to collaborate on this feature as well!
There was a problem hiding this comment.
@ddsjoberg yes i think that's a good way to go about this! ill work on improving the messaging
There was a problem hiding this comment.
@ddsjoberg i was thinking about the following potential scenarios from a user's perspective:
Scenario 1: Multi-outcome model, plotting 1st outcome
and somethihg like: ! add_pvalue() has limited support for competing risks models with multiple outcomes.
Your model has 3 competing events: cancer type A, cancer type B, death other causes
You are plotting outcome 'cancer type A' (position #1)
P-values may not be correctly calculated for models with multiple competing events.
✓ Using p-value for 'cancer type A' (the first outcome in your 3-event competing risks model).
Scenario 2: Multi-outcome model, plotting 2nd+ outcome
! add_pvalue() has limited support for competing risks models with multiple outcomes.
Your model has 3 competing events: cancer type A, cancer type B, death other causes
You are plotting outcome 'cancer type B' (position #2)
Currently, p-values are only reliable for the first outcome in competing risks models.
There was a problem hiding this comment.
The model can have any number of competing events (or outcomes), but we cannot present multiple p-values.
Something like this should work:
all_competing_events # this is a character vector of all the competing events from the model
plot_competing_events # character vector of the competing events in the figure
if (length(plot_competing_events) > 1L || plot_competing_events != all_competing_events[1]) {
cli::cli_warn(
c("{.fun add_pvalue} supports reporting a single competing event display for the {.emph first} competing event {.val {plot_competing_events[1]}}.")
)
}|
Thank you @samrickman for the PR and making the competing events comparisons more clear! I think we can close this PR for now. The current functionality only allows us to present the pvalue from figures that show a single competing event (regardless of whether is the 1st, 2nd, or n-th competing event). But we do not report multiple pvalues with
|

What changes are proposed in this pull request?
ggsurvfit()where it shows same p-value for both events.ggsurvfit()are different (usingtidycmprsk::trialdata, where they should be).I still get five "failed" tests where
vdiffr::expect_doppelganger()has different results. This happened last time I pushed to this repo and all was fine. I think it's related to my system and how it draws plots (they look fine to me), so I've marked all tests as passed and I'll see what happens in the CI pipeline.If there is an GitHub issue associated with this pull request, please provide link.
#239 (comment)
Reviewer Checklist (if item does not apply, mark as complete)
renv::install()_pkgdown.ymlpkgdown::build_site(). Check the R console for errors, and review the rendered website.withr::with_envvar(list(CI = TRUE), code = devtools::test_coverage()). Begin in a fresh R session without any packages loaded.usethis::use_spell_check()runs with no spelling errors in documentationWhen the branch is ready to be merged into master:
NEWS.mdwith the changes from this pull request under the heading "# ggsurvfit (development version)". If there is an issue associated with the pull request, reference it in parentheses at the end update (seeNEWS.mdfor examples).usethis::use_version(which = "dev")usethis::use_spell_check()again