surtvep

surtvep is an R package for fitting Cox non-proportional hazards models with time-varying coefficients. It includes both unpenalized procedures (Newton and proximal Newton) and penalized procedures (P-splines and smoothing splines), all of which use B-spline basis functions to estimate the time-varying coefficients. For penalized procedures, cross-validation, mAIC, TIC, or GIC can be used to select tuning parameters. Utilities for post-estimation visualization, summarization, point-wise confidence intervals, and hypothesis testing are also provided.

Introduction

Large-scale time-to-event data derived from national disease registries are increasingly common in medical studies. Detecting and accounting for time-varying effects is particularly important, as such effects have been reported in the clinical literature. However, there are currently no formal R packages for estimating time-varying effects without pre-specifying the form of the time-dependent function. An inaccurate pre-specified form can greatly influence the estimation and lead to unreliable results. To address this issue, we developed a time-varying model using penalized spline terms that does not require pre-specifying the true time-dependent function, and implemented it in R.

Our package offers several benefits over traditional methods. First, traditional methods for fitting time-varying survival models often rely on expanding the original data into a repeated measurement format. Even with moderate sample sizes, this leads to a large and computationally burdensome working dataset. Our package avoids this expansion by using a computationally efficient Kronecker product-based proximal algorithm, which allows time-varying effects to be evaluated in large-scale studies. Additionally, our package supports parallel computing and can handle moderate to large sample sizes more efficiently than current methods.
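To make the cost of the repeated measurement format concrete, the sketch below expands a small simulated dataset with survSplit() from the survival package, splitting each subject's follow-up at every distinct event time. The simulated data, the sample size, and the variable names are assumptions made for illustration only; this is the traditional expansion described above, not the algorithm used by surtvep.

```r
library(survival)

set.seed(1)
n <- 200  # hypothetical sample size, chosen only for illustration

# Simulated survival data: follow-up time, event indicator, one covariate
dat <- data.frame(
  time  = rexp(n, rate = 0.1),
  event = rbinom(n, 1, 0.7),
  z     = rnorm(n)
)

# Split every subject's follow-up at each distinct event time, which is
# the layout the traditional repeated measurement approach works with
cuts <- sort(unique(dat$time[dat$event == 1]))
long <- survSplit(Surv(time, event) ~ z, data = dat,
                  cut = cuts, episode = "interval")

# Compare the size of the original and the expanded data
c(original = nrow(dat), expanded = nrow(long))
```

Each subject contributes one row per interval they remain under observation, so the expanded data grow with both the number of subjects and the number of distinct event times.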

Our tutorial also addresses a common issue encountered when analyzing data that include binary covariates with near-zero variation. For example, in the SEER prostate cancer data, only 0.6% of the 716,553 patients had tumors regional to the lymph nodes. In such cases, the observed information matrix of a Newton-type method may have a minimum eigenvalue close to zero and a large condition number. Inverting this nearly singular matrix can lead to numerical instability, and the corresponding Newton updates may be confined to a small neighborhood of the initial value, resulting in estimates that are far from the optimal solution. To address this problem, our proximal Newton method uses a modified Hessian matrix, which allows for accurate estimation in these scenarios.
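The conditioning problem can be illustrated with a rough sketch in base R and the splines package. The code below builds a crude stand-in for an information matrix (the cross-product of centered covariates multiplied by a B-spline basis in time) and compares a design with two balanced binary covariates to one where a covariate has 0.6% prevalence. The simulated data, the stand-in matrix, and the damping constant are all assumptions; the damping step is a generic diagonal adjustment shown only to convey the idea of a modified Hessian and is not necessarily the modification implemented in surtvep.

```r
library(splines)

set.seed(2)
n <- 5000
time   <- rexp(n)                 # simulated follow-up times (assumption)
z_a    <- rbinom(n, 1, 0.5)       # balanced binary covariate
z_b    <- rbinom(n, 1, 0.5)       # second balanced binary covariate
z_rare <- rbinom(n, 1, 0.006)     # binary covariate with near-zero variation

# B-spline basis in time, one column per spline coefficient
B <- unclass(bs(time, df = 8, intercept = TRUE))

# Crude stand-in for an information matrix: cross-product of the
# centered covariates interacted with the time basis
info <- function(z1, z2) {
  Z <- cbind(B * (z1 - mean(z1)), B * (z2 - mean(z2)))
  crossprod(Z)
}

# Ratio of the largest to the smallest eigenvalue
cond_number <- function(M) {
  ev <- eigen(M, symmetric = TRUE, only.values = TRUE)$values
  max(ev) / min(ev)
}

info_common <- info(z_a, z_b)
info_rare   <- info(z_a, z_rare)

# Generic diagonal damping (hypothetical constant 0.1), for illustration
info_damped <- info_rare + 0.1 * diag(ncol(info_rare))

c(common = cond_number(info_common),
  rare   = cond_number(info_rare),
  damped = cond_number(info_damped))
```

In this sketch the rare covariate inflates the condition number because its block of the matrix is scaled by a much smaller variance, and adding a positive multiple of the identity pulls the smallest eigenvalue away from zero.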

Models:

Method	Description	Example
Newton	Newton's method and the proximal Newton method [1].	tutorial
Newton's method with penalization	Newton's method and the proximal Newton method combined with P-splines or smoothing splines [2].	tutorial
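As a minimal sketch of the ingredients behind the penalized models, the code below represents a time-varying coefficient as a linear combination of B-spline basis functions and builds a standard second-order difference penalty of the kind used by P-splines. The grid, the number of basis functions, and the coefficient values are hypothetical, and the code relies only on base R and the splines package; it does not call surtvep and may differ from the package's internal construction.

```r
library(splines)

K <- 10                                   # number of basis functions (assumption)
t_grid <- seq(0, 3, length.out = 200)     # hypothetical time grid

# B-spline basis evaluated on the grid: one column per basis function
B <- bs(t_grid, df = K, intercept = TRUE)

# Hypothetical spline coefficients; in practice these are estimated
theta <- sin(seq(0, pi, length.out = K))

# Time-varying coefficient beta(t) = sum_k theta_k * B_k(t)
beta_t <- as.vector(B %*% theta)

# Second-order difference matrix and the P-spline penalty matrix D'D
D <- diff(diag(K), differences = 2)
P <- crossprod(D)

# Roughness penalty theta' P theta, before scaling by the tuning parameter
penalty <- drop(t(theta) %*% P %*% theta)
penalty
```

The penalty equals the sum of squared second differences of the coefficients, so a coefficient vector that is linear in its index incurs no penalty, while a wiggly one is penalized in proportion to the tuning parameter.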

Penalization Coefficient Selection Methods:

Method	Description	Example
mAIC	A modified Akaike Information Criterion [1].	tutorial
TIC	Takeuchi Information Criterion [1].	tutorial
GIC	Generalized Information Criterion [1].	tutorial
Cross Validation	Use cross validation to select the penalization coefficient [1].	tutorial

Usage:

Here, we use the simulation study data included in the package as an example:

library(surtvep)

# Load the simulation study data included in the package
sim_data <- sim_data
# Build the event indicator, follow-up time, and covariate matrix for the package
event <- sim_data[, "event"]
time <- sim_data[, "time"]
data <- sim_data[, !colnames(sim_data) %in% c("event", "time")]

# Fit the time-varying coefficient model without a penalty
fit <- coxtp(event = event, z = data, time = time)
# Plot the estimated time-varying coefficient for covariate V1
coxtp.plot(fit, coef = "V1")

Datasets

The SUPPORT dataset is available in the "surtvep" package. The following code loads the dataset as a data frame:

data("support")

Simulated Datasets:

Dataset	Size	Description
ExampleData	4,000	A simulated data set containing 2 continuous variables.
ExampleDataBinary	2,000	A simulated data set containing 2 binary variables.
StrataExample	2,000	A simulated data set containing 2 binary variables. Subjects in different strata have

Real Datasets:

for preprocessing.

Dataset	Size	Description	Data source
SUPPORT		The support dataset is a random sample of 1000 patients from Phases I & II of SUPPORT (Study to Understand Prognoses Preferences Outcomes and Risks of Treatment). This dataset is very good for learning how to fit highly nonlinear predictor effects. See tutorial	source

Installation

Note: This package is still in its early stages of development, so please don't hesitate to report any problems you may experience.

The package requires R 4.1.0 or later.

You can install 'surtvep' via:

# Install the remotes package if needed, then install surtvep from GitHub:
install.packages("remotes")
remotes::install_github("UM-KevinHe/surtvep", ref = "openmp")

We recommend starting with the tutorial, as it provides an overview of the package's usage, including preprocessing, model training, selection of penalization parameters, and post-estimation procedures.

Detailed tutorial

For a detailed tutorial and an explanation of the model parameters, please go here.

Getting Help:

If you encounter any problems or bugs, please contact us at: lfluo@umich.edu, xuetao@umich.edu

References

[1] Wenbo Wu, Jeremy M G Taylor, Andrew F Brouwer, Lingfeng Luo, Jian Kang, Hui Jiang and Kevin He. Scalable proximal methods for cause-specific hazard modeling with time-varying coefficients. Lifetime Data Analysis, 28(2):194-218, 2022. [paper]
