
survminer (version 0.2.4)

surv_cutpoint: Determine the Optimal Cutpoint for Continuous Variables

Description

Determine the optimal cutpoint for one or multiple continuous variables at once, using the maximally selected rank statistics from the 'maxstat' R package. This is an outcome-oriented method that provides the cutpoint value corresponding to the most significant relation with the outcome (here, survival).
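To make the idea of an outcome-oriented cutpoint concrete, here is a minimal brute-force sketch. It is not the 'maxstat' algorithm: it swaps in the plain log-rank chi-square from survival::survdiff() and scans every candidate cutpoint, whereas 'maxstat' uses standardized rank statistics. The simulated data and the names d, cands, stat and best are assumptions made for illustration only.

```r
library(survival)

# Hypothetical simulated data: survival is shorter when x exceeds 0.5
set.seed(1)
n <- 200
x <- rnorm(n)
time <- rexp(n, rate = ifelse(x > 0.5, 0.3, 0.1))
event <- rbinom(n, 1, 0.8)
d <- data.frame(time, event, x)

# Keep only candidate cutpoints that leave at least minprop of the
# observations in each group (the role played by the minprop argument)
minprop <- 0.1
cands <- sort(unique(d$x))
cands <- cands[cands >= quantile(d$x, minprop) &
               cands <= quantile(d$x, 1 - minprop)]

# Log-rank chi-square for the two groups defined by each candidate
stat <- sapply(cands, function(cp) {
  grp <- d$x > cp
  survdiff(Surv(time, event) ~ grp, data = d)$chisq
})

# The candidate with the largest statistic is the "optimal" cutpoint
best <- cands[which.max(stat)]
best
```

Because the cutpoint is chosen to maximize the statistic, a naive p-value computed at that cutpoint is optimistic; 'maxstat' provides corrections for this selection effect.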

Usage

surv_cutpoint(data, time = "time", event = "event", variables, minprop = 0.1, progressbar = TRUE)
surv_categorize(x, variables = NULL, labels = c("low", "high"))
# S3 method for surv_cutpoint
summary(object, ...)
# S3 method for surv_cutpoint
print(x, ...)
# S3 method for surv_cutpoint
plot(x, variables = NULL, ggtheme = theme_classic2(), bins = 30, ...)
# S3 method for plot_surv_cutpoint
print(x, ...)

Arguments

data
a data frame containing survival information (time, event) and continuous variables (e.g., gene expression data).
time, event
column names containing time and event data, respectively. Event values should be 0 or 1.
variables
a character vector containing the names of the variables of interest, for which the optimal cutpoint is estimated.
minprop
the minimal proportion of observations per group.
progressbar
logical value. If TRUE, show a progress bar. The progress bar is shown only when the number of variables is greater than 5.
x, object
an object of class surv_cutpoint
labels
labels for the levels of the resulting category.
...
other arguments. For plots, see ?ggpubr::ggpar
ggtheme
function, ggplot2 theme name. Default value is theme_classic2. Allowed values include ggplot2 official themes. See ?ggplot2::ggtheme.
bins
Number of bins for histogram. Defaults to 30.

Value

  • surv_cutpoint(): returns an object of class 'surv_cutpoint', which is a list with the following components:
    • maxstat results for each variable (see ?maxstat::maxstat)
    • cutpoint: a data frame containing the optimal cutpoint of each variable. Rows are variable names and columns are c("cutpoint", "statistic").
    • data: a data frame containing the survival data and the original data for the specified variables.
    • minprop: the minimal proportion of observations per group.
    • not_numeric: contains data for non-numeric variables, when the user supplies categorical variable names in the argument variables.
    Methods defined for surv_cutpoint objects are summary, print and plot.
  • surv_categorize(): returns an object of class 'surv_categorize', which is a data frame containing the survival data and the categorized variables.

Functions

  • surv_cutpoint: Determine the optimal cutpoint for each variable using 'maxstat'
  • surv_categorize: Divide the values of each variable into two groups based on the cutpoint returned by surv_cutpoint().
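The categorization step can be sketched in base R as follows. This is an illustration of the idea, not the package's code: the values, the cutpoint and the variable names are hypothetical, and it assumes that values strictly above the cutpoint receive the second label while values at or below it receive the first.

```r
# Hypothetical expression values and cutpoint, for illustration only
expr <- c(120.5, 310.2, 279.8, 455.0, 198.3)
cutpoint <- 279.8
labels <- c("low", "high")

# Assumption: values strictly above the cutpoint get labels[2]
category <- ifelse(expr > cutpoint, labels[2], labels[1])
category
```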

Examples

# 0. Load the package and some data
library("survminer")
data(myeloma)
head(myeloma)

# 1. Determine the optimal cutpoint of variables
res.cut <- surv_cutpoint(myeloma, time = "time", event = "event",
   variables = c("DEPDC1", "WHSC1", "CRIM1"))

summary(res.cut)

# 2. Plot cutpoint for DEPDC1
# palette = "npg" (nature publishing group), see ?ggpubr::ggpar
plot(res.cut, "DEPDC1", palette = "npg")

# 3. Categorize variables
res.cat <- surv_categorize(res.cut)
head(res.cat)

# 4. Fit survival curves and visualize
library("survival")
fit <- survfit(Surv(time, event) ~ DEPDC1, data = res.cat)
ggsurvplot(fit, risk.table = TRUE, conf.int = TRUE)
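As a further sketch, the cutpoint component described under Value can be read directly from the result, and surv_categorize() can be restricted to a subset of variables with custom labels through its variables and labels arguments. The label names "below" and "above" and the object name res.cat2 are arbitrary choices for this example.

```r
library("survminer")
data(myeloma)

res.cut <- surv_cutpoint(myeloma, time = "time", event = "event",
   variables = c("DEPDC1", "WHSC1", "CRIM1"))

# Data frame of optimal cutpoints, one row per variable
res.cut$cutpoint

# Categorize only two of the variables, with custom labels
res.cat2 <- surv_categorize(res.cut, variables = c("DEPDC1", "WHSC1"),
   labels = c("below", "above"))
table(res.cat2$DEPDC1)
```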
