survMisc (version 0.5.0)

cutp: cut point for a continuous variable in a model fit with coxph or survfit.

Description

Determine the optimal cut point for a continuous variable in a coxph or survfit model.

Usage

cutp(x, ...)

## S3 method for class 'coxph'
cutp(x, ..., defCont = 3)

## S3 method for class 'survfit'
cutp(x, ..., defCont = 3)

Arguments

x
A survfit or coxph object
defCont
Definition of a continuous variable. If the variable has more than defCont unique values, it is treated as continuous and a cut point is determined.
...
Additional arguments (not implemented).

Value

A list of data.tables, with one list element per continuous variable. Each table has a column with the possible values of the cut point (i.e. the unique values of the variable) and the additional columns:
  • U: The score (log-rank) test for a model with the variable 'cut' into those $\geq$ the cut point and those below.
  • Q: The test statistic.
  • p: The $p$-value.
The tables are ordered by $p$-value, lowest first.

Details

For a cut point $\mu$ of a predictor $K$, the variable is split into two groups, those $\geq \mu$ and those $< \mu$. The score (or log-rank) statistic, $sc$, is calculated for each unique element $k$ in $K$ and uses
  • $e_i^+$: the number of events
  • $n_i^+$: the number at risk
in those above the cut point, respectively. Here $e_i$ and $n_i$ are the corresponding totals across both groups. The basic statistic is

$$sc_k = \sum_{i=1}^D \left( e_i^+ - n_i^+ \frac{e_i}{n_i} \right)$$

The sum is taken across times with observed events, to $D$, the largest of these. It is normalized (standardized), in the case of censoring, by finding $\sigma^2$, which is:

$$\sigma^2 = \frac{1}{D - 1} \sum_{i=1}^D \left( 1 - \sum_{j=1}^i \frac{1}{D + 1 - j} \right)^2$$

The test statistic is then

$$Q = \frac{\max |sc_k|}{\sigma \sqrt{D-1}}$$

Under the null hypothesis that the chosen cut point does not predict survival, $Q$ has a limiting distribution which is the supremum of the absolute value of a Brownian bridge:

$$p = Pr(\sup Q \geq q) = 2 \sum_{i=1}^{\infty} (-1)^{i + 1} \exp (-2 i^2 q^2)$$
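The normalization and $p$-value formulas above can be sketched in a few lines of base R. This is only an illustration of the arithmetic, not the code used inside cutp: the helper names sigma2, qStat and pBridge are hypothetical, and truncating the infinite series at 100 terms is an assumption that is reasonable only when $q$ is not close to zero.

```r
## Hypothetical helpers illustrating the formulas in Details.
## They are NOT part of survMisc.

## sigma^2 for D distinct event times
sigma2 <- function(D) {
    stopifnot(D >= 2)
    ## inner[i] = sum_{j=1}^{i} 1 / (D + 1 - j)
    inner <- cumsum(1 / (D + 1 - seq_len(D)))
    sum((1 - inner)^2) / (D - 1)
}

## Q from a vector of score statistics sc (one per candidate cut point)
qStat <- function(sc, D) {
    max(abs(sc)) / (sqrt(sigma2(D)) * sqrt(D - 1))
}

## p-value from the Brownian bridge series,
## truncated at 'terms' terms (an assumption, see above)
pBridge <- function(q, terms = 100L) {
    stopifnot(q > 0)
    i <- seq_len(terms)
    p <- 2 * sum((-1)^(i + 1) * exp(-2 * i^2 * q^2))
    ## guard against rounding just outside [0, 1]
    min(max(p, 0), 1)
}

sigma2(2)
pBridge(1.358)
```

With two event times the inner sums are 0.5 and 1.5, so sigma2(2) is 0.5. pBridge(1.358) is close to 0.05, which matches the usual 5% critical value of this distribution.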

References

Contal C, O'Quigley J, 1999. An application of changepoint methods in studying the effect of age on survival in breast cancer. Computational Statistics & Data Analysis 30(3):253--70. http://dx.doi.org/10.1016/S0167-9473(98)00096-6 (ScienceDirect, paywall)

Mandrekar JN, Mandrekar SJ, Cha SS, 2003. Cutpoint Determination Methods in Survival Analysis using SAS. Proceedings of the 28th SAS Users Group International Conference (SUGI). Paper 261-28. http://www2.sas.com/proceedings/sugi28/261-28.pdf (SAS, free)

Examples

## coxph and Surv are from survival; cutp is from survMisc
library("survival")
library("survMisc")
## Mandrekar et al. above
data("bmt", package="KMsurv")
b1 <- bmt[bmt$group==1, ] # ALL patients
c1 <- coxph(Surv(t2, d3) ~ z1, data=b1) # z1=age
c1 <- cutp(c1)$z1
data.table::setorder(c1, "z1")
## [] below is used to print data.table to console
c1[]

## compare to output from survival::coxph
matrix(
    unlist(
        lapply(26:30,
               function(i) c(i, summary(coxph(Surv(t2, d3) ~ z1 >= i, data=b1))$sctest))),
    ncol=5,
    dimnames=list(c("age", "score_test", "df", "p")))
## repeat for the other two disease groups in bmt
cutp(coxph(Surv(t2, d3) ~ z1, data=bmt[bmt$group==2, ]))$z1[]
cutp(coxph(Surv(t2, d3) ~ z1, data=bmt[bmt$group==3, ]))[[1]][]
## K&M. Example 8.3, pg 273-274.
data("kidtran", package="KMsurv")
k1 <- kidtran
## patients who are male and black
k2 <- k1[k1$gender==1 & k1$race==2, ]
c2 <- coxph(Surv(time, delta) ~ age, data=k2)
print(cutp(c2))
## check significance of computed value
summary(coxph(Surv(time, delta) ~ age >= 58, data=k2))
## patients who are female and black
k3 <- k1[k1$gender==2 & k1$race==2, ]
c3 <- coxph(Surv(time, delta) ~ age, data=k3)
print(cutp(c3))
## doesn't apply to binary variables e.g. gender
print(cutp(coxph(Surv(time, delta) ~ age + gender, data=k1)))