Learn R Programming

rstpm2 (version 1.4.5)

stpm2: Fully parametric generalised survival model

Description

This implements the generalised survival model g(S(t|x)) = eta, where g is a link function, S is survival, t is time, x are covariates and eta is a linear predictor. The main model assumption is that the time effects in the linear predictor are smooth. This extends the class of flexible parametric survival models developed by Royston and colleagues. The model has been extended to include relative survival, Gamma frailties and normal random effects.

Usage

stpm2(formula, data, 
      smooth.formula = NULL, smooth.args = NULL,
      df = 3, cure = FALSE, logH.args = NULL,
      logH.formula = NULL, tvc = NULL, tvc.formula =
      NULL, control = list(),
      init = NULL, coxph.strata = NULL,
      coxph.formula = NULL, weights = NULL,
      robust = FALSE, baseoff = FALSE, bhazard = NULL,
      bhazinit = NULL,
      timeVar = "", time0Var = "", use.gr = NULL,
      optimiser=NULL, log.time.transform=TRUE,
      reltol=NULL, trace = NULL,
      link.type=c("PH","PO","probit","AH","AO"), theta.AO=0, 
      frailty = !is.null(cluster) & !robust, cluster = NULL, logtheta=NULL, nodes=NULL,
      RandDist=c("Gamma","LogN"), recurrent = FALSE, adaptive=NULL,
      maxkappa=NULL, Z=~1,
      contrasts = NULL,
      subset = NULL, robust_initial = NULL, ...)

Arguments

formula

a formula object, with the response on the left of a ~ operator, and the regression terms (excluding time) on the right. The response must be a survival object as returned by the Surv function. The terms should include linear terms for any time-varying coefficients. [required]

data

a data.frame in which to interpret the variables named in the formula argument. [at present: required]

smooth.formula

a formula for describing the time effects for the linear predictor, including the baseline and the time-dependent effects (default=NULL). Only one of df, smooth.formula, smooth.args, logH.args or logH.formula is required. The default model is equal to nsx(log(time),df=3).

smooth.args

a list describing the arguments for the nsx function for modelling the baseline time effect on the linear predictor scale (default=NULL). Use this or smooth.formula for changing the knot placement and specifying cure models.

df

an integer that describes the degrees of freedom for the ns function for modelling the baseline log-cumulative hazard (default=3).

logH.args

as per smooth.args. Deprecated.

logH.formula

as per smooth.formula. Deprecated.

tvc

a list with the names of the time-varying coefficients and the degrees of freedom (e.g. tvc=list(x=3) specifies x as a time-varying coefficient with 3 degrees of freedom).

tvc.formula

a formula for describing the time-varying coefficients. If a time-varying coefficient is being model, then only one of tvc and tvc.formula is required.

bhazard

a vector for the background hazard for relative survival estimation. At present, this does not use data and it is required for all individuals - although it is only used at the event times.

bhazinit

scalar used to adjust the background cumulative hazards for calculating initial values. Default=0.1. Deprecated: the preferred approach is to set using control.

control

multiple arguments passed to stpm2.control.

init

init should either be FALSE, such that initial values will be determined using Cox regression, or a numeric vector of initial values.

coxph.strata

variable in the data argument for stratification of the coxph model fit for estimating initial values.

coxph.formula

additional formula used to improve the fitting of initial values [optional and rarely used].

weights

an optional vector of 'prior weights' to be used in the fitting process. Should be NULL or a numeric vector.

robust

Boolean used to determine whether to use a robust variance estimator.

baseoff

Boolean used to determine whether fully define the model using tvc.formula rather than combining logH.formula and tvc.formula

timeVar

variable defining the time variable. By default, this is determined from the survival object, however this may be ambiguous if two variables define the time

contrasts

an optional list. See the contrasts.arg of model.matrix.default.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

cure

logical for whether to estimate a cure model.

time0Var

string variable to determine the entry variable; useful for when more than one data variable is used in the entry time.

use.gr

logical indicating whether to use gradients in the calculation. Deprecated: the preferred approach is to set using control.

optimiser

select which optimiser is used. Default: "BFGS". Deprecated: the preferred approach is to set using control.

log.time.transform

should a log-transformation be used for calculating the derivative of the design matrix with respect to time? (default=TRUE)

link.type

type of link function. For "PH" (generalised proportional hazards), g(S)=log(-log(S)); for "PO" (generalised proportional odds), g(S)=-logit(S); for "probit" (generalised probit), g(S)=-probit(S); for "AH" (generalised additive hazards), g(S)=-log(S); for "AO" (generalised Aranda-Ordaz), g(S)=log((S^(-theta.AO)-1)/theta.AO).

theta.AO

theta parameter for the Aranda-Ordaz link type.

reltol

relative tolerance for the model convergence. Deprecated: the preferred approach is to set using control.

trace

integer for providing trace information. Default=0. Deprecated: the preferred approach is to set using control.

frailty

logical for whether to fit a shared frailty model

cluster

string for the data variable that determines the cluster for the frailty

nodes

number of integration points for Gaussian quadrature. Default=9. Deprecated: the preferred approach is to set using control.

RandDist

type of distribution for the random effect or frailty

recurrent

logical for whether clustered, left truncated data are recurrent or for first event (where the latter requires an adjustment for the frailties or random effects)

logtheta

initial value for log-theta used in the gamma or log-normal shared frailty model (defaults to an initial value from a Cox model fit)

adaptive

logical for whether to use adaptive or non-adaptive quadrature, Default=TRUE. Deprecated: the preferred approach is to set using control.

maxkappa

double float value for the maximum value of the weight used in the constraint. Default=1000. Deprecated: the preferred approach is to set using control.

Z

formula for the design matrix for the random effects

robust_initial

logical for whether to use Nelder-Mead to find initial values (max 50 iterations). This is useful for ill-posed initial values. Default=FALSE. Deprecated: the preferred approach is to set using control.

additional arguments to be passed to the mle2 .

Value

An stpm2-class object that inherits from mle2-class.

Details

The implementation extends the mle2 object from the bbmle package. The model inherits all of the methods from the mle2 class.

The default linear predictor includes a time effect modelled using natural splines for log(time) with three degrees of freedom.

Examples

Run this code
# NOT RUN {
data(brcancer)
summary(fit <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3))

## some predictions
head(predict(fit,se.fit=TRUE,type="surv"))
head(predict(fit,se.fit=TRUE,type="hazard"))

## some plots
plot(fit,newdata=data.frame(hormon=0),type="hazard")
plot(fit,newdata=data.frame(hormon=0),type="surv")

## the same model using logH.formula
summary(stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,logH.formula=~ns(log(rectime),df=3)))


## time-varying coefficient
summary(fit.tvc <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=3,
                     tvc=list(hormon=3)))
anova(fit,fit.tvc) # compare with and without tvc

## some more plots
plot(fit.tvc,newdata=data.frame(hormon=0),type="hr",var="hormon", ylim=c(0,2))
                                        # no lines method: use add=TRUE
plot(fit.tvc,newdata=data.frame(hormon=1),type="hr",var="hormon",
     add=TRUE,ci=FALSE,line.col=2)

plot(fit.tvc,newdata=data.frame(hormon=0),type="sdiff",var="hormon")

plot(fit.tvc,newdata=data.frame(hormon=0),type="hdiff",var="hormon")

plot(fit.tvc,newdata=data.frame(hormon=0),type="hazard")
plot(fit.tvc,newdata=data.frame(hormon=1),type="hazard",line.col=2,ci=FALSE,add=TRUE)

## compare number of knots
hormon0 <- data.frame(hormon=0)
plot(fit,type="hazard",newdata=hormon0)
AIC(fit)
for (df in 4:6) {
    fit.new <- stpm2(Surv(rectime,censrec==1)~hormon,data=brcancer,df=df)
    plot(fit.new,type="hazard",newdata=hormon0,add=TRUE,ci=FALSE,line.col=df)
    print(AIC(fit.new))
}


# }

Run the code above in your browser using DataLab