comp.risk: Competings Risks Regression

Description

Fits a semiparametric model for the cause-specific quantities : $$P(T < t, cause=1 | x,z) = P_1(t,x,z) = h( g(t,x,z) )$$ for a known link-function $h()$ and known prediction-function $g(t,x,z)$ for the probability of dying from cause 1 in a situation with competing causes of death.

We consider the following models : 1) the additive model where $h(x)=1-\exp(-x)$ and $$g(t,x,z) = x^T A(t) + (diag(t^p) z)^T \beta$$ 2) the proportional setting that includes the Fine & Gray (FG) "prop" model and some extensions where $h(x)=1-\exp(-\exp(x))$ and $$g(t,x,z) = (x^T A(t) + (diag(t^p) z)^T \beta)$$ The FG model is obtained when $x=1$, but the baseline is parametrized as $\exp(A(t))$.

The "fg" model is a different parametrization that contains the FG model, where $h(x)=1-\exp(-x)$ and $$g(t,x,z) = (x^T A(t)) \exp((diag(t^p) z)^T \beta)$$ The FG model is obtained when $x=1$.

3) a "logistic" model where $h(x)=\exp(x)/( 1+\exp(x))$ and $$g(t,x,z) = x^T A(t) + (diag(t^p) z)^T \beta$$

The "logistic2" is $$P_1(t,x,z) = x^T A(t) exp((diag(t^p) z)^T \beta)/ (1+ x^T A(t) exp((diag(t^p) z)^T \beta))$$

4) the relative cumulative incidence function "rcif" model where $h(x)=\exp(x)$ and $$g(t,x,z) = x^T A(t) + (diag(t^p) z)^T \beta$$

The "rcif2" $$P_1(t,x,z) = (x^T A(t)) \exp((diag(t^p) z)^T \beta)$$

Where p by default is 1 for the additive model and 0 for the other models. In general p may be powers of the same length as z.

Usage

comp.risk(formula,data=sys.parent(),cause,times=NULL,Nit=50,
clusters=NULL,est=NULL,fix.gamma=0,gamma=0,n.sim=500,weighted=0,model="fg",
detail=0,interval=0.01,resample.iid=1,
cens.model="KM",cens.formula=NULL,
time.pow=NULL,time.pow.test=NULL,silent=1,conv=1e-6,
weights=NULL,max.clust=1000,n.times=50,first.time.p=0.05,
estimator=1,trunc.p=NULL,cens.weight=NULL,admin.cens=NULL,conservative=1)

Arguments

formula

a formula object, with the response on the left of a '~' operator, and the terms on the right. The response must be a survival object as returned by the `Event' function. The status indicator is not important here. Time-invariant regressors are spec

data

a data.frame with the variables.

cause

For competing risk models specificies which cause we consider.

times

specifies the times at which the estimator is considered. Defaults to all the times where an event of interest occurs, with the first 10 percent or max 20 jump points removed for numerical stability in simulations.

Nit

number of iterations for Newton-Raphson algorithm.

clusters

specifies cluster structure, for backwards compability.

est

possible starting value for nonparametric component of model.

fix.gamma

to keep gamma fixed, possibly at 0.

gamma

starting value for constant effects.

n.sim

number of simulations in resampling.

weighted

Not implemented. To compute a variance weighted version of the test-processes used for testing time-varying effects.

model

"additive", "prop"ortional, "rcif", or "logistic".

detail

if 0 no details are printed during iterations, if 1 details are given.

interval

specifies that we only consider timepoints where the Kaplan-Meier of the censoring distribution is larger than this value.

resample.iid

to return the iid decomposition, that can be used to construct confidence bands for predictions

cens.model

specified which model to use for the ICPW, KM is Kaplan-Meier alternatively it may be "cox"

cens.formula

specifies the regression terms used for the regression model for chosen regression model. When cens.model is specified, the default is to use the same design as specified for the competing risks model.

time.pow

specifies that the power at which the time-arguments is transformed, for each of the arguments of the const() terms, default is 1 for the additive model and 0 for the proportional model.

time.pow.test

specifies that the power the time-arguments is transformed for each of the arguments of the non-const() terms. This is relevant for testing if a coefficient function is consistent with the specified form A_l(t)=beta_l t^time.pow.test(l). Default is 1 f

silent

if 0 information on convergence problems due to non-invertible derviates of scores are printed.

conv

gives convergence criterie in terms of sum of absolute change of parameters of model

weights

weights for estimating equations.

max.clust

sets the total number of i.i.d. terms in i.i.d. decompostition. This can limit the amount of memory used by coarsening the clusters. When NULL then all clusters are used. Default is 1000 to save memory and time.

first.time.p

first point for estimation is pth percentile of cause jump times.

n.times

only uses 50 points for estimation, if NULL then uses all points, subject to p.start condition.

estimator

default estimator is 1.

trunc.p

truncation weight for delayed entry, P(T > entry.time | Z_i), typically Cox model.

cens.weight

censoring weights can be given here rather than calculated using the KM, cox or aalen models.

admin.cens

censoring times for the administrative censoring

conservative

set to 0 to compute correct variances based on censoring weights, default is conservative estimates that are much quicker.

Value

returns an object of type 'comprisk'. With the following arguments:
cumcumulative timevarying regression coefficient estimates are computed within the estimation interval.
var.cumpointwise variances estimates.
gammaestimate of proportional odds parameters of model.
var.gammavariance for gamma.
scoresum of absolute value of scores.
gamma2estimate of constant effects based on the non-parametric estimate. Used for testing of constant effects.
obs.testBeq0observed absolute value of supremum of cumulative components scaled with the variance.
pval.testBeq0p-value for covariate effects based on supremum test.
obs.testBeqCobserved absolute value of supremum of difference between observed cumulative process and estimate under null of constant effect.
pval.testBeqCp-value based on resampling.
obs.testBeqC.isobserved integrated squared differences between observed cumulative and estimate under null of constant effect.
pval.testBeqC.isp-value based on resampling.
conf.bandresampling based constant to construct 95% uniform confidence bands.
B.iidlist of iid decomposition of non-parametric effects.
gamma.iidmatrix of iid decomposition of parametric effects.
test.procBeqCobserved test process for testing of time-varying effects
sim.test.procBeqC50 resample processes for for testing of time-varying effects
convinformation on convergence for time points used for estimation.

Details

Since timereg version 1.8.4. the response must be specified with the Event function instead of the Surv function and the arguments. For example, if the old code was

comp.risk(Surv(time,cause>0)~x1+x2,data=mydata,cause=mydata$cause,causeS=1)

the new code is

comp.risk(Event(time,cause)~x1+x2,data=mydata,cause=1)

Also the argument cens.code is now obsolete since cens.code is an argument of Event.

References

Scheike, Zhang and Gerds (2008), Predicting cumulative incidence probability by direct binomial regression,Biometrika, 95, 205-220.

Scheike and Zhang (2007), Flexible competing risks regression modelling and goodness of fit, LIDA, 14, 464-483.

Martinussen and Scheike (2006), Dynamic regression models for survival data, Springer.

Examples

Run this code

data(bmt); 

clust <- rep(1:204,each=2)
addclust<-comp.risk(Event(time,cause)~platelet+age+tcell+cluster(clust),data=bmt,
cause=1,resample.iid=1,n.sim=100,model="additive")
###

addclust<-comp.risk(Event(time,cause)~+1+cluster(clust),data=bmt,cause=1,
		    resample.iid=1,n.sim=100,model="additive")
pad <- predict(addclust,X=1)
plot(pad)

add<-comp.risk(Event(time,cause)~platelet+age+tcell,data=bmt,
cause=1,resample.iid=1,n.sim=100,model="additive")
summary(add)

par(mfrow=c(2,4))
plot(add); 
### plot(add,score=1) ### to plot score functions for test

ndata<-data.frame(platelet=c(1,0,0),age=c(0,1,0),tcell=c(0,0,1))
par(mfrow=c(2,3))
out<-predict(add,ndata,uniform=1,n.sim=100)
par(mfrow=c(2,2))
plot(out,multiple=0,uniform=1,col=1:3,lty=1,se=1)

add<-comp.risk(Event(time,cause)~platelet+age+tcell,data=bmt,
	       cause=1,resample.iid=0,n.sim=0,cens.model="cox",
	       cens.formula=~factor(platelet),model="additive")

out<-predict(add,ndata,se=0,uniform=0)
par(mfrow=c(2,2))
plot(out,multiple=0,se=0,uniform=0,col=1:3,lty=1)

## fits additive model with some constant effects 
add.sem<-comp.risk(Event(time,cause)~
const(platelet)+const(age)+const(tcell),data=bmt,
cause=1,resample.iid=1,n.sim=100,model="additive")
summary(add.sem)

out<-predict(add.sem,ndata,uniform=1,n.sim=100)
par(mfrow=c(2,2))
plot(out,multiple=0,uniform=1,col=1:3,lty=1,se=0)

## Fine & Gray model 
fg<-comp.risk(Event(time,cause)~
const(platelet)+const(age)+const(tcell),data=bmt,
cause=1,resample.iid=1,model="fg",n.sim=100)
summary(fg)

out<-predict(fg,ndata,uniform=1,n.sim=100)

par(mfrow=c(2,2))
plot(out,multiple=1,uniform=0,col=1:3,lty=1,se=0)

## extended model with time-varying effects
fg.npar<-comp.risk(Event(time,cause)~platelet+age+const(tcell),
data=bmt,cause=1,resample.iid=1,model="prop",n.sim=100)
summary(fg.npar); 

out<-predict(fg.npar,ndata,uniform=1,n.sim=100)
head(out$P1[,1:5]); head(out$se.P1[,1:5])

par(mfrow=c(2,2))
plot(out,multiple=1,uniform=0,col=1:3,lty=1,se=0)

## Fine & Gray model with alternative parametrization for baseline
fg2<-comp.risk(Event(time,cause)~const(platelet)+const(age)+const(tcell),data=bmt,
cause=1,resample.iid=1,model="prop",n.sim=100)
summary(fg2)

Run the code above in your browser using DataLab