# cox.pht

0th

Percentile

##### Additive Cox proportional hazard models with time varying covariates

The cox.ph family only allows one set of covariate values per subject. If each subject has several time varying covariate measurements then it is still possible to fit a proportional hazards regression model, via an equivalent Poisson model. The recipe is provided by Whitehead (1980) and is equally valid in the smooth additive case. Its drawback is that the equivalent Poisson dataset can be quite large.

The trick is to generate an artificial Poisson observation for each subject in the risk set at each non-censored event time. The corresponding covariate values for each subject are whatever they are at the event time, while the Poisson response is zero for all subjects except those experiencing the event at that time (this corresponds to Peto's correction for ties). The linear predictor for the model must include an intercept for each event time (the cumulative sum of the exponential of these is the Breslow estimate of the baseline hazard).

Below is some example code employing this trick for the pbcseq data from the survival package. It uses bam for fitting with the discrete=TRUE option for efficiency: there is some approximation involved in doing this, and the exact equivalent to what is done in cox.ph is rather obtained by using gam with method="REML" (taking some 14 times the computational time for the example below).

The function tdpois in the example code uses crude piecewise constant interpolation for the covariates, in which the covariate value at an event time is taken to be whatever it was the previous time that it was measured. Obviously more sophisticated interpolation schemes might be preferable.

Keywords
models, regression
##### References

Whitehead (1980) Fitting Cox's regression model to survival data using GLIM. Applied Statistics 29(3):268-275

• cox.pht
##### Examples
# NOT RUN {
require(mgcv);require(survival)

## First define functions for producing Poisson model data frame

app <- function(x,t,to) {
## wrapper to approx for calling from apply...
y <- if (sum(!is.na(x))<1) rep(NA,length(to)) else
approx(t,x,to,method="constant",rule=2)$y if (is.factor(x)) factor(levels(x)[y],levels=levels(x)) else y } ## app tdpois <- function(dat,event="z",et="futime",t="day",status="status1", id="id") { ## dat is data frame. id is patient id; et is event time; t is ## observation time; status is 1 for death 0 otherwise; ## event is name for Poisson response. if (event %in% names(dat)) warning("event name in use") require(utils) ## for progress bar te <- sort(unique(dat[[et]][dat[[status]]==1])) ## event times sid <- unique(dat[[id]]) prg <- txtProgressBar(min = 0, max = length(sid), initial = 0, char = "=",width = NA, title="Progress", style = 3) ## create dataframe for poisson model data dat[[event]] <- 0; start <- 1 dap <- dat[rep(1:length(sid),length(te)),] for (i in 1:length(sid)) { ## work through patients di <- dat[dat[[id]]==sid[i],] ## ith patient's data tr <- te[te <= di[[et]][1]] ## times required for this patient ## Now do the interpolation of covariates to event times... um <- data.frame(lapply(X=di,FUN=app,t=di[[t]],to=tr)) ## Mark the actual event... if (um[[et]][1]==max(tr)&&um[[status]]==1) um[[event]][nrow(um)] <- 1 um[[et]] <- tr ## reset time to relevant event times dap[start:(start-1+nrow(um)),] <- um ## copy to dap start <- start + nrow(um) setTxtProgressBar(prg, i) } close(prg) dap[1:(start-1),] } ## tdpois ## The following takes too long for CRAN checking ## (but typically takes a minute or less). # } # NOT RUN { ## Convert pbcseq to equivalent Poisson form... pbcseq$status1 <- as.numeric(pbcseq$status==2) ## death indicator pb <- tdpois(pbcseq) ## conversion pb$tf <- factor(pb$futime) ## add factor for event time ## Fit Poisson model... b <- bam(z ~ tf - 1 + sex + trt + s(sqrt(protime)) + s(platelet)+ s(age)+ s(bili)+s(albumin), family=poisson,data=pb,discrete=TRUE,nthreads=2) par(mfrow=c(2,3)) plot(b,scale=0) ## compute residuals... chaz <- tapply(fitted(b),pb$id,sum) ## cum haz by subject
d <- tapply(pb$z,pb$id,sum) ## censoring indicator
mrsd <- d - chaz ## Martingale
drsd <- sign(mrsd)*sqrt(-2*(mrsd + d*log(chaz))) ## deviance

## plot survivor function and s.e. band for subject 25
te <- sort(unique(pb$futime)) ## event times di <- pbcseq[pbcseq$id==25,] ## data for subject 25
pd <- data.frame(lapply(X=di,FUN=app,t=di$day,to=te)) ## interpolate to te pd$tf <- factor(te)
X <- predict(b,newdata=pd,type="lpmatrix")
eta <- drop(X%*%coef(b)); H <- cumsum(exp(eta))
J <- apply(exp(eta)*X,2,cumsum)
se <- diag(J%*%vcov(b)%*%t(J))^.5
plot(stepfun(te,c(1,exp(-H))),do.points=FALSE,ylim=c(0.7,1),
ylab="S(t)",xlab="t (days)",main="",lwd=2)
lines(stepfun(te,c(1,exp(-H+se))),do.points=FALSE)
lines(stepfun(te,c(1,exp(-H-se))),do.points=FALSE)
rug(pbcseq$day[pbcseq$id==25]) ## measurement times
# }

Documentation reproduced from package mgcv, version 1.8-23, License: GPL (>= 2)

### Community examples

Looks like there are no examples yet.