predictRisk: Extracting predicted risks from regression models

Description

Extract event probabilities from fitted regression models and machine learning objects. The function predictRisk is a generic function, meaning that it invokes specifically designed functions depending on the 'class' of the first argument. See predictRisk.

Usage

predictRisk(object, newdata, ...)
# S3 method for default
predictRisk(object, newdata, times, cause, ...)
# S3 method for double
predictRisk(object, newdata, times, cause, ...)
# S3 method for integer
predictRisk(object, newdata, times, cause, ...)
# S3 method for factor
predictRisk(object, newdata, times, cause, ...)
# S3 method for numeric
predictRisk(object, newdata, times, cause, ...)
# S3 method for glm
predictRisk(object, newdata, iid = FALSE, average.iid = FALSE, ...)
# S3 method for multinom
predictRisk(
  object,
  newdata,
  iid = FALSE,
  average.iid = FALSE,
  cause = NULL,
  ...
)
# S3 method for formula
predictRisk(object, newdata, ...)
# S3 method for BinaryTree
predictRisk(object, newdata, ...)
# S3 method for lrm
predictRisk(object, newdata, ...)
# S3 method for rpart
predictRisk(object, newdata, ...)
# S3 method for randomForest
predictRisk(object, newdata, ...)
# S3 method for matrix
predictRisk(object, newdata, times, cause, ...)
# S3 method for aalen
predictRisk(object, newdata, times, ...)
# S3 method for cox.aalen
predictRisk(object, newdata, times, ...)
# S3 method for comprisk
predictRisk(object, newdata, times, ...)
# S3 method for coxph
predictRisk(
  object,
  newdata,
  times,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)
# S3 method for coxphTD
predictRisk(object, newdata, times, landmark, ...)
# S3 method for CSCTD
predictRisk(object, newdata, times, cause, landmark, ...)
# S3 method for coxph.penal
predictRisk(object, newdata, times, ...)
# S3 method for cph
predictRisk(
  object,
  newdata,
  times,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)
# S3 method for selectCox
predictRisk(object, newdata, times, ...)
# S3 method for prodlim
predictRisk(object, newdata, times, cause, ...)
# S3 method for survfit
predictRisk(object, newdata, times, ...)
# S3 method for psm
predictRisk(object, newdata, times, ...)
# S3 method for ranger
predictRisk(object, newdata, times, cause, ...)
# S3 method for rfsrc
predictRisk(object, newdata, times, cause, ...)
# S3 method for FGR
predictRisk(object, newdata, times, cause, ...)
# S3 method for riskRegression
predictRisk(object, newdata, times, cause, ...)
# S3 method for ARR
predictRisk(object, newdata, times, cause, ...)
# S3 method for CauseSpecificCox
predictRisk(
  object,
  newdata,
  times,
  cause,
  product.limit = TRUE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  truncate = FALSE,
  ...
)
# S3 method for penfitS3
predictRisk(object, newdata, times, ...)
# S3 method for SuperPredictor
predictRisk(object, newdata, ...)
# S3 method for gbm
predictRisk(object, newdata, times, ...)
# S3 method for flexsurvreg
predictRisk(object, newdata, times, ...)
# S3 method for Hal9001
predictRisk(object, newdata, times, cause, ...)
# S3 method for GLMnet
predictRisk(object, newdata, times = NA, ...)
# S3 method for singleEventCB
predictRisk(object, newdata, times, cause, ...)
# S3 method for CoxConfidential
predictRisk(object, newdata, ...)
# S3 method for wglm
predictRisk(
  object,
  newdata,
  times = NULL,
  product.limit = FALSE,
  diag = FALSE,
  iid = FALSE,
  average.iid = FALSE,
  ...
)

Value

For a binary outcome, a vector of predicted risks. For a survival outcome, with or without competing risks, a matrix with as many rows as NROW(newdata) and as many columns as length(times). Each entry is a probability, and within each row the values should be non-decreasing as the time points increase.
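The shape described above can be checked directly. The following is a minimal sketch, assuming the riskRegression and survival packages are installed; the object names (learndat, valdat, horizons, risk) and the chosen time points are illustrative only.

```r
library(riskRegression)
library(survival)

set.seed(10)
learndat <- sampleData(80, outcome = "survival")
valdat <- sampleData(10, outcome = "survival")
fit <- coxph(Surv(time, event) ~ X6 + X2, data = learndat, x = TRUE, y = TRUE)

# illustrative prediction horizons
horizons <- c(1, 2, 4)
risk <- predictRisk(fit, newdata = valdat, times = horizons)

# one row per subject in newdata, one column per time point
dim(risk)

# within each row the risk should not decrease as the horizon grows
nondecreasing <- all(apply(risk, 1, function(r) all(diff(r) >= 0)))
nondecreasing
```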

Arguments

object: A fitted model from which to extract predicted event probabilities.
newdata: A data frame containing predictor variable combinations for which to compute predicted event probabilities.
...: Additional arguments that are passed on to the current method.
times: A vector of times in the range of the response variable, for which the cumulative incidences (event probabilities) are computed.
cause: Identifies the cause of interest among the competing events.
iid: Should the iid decomposition be output using an attribute?
average.iid: Should the average iid decomposition be output using an attribute?
product.limit: If TRUE the survival is computed using the product limit estimator. Otherwise the exponential approximation is used (i.e. exp(-cumulative hazard)).
diag: When FALSE the hazard/cumulative hazard/survival is computed for all observations at all times; otherwise it is computed only for the i-th observation at the i-th time.
landmark: The starting time for the computation of the cumulative risk.
truncate: If TRUE truncates the predicted risks to be in the range [0, 1]. For now only implemented for the Cause Specific Cox model.
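
As an illustration of the product.limit argument, the sketch below fits one Cox model and predicts risks twice, once with the exponential approximation and once with the product limit estimator. It assumes the riskRegression and survival packages are installed and that the installed version supports product.limit = TRUE for coxph objects, as the usage above indicates. The object names are illustrative.

```r
library(riskRegression)
library(survival)

set.seed(10)
learndat <- sampleData(80, outcome = "survival")
valdat <- sampleData(10, outcome = "survival")
fit <- coxph(Surv(time, event) ~ X6 + X2, data = learndat, x = TRUE, y = TRUE)
horizons <- c(1, 2, 4)

# exponential approximation: 1 - exp(-cumulative hazard)
risk_exp <- predictRisk(fit, newdata = valdat, times = horizons,
                        product.limit = FALSE)
# product limit estimator
risk_pl <- predictRisk(fit, newdata = valdat, times = horizons,
                       product.limit = TRUE)

# the two versions have the same shape; inspect how far they differ
max(abs(risk_exp - risk_pl))
```

The two estimators are expected to give similar values; how large the difference is depends on the data.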

Author

Thomas A. Gerds tag@biostat.ku.dk

Details

In uncensored binary outcome data there is no need to choose a time point.
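
For instance, a logistic regression fitted with glm can be passed to predictRisk without a times argument. This is a minimal sketch, assuming the riskRegression package is installed; the variables Y, X1 and X8 come from sampleData, and the object names are illustrative.

```r
library(riskRegression)

set.seed(7)
d <- sampleData(80, outcome = "binary")
nd <- sampleData(20, outcome = "binary")

# logistic regression for the binary outcome Y
fit <- glm(Y ~ X1 + X8, data = d, family = binomial)

# no time point is needed: one predicted risk per row of newdata
risk <- predictRisk(fit, newdata = nd)
length(risk)
range(risk)
```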

When operating on models for survival analysis (without competing risks) the function still predicts the risk, computed as 1 - S(t|X), where S(t|X) is the survival probability at time t of a subject characterized by X.
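
The relation between risk and survival can be illustrated by comparing predictRisk with the survival probabilities returned by predictCox from the same package. This is a sketch under the assumption that both functions use their default settings (exponential approximation) for the same fitted coxph object; the object names are illustrative.

```r
library(riskRegression)
library(survival)

set.seed(10)
learndat <- sampleData(80, outcome = "survival")
valdat <- sampleData(10, outcome = "survival")
fit <- coxph(Surv(time, event) ~ X6 + X2, data = learndat, x = TRUE, y = TRUE)
horizons <- c(1, 2, 4)

# predicted survival probabilities S(t|X)
surv <- predictCox(fit, newdata = valdat, times = horizons,
                   type = "survival")$survival
# predicted risks
risk <- predictRisk(fit, newdata = valdat, times = horizons)

# the risk should agree with 1 - S(t|X) up to numerical error
max_diff <- max(abs(risk - (1 - surv)))
max_diff
```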

When there are competing risks (and the data are right censored) one needs to specify both the time horizon for prediction (which can be a vector) and the cause of the event. The function then extracts the absolute risk F_c(t|X), also known as the cumulative incidence of an event of type/cause c until time t, for a subject characterized by X. Depending on the model, it may or may not be possible to predict the risk of all causes in a competing risks setting. For example, a cause-specific Cox (CSC) object can be used to predict the risk of either cause, whereas a Fine-Gray regression model (FGR) is specific to one of the causes.
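
The difference between the two model types can be sketched as follows. The example assumes that riskRegression, survival and prodlim are installed, and that cmprsk is available for FGR. The object names and the time horizons are illustrative.

```r
library(riskRegression)
library(survival)
library(prodlim)

set.seed(8)
train <- sampleData(80)   # default outcome: competing risks
test <- sampleData(10)
horizons <- c(2, 5)

# a cause-specific Cox object models all causes, so the absolute risk
# of either cause can be requested from the same fit
csc <- CSC(Hist(time, event) ~ X1 + X6, data = train)
risk1 <- predictRisk(csc, newdata = test, times = horizons, cause = 1)
risk2 <- predictRisk(csc, newdata = test, times = horizons, cause = 2)

# a Fine-Gray model is fitted for one cause only (here cause 1)
fgr <- FGR(Hist(time, event) ~ X1 + X6, data = train, cause = 1)
risk1_fgr <- predictRisk(fgr, newdata = test, times = horizons, cause = 1)

dim(risk1)
dim(risk2)
dim(risk1_fgr)
```

To predict the risk of cause 2 with a Fine-Gray model, a separate FGR object fitted with cause = 2 would be needed.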

Examples

## binary outcome
library(riskRegression) # provides sampleData() and predictRisk()
library(rms)
set.seed(7)
d <- sampleData(80,outcome="binary")
nd <- sampleData(80,outcome="binary")
fit <- lrm(Y~X1+X8,data=d)
predictRisk(fit,newdata=nd)

## survival outcome
# generate survival data
library(prodlim)
set.seed(100)
d <- sampleData(100,outcome="survival")
d[,X1:=as.numeric(as.character(X1))]
d[,X2:=as.numeric(as.character(X2))]
# then fit a Cox model
library(rms)
cphmodel <- cph(Surv(time,event)~X1+X2,data=d,surv=TRUE,x=TRUE,y=TRUE)
# or via survival
library(survival)
coxphmodel <- coxph(Surv(time,event)~X1+X2,data=d,x=TRUE,y=TRUE)

# Extract predicted risks (1 - survival probability)
# at selected time-points:
ttt <- quantile(d$time)
# for selected predictor values:
ndat <- data.frame(X1=c(0.25,0.25,-0.05,0.05),X2=c(0,1,0,1))
# as follows
predictRisk(cphmodel,newdata=ndat,times=ttt)
predictRisk(coxphmodel,newdata=ndat,times=ttt)

## simulate learning and validation data
set.seed(10)
learndat <- sampleData(80,outcome="survival")
valdat <- sampleData(10,outcome="survival")
## use the learning data to fit a Cox model
library(survival)
fitCox <- coxph(Surv(time,event)~X6+X2,data=learndat,x=TRUE,y=TRUE)
## suppose we want to predict the event probabilities (1-survival) for all subjects
## in the validation data at the following time points:
## 0, 1, 2, 3, 4
psurv <- predictRisk(fitCox,newdata=valdat,times=seq(0,4,1))
## This is a matrix with event probabilities (1-survival)
## one column for each of the 5 time points
## one row for each validation set individual

## competing risks
library(survival)
library(riskRegression)
library(prodlim)
set.seed(8)
train <- sampleData(80)
test <- sampleData(10)
cox.fit  <- CSC(Hist(time,event)~X1+X6,data=train,cause=1)
predictRisk(cox.fit,newdata=test,times=1:10,cause=1)

## with strata
# the status variable in sampleData is called 'event'
cox.fit2  <- CSC(list(Hist(time,event)~strata(X1)+X6,
                      Hist(time,event)~X1+X6),data=train)
predictRisk(cox.fit2,newdata=test,times=1:10,cause=1)
