casebase (version 0.1.0)

absoluteRisk: Compute absolute risks using the fitted hazard function.

Description

Using the output of the function fitSmoothHazard, we can compute absolute risks by integrating the fitted hazard function over a time period and then converting this to an estimated survival for each individual.
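
For a single event type, the relationship described above can be written out as follows. The notation is an assumption introduced here for illustration and is not taken from the package: h(u | x) denotes the fitted hazard at time u for covariate profile x, S is the survival function, and AR is the absolute risk.

```latex
\mathrm{AR}(t \mid x) \;=\; 1 - S(t \mid x) \;=\; 1 - \exp\!\left( -\int_0^t h(u \mid x)\, du \right)
```

With competing risks the computation involves the hazards of all event types, which this sketch does not cover.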

Usage

absoluteRisk(object, ...)

# S3 method for default
absoluteRisk(object, ...)

# S3 method for glm
absoluteRisk(object, time, newdata, method = c("montecarlo", "numerical"), nsamp = 1000, ...)

# S3 method for CompRisk
absoluteRisk(object, time, newdata, method = c("montecarlo", "numerical"), nsamp = 1000, ...)

Arguments

object

Output of function fitSmoothHazard.

...

Extra parameters. Currently these are simply ignored.

time

A vector of time points at which we should compute the absolute risks.

newdata

Optionally, a data frame in which to look for variables with which to predict. If omitted, the mean absolute risk is returned.

method

Method used for integration. Defaults to "montecarlo", which implements Monte Carlo integration. The only other option is "numerical", which simply calls the function integrate.

nsamp

Maximal number of subdivisions (if method = "numerical") or number of sampled points (if method = "montecarlo").
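
To make the roles of method and nsamp concrete, here is a minimal base-R sketch that integrates a hazard function both ways. The hazard function is hypothetical and chosen so that the exact integral over [0, 10] is 0.5. The Monte Carlo step uses plain uniform sampling, which is an assumption and may differ from what the package does internally.

```r
set.seed(2024)  # arbitrary seed, for reproducibility only

# Hypothetical hazard function with a known integral;
# it is not a hazard fitted by fitSmoothHazard
hazard <- function(t) 0.01 * t

upper <- 10     # integrate the hazard over [0, upper]
nsamp <- 1000   # subdivisions (numerical) or sampled points (Monte Carlo)

# Numerical integration with stats::integrate
cumhaz_num <- integrate(hazard, lower = 0, upper = upper,
                        subdivisions = nsamp)$value

# Monte Carlo integration: average the hazard at uniformly sampled
# points, then scale by the length of the interval
u <- runif(nsamp, min = 0, max = upper)
cumhaz_mc <- upper * mean(hazard(u))

print(c(numerical = cumhaz_num, montecarlo = cumhaz_mc))
```

Both estimates should be close to 0.5. The Monte Carlo estimate varies with the seed, and its error shrinks as nsamp grows.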

Value

Returns the estimated absolute risk for the user-supplied covariate profiles. This will be stored in a 2- or 3-dimensional array, depending on the input. See details.

Details

If the user supplies the original dataset through the parameter newdata, the mean absolute risk can be computed as the average of the output vector.

In general, if time is a vector of length greater than one, the output will include a column corresponding to the provided time points. The time vector is modified before the computation: time = 0 is added, the time points are sorted, and duplicates are removed. These modifications simplify the computations and give an output that can easily be used to plot risk curves.

On the other hand, if time corresponds to a single time point, the output does not include a column corresponding to time.
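
The changes made to a time vector of length greater than one can be reproduced in base R. This is only a sketch of the described behaviour, not the package's own code, and the input values are hypothetical.

```r
# Hypothetical user-supplied time points: unsorted, with a duplicate
time <- c(10, 5, 10)

# Add time = 0, drop duplicates, then sort in increasing order
time_grid <- sort(unique(c(0, time)))

print(time_grid)
```

Here time_grid contains 0, 5 and 10, so a risk curve built on this grid starts at time zero.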

If there is no competing risk, the output is a matrix where each column corresponds to one of the covariate profiles, and where each row corresponds to a time point. If there are competing risks, the output will be a 3-dimensional array, with the third dimension corresponding to the different events.

The numerical method should be good enough in most situations, but Monte Carlo integration can give more accurate results when the estimated hazard function is not smooth (e.g. when modeling with time-varying covariates). However, if there are competing risks, we strongly encourage the user to select Monte Carlo integration, which is much faster than the numerical method. (This is due to the current implementation of the numerical method, and it may be improved in future versions.)

Examples

# NOT RUN {
# Simulate censored survival data for two outcome types from exponential distributions
library(casebase)
library(data.table)
set.seed(12345)
nobs <- 1000
tlim <- 20

# simulation parameters
b1 <- 200
b2 <- 50

# event type 0-censored, 1-event of interest, 2-competing event
# time: observed time/endpoint
# z is a binary covariate
DT <- data.table(z=rbinom(nobs, 1, 0.5))
DT[,`:=` ("t_event" = rweibull(nobs, 1, b1),
          "t_comp" = rweibull(nobs, 1, b2))]
DT[,`:=`("event" = 1 * (t_event < t_comp) + 2 * (t_event >= t_comp),
         "time" = pmin(t_event, t_comp))]
DT[time >= tlim, `:=`("event" = 0, "time" = tlim)]

out_linear <- fitSmoothHazard(event ~ time + z, DT)
out_log <- fitSmoothHazard(event ~ log(time) + z, DT)

linear_risk <- absoluteRisk(out_linear, time = 10, newdata = data.table("z"=c(0,1)))
log_risk <- absoluteRisk(out_log, time = 10, newdata = data.table("z"=c(0,1)))
# }
