This function computes an estimate of the cause-specific hazard functions over all times using either glm or SuperLearner. The structure of the function is specific to how it is called within hazard_tmle. In particular, dataList must have a very specific structure for this function to run properly. The list should consist of data.frame objects. The first has, for each observation, a number of rows equal to the ftime corresponding to that observation. Each subsequent entry has t0 rows per observation and sets the trt column equal to one value of trtOfInterest in turn.

The function uses the first entry in dataList to iteratively fit hazard regression models for each cause of failure. Thus, this data.frame needs to have a column called Nj for each value of j in J. The first fit estimates the hazard of min(J), while subsequent fits estimate the pseudo-hazard of all other values of j, where pseudo-hazard means the probability of a failure due to type j at a particular timepoint given no failure of any type at any previous timepoint and no failure due to any type k < j at that same timepoint. The hazard estimates of causes j' < j can then be used to map this pseudo-hazard back into the hazard at a particular time. This is nothing more than the re-framing of a conditional multinomial probability as a series of conditional binomial probabilities. This structure ensures that no stratum has estimated hazards that sum to more than one over all possible causes of failure at a particular timepoint.
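As a rough illustration of the dataList layout described above, the following base R sketch builds such a list from toy data. It is not the package's own construction code; the column names id and t, the list names, and the toy values are assumptions made only for this example.

```r
# Toy wide-format data: 3 observations, 2 failure types (ftype 0 = censored).
# All values and the `id`/`t` column names are hypothetical.
wide <- data.frame(id = 1:3, ftime = c(2, 3, 1), ftype = c(1, 0, 2),
                   trt = c(1, 0, 1), W1 = c(0.5, -1.2, 0.3))
J <- c(1, 2)
t0 <- 3
trtOfInterest <- c(0, 1)

# First entry: each observation contributes ftime rows (one per timepoint).
long_obs <- wide[rep(seq_len(nrow(wide)), times = wide$ftime), ]
long_obs$t <- sequence(wide$ftime)
for (j in J) {
  # Nj is 1 only on the row where the observation fails from cause j.
  long_obs[[paste0("N", j)]] <-
    as.numeric(long_obs$t == long_obs$ftime & long_obs$ftype == j)
}

# Subsequent entries: t0 rows per observation, trt set to each value in turn.
long_trt <- lapply(trtOfInterest, function(a) {
  d <- wide[rep(seq_len(nrow(wide)), each = t0), ]
  d$t <- rep(seq_len(t0), times = nrow(wide))
  d$trt <- a
  d
})

dataList <- c(list(obs = long_obs), setNames(long_trt, trtOfInterest))
sapply(dataList, nrow)
```

The first entry has sum(ftime) rows, while every later entry has t0 rows per observation, which matches the shape the description asks for.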
estimateHazards(dataList, J, adjustVars, SL.ftime = NULL,
glm.ftime = NULL, glm.family, returnModels, bounds, verbose, ...)
dataList: A list of data.frame objects.
J: Numeric vector indicating the labels of all causes of failure.
adjustVars: Object of class data.frame that contains the variables to adjust for in the regression.
SL.ftime: A character vector or list specification to be passed to the SL.library argument in the call to SuperLearner for the outcome regression (either cause-specific hazards or conditional mean). See ?SuperLearner for more information on how to specify valid SuperLearner libraries. The wrappers used in the library are expected to work with the input variables, which will be named "trt" and names(adjustVars).
glm.ftime: A character specification of the right-hand side of the equation passed to the formula option of a call to glm for the outcome regression (either using cause-specific hazards or conditional mean). Ignored if SL.ftime is not NULL. Use "trt" to specify the treatment in this formula (see examples). The formula can additionally include any variables found in names(adjustVars).
glm.family: The type of regression to be performed if fitting GLMs in the estimation and fluctuation procedures. The default is "binomial" for logistic regression. Change this from the default only with a well-understood justification. This is inherited from the calling function (either mean_tmle or hazard_tmle).
returnModels: A boolean indicating whether to return the SuperLearner or glm objects used to estimate the nuisance parameters. Must be set to TRUE if the user plans to use calls to timepoints to obtain estimates at times other than t0. See ?timepoints for more information.
bounds: A list of bounds... TODO: Add more description here.
verbose: A boolean indicating whether the function should print messages to indicate progress.
...: Other arguments. Not currently used.
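To make the roles of J, glm.ftime and glm.family more concrete, here is a minimal sketch of an iterative fit over causes using plain glm. It is an illustration under stated assumptions rather than the function's actual internals: the simulated data frame d stands in for the first entry of dataList, and the names at_risk, fits, pseudo1 and pseudo2 are hypothetical.

```r
set.seed(1)
n <- 200
# Toy person-time rows with a treatment, one covariate, and indicators N1, N2.
d <- data.frame(trt = rbinom(n, 1, 0.5), W1 = rnorm(n))
d$N1 <- rbinom(n, 1, plogis(-1 + 0.5 * d$trt))
d$N2 <- ifelse(d$N1 == 1, 0, rbinom(n, 1, plogis(-1.5 + 0.3 * d$W1)))

J <- c(1, 2)
glm.ftime <- "trt + W1"      # right-hand side only, as the argument expects
glm.family <- "binomial"

fits <- list()
at_risk <- rep(TRUE, n)      # rows with no failure from any cause k < j
for (j in J) {
  # Build the full formula "Nj ~ <right-hand side>" for this cause.
  form <- as.formula(paste0("N", j, " ~ ", glm.ftime))
  fit <- glm(form, data = d[at_risk, ], family = glm.family)
  fits[[paste0("J", j)]] <- fit
  # Predicted (pseudo-)hazard for every row, including rows not used to fit.
  d[[paste0("pseudo", j)]] <- predict(fit, newdata = d, type = "response")
  # Rows that failed from cause j are excluded from later fits.
  at_risk <- at_risk & d[[paste0("N", j)]] == 0
}
```

For the smallest cause label the fitted value is the hazard itself; for later causes it is a pseudo-hazard, because the fit conditions on no failure from earlier causes at the same timepoint.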
The function returns a list that is exactly the same as the input dataList, but with additional columns corresponding to the hazard, the pseudo-hazard, and the total hazard summed over all causes k < j.
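The three returned quantities are linked by a simple identity: the hazard for cause j equals its pseudo-hazard multiplied by one minus the total hazard over causes k < j. The sketch below applies that identity at a single timepoint; the function name pseudo_to_hazard and the input values are hypothetical and are not part of the package.

```r
# Map pseudo-hazards (ordered by cause label) back to cause-specific hazards
# at one timepoint. For the first cause the pseudo-hazard is the hazard.
pseudo_to_hazard <- function(pseudo_haz) {
  haz <- numeric(length(pseudo_haz))
  haz_less_than_j <- 0   # total hazard summed over causes k < j
  for (j in seq_along(pseudo_haz)) {
    haz[j] <- pseudo_haz[j] * (1 - haz_less_than_j)
    haz_less_than_j <- haz_less_than_j + haz[j]
  }
  haz
}

haz <- pseudo_to_hazard(c(0.2, 0.5, 0.9))
haz
sum(haz)
```

Because each pseudo-hazard lies between 0 and 1 and only scales the probability mass not yet assigned to earlier causes, the resulting hazards cannot sum to more than one.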