Computes an estimate of the censoring hazard using either glm or
SuperLearner, based on log-likelihood loss, and then computes the
censoring survival distribution from these estimates. The structure
of the function is specific to how it is called within survtmle. In
particular, dataList must have a very specific structure for this
function to run properly. The list should consist of data.frame
objects. In the first, the number of rows for each observation equals
the ftime corresponding to that observation. The subsequent entries
have t0 rows for each observation and set the trt column
equal to each value of trtOfInterest in turn. One column of these
data.frame objects must be named C and hold a counting process for the
right-censoring variable. The function fits a regression with C as the
outcome and functions of trt and names(adjustVars), as specified by
glm.ctime or SL.ctime, as predictors.
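The long-format layout and the pooled regression described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the simulated data, the coding of ftype == 0 as censored, and the columns t and haz_C are hypothetical names chosen for the example.

```r
# Simulated toy data (hypothetical; not produced by makeDataList).
set.seed(1)
n <- 200
dat <- data.frame(
  id    = seq_len(n),
  trt   = rbinom(n, 1, 0.5),
  W1    = rnorm(n),
  ftime = sample(1:5, n, replace = TRUE),
  ftype = rbinom(n, 1, 0.7)  # assumption: 0 marks a censored observation
)

# One row per observation per time point, up to that observation's ftime.
long <- dat[rep(seq_len(n), times = dat$ftime), ]
long$t <- sequence(dat$ftime)

# C is a counting process: 1 only in the row where censoring occurs.
long$C <- as.numeric(long$ftype == 0 & long$t == long$ftime)

# Pooled logistic regression of C on trt and the adjustment variable.
fit <- glm(C ~ trt + W1, data = long, family = binomial())

# Estimated censoring hazard for every row.
long$haz_C <- predict(fit, newdata = long, type = "response")
```

Each observation contributes ftime rows, so a single regression pools information across all time points.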
estimateCensoring(dataList, adjustVars, t0, SL.ctime = NULL,
glm.ctime = NULL, glm.family, returnModels = FALSE, verbose = TRUE,
gtol = 0.001, ...)
dataList: A list of data.frame objects as described in
?makeDataList.
adjustVars: An object of class data.frame that contains the
variables to adjust for in the regression.
t0: The timepoint at which survtmle was called to evaluate.
Needed only because the naming convention for the regression
differs depending on whether t == t0 or t != t0.
SL.ctime: A character vector or list specification to be passed to the
SL.library argument in the call to SuperLearner for the
censoring hazard regression.
See ?SuperLearner for more information on how to specify valid
SuperLearner libraries. The wrappers used in the library are expected
to work with the input variables, which will
be called "trt" and names(adjustVars).
glm.ctime: A character specification of the right-hand side of the
equation passed to the formula option of a call to glm
for the censoring hazard regression. Ignored if SL.ctime is not NULL.
Use "trt" to specify the treatment in this formula (see examples).
The formula can additionally include any variables found in
names(adjustVars).
glm.family: The type of regression to be performed if fitting GLMs in
the estimation and fluctuation procedures. The default is "binomial"
for logistic regression. Change this from the default only when there
is a well-understood justification. This is inherited from
the calling function (either mean_tmle or hazard_tmle).
returnModels: A boolean indicating whether to return the
SuperLearner or glm objects used to estimate the
nuisance parameters. Must be set to TRUE if the user plans to
use calls to timepoints to obtain estimates at times other than
t0. See ?timepoints for more information.
verbose: A boolean indicating whether the function should print messages to indicate progress.
gtol: The truncation level applied to the predicted censoring survival to handle positivity violations.
...: Other arguments. Not currently used.
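As an illustration of the glm.ctime argument, the sketch below shows how a right-hand-side string could be turned into a full formula with C as the outcome. The variable W1 and the object name censoring_formula are hypothetical, and the way the package assembles the formula internally is an assumption here.

```r
# A right-hand side that uses the treatment, one adjustment variable
# (W1 is a hypothetical column of adjustVars), and their interaction.
glm.ctime <- "trt + W1 + trt:W1"

# Assumed assembly step: prepend the censoring indicator C as the outcome.
censoring_formula <- as.formula(paste("C ~", glm.ctime))

# Variables the resulting formula refers to.
all.vars(censoring_formula)
```

Passing only the right-hand side keeps the outcome fixed at C, so a user cannot accidentally change what is being modeled.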
The function returns a list that is exactly the same as the input
dataList, but with a column named G_dC added to each data.frame.
This column is the estimated conditional survival distribution for the
censoring variable evaluated at each of the rows of each
data.frame in dataList.
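A minimal sketch of how a G_dC column could be derived from estimated censoring hazards, assuming the usual discrete-time product formula and assuming gtol acts as a lower bound. The column names haz_C and G_trunc and the toy values are hypothetical; the package may index or lag the product differently.

```r
# Toy long-format data: estimated censoring hazards per id and time point.
long <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  t     = c(1, 2, 3, 1, 2),
  haz_C = c(0.10, 0.20, 0.05, 0.30, 0.9995)
)

# Censoring survival as the running product of (1 - hazard) within each id.
long$G_dC <- ave(1 - long$haz_C, long$id, FUN = cumprod)
# id 1: 0.9, 0.72, 0.684; id 2: 0.7, 0.00035

# Assumed truncation step: values below gtol are raised to gtol so that
# inverse weights 1 / G stay bounded.
gtol <- 0.001
long$G_trunc <- pmax(long$G_dC, gtol)
```

In this toy example only the last row falls below gtol, so it is the only value changed by truncation.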