mlr_graphs_survtoregr: Survival to Regression Reduction Pipeline

Description

Wrapper around multiple PipeOps to help create complex survival to regression reduction pipelines. Three reductions are currently implemented; see Details.

Usage

pipeline_survtoregr(
  method = 1,
  regr_learner = lrn("regr.featureless"),
  distrcompose = TRUE,
  distr_estimator = lrn("surv.kaplan"),
  regr_se_learner = NULL,
  surv_learner = lrn("surv.coxph"),
  survregr_params = list(method = "ipcw", estimator = "kaplan", alpha = 1),
  distrcompose_params = list(form = "aft"),
  probregr_params = list(dist = "Normal"),
  learnercv_params = list(resampling.method = "insample"),
  graph_learner = FALSE
)

Arguments

method

integer(1) Reduction method to use; the value corresponds to the three methods listed in Details, in order. Default is 1.

regr_learner

LearnerRegr Regression learner to fit to the transformed TaskRegr. If regr_se_learner is NULL in method 2, then regr_learner must have se predict_type.

distrcompose

logical(1) For methods 1 and 3 if TRUE (default) then PipeOpDistrCompositor is utilised to transform the deterministic predictions to a survival distribution.

distr_estimator

LearnerSurv For methods 1 and 3, if distrcompose = TRUE, specifies the learner used to estimate the baseline hazard. Must have predict_type distr.

regr_se_learner

LearnerRegr For method 2, if regr_learner is not used to predict the se, then a LearnerRegr with se predict_type must be provided.

surv_learner

LearnerSurv For method 3, a LearnerSurv with lp predict type to estimate linear predictors.

survregr_params

list() Parameters passed to PipeOpTaskSurvRegr. The default is a survival to regression transformation via ipcw, with weighting determined by Kaplan-Meier and no additional penalty for censoring.

distrcompose_params

list() Parameters passed to PipeOpDistrCompositor, default is accelerated failure time model form.

probregr_params

list() Parameters passed to PipeOpProbregrCompositor, default is Normal distribution for composition.

learnercv_params

list() Parameters passed to PipeOpLearnerCV. The default is insample resampling.

graph_learner

logical(1) If TRUE, wraps the Graph as a GraphLearner and returns it; otherwise (default), returns the Graph.

Value

mlr3pipelines::Graph or mlr3pipelines::GraphLearner
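
As a hedged illustration of the two return types, the sketch below builds the same reduction twice, once as a Graph and once as a GraphLearner. It assumes an installed mlr3proba version that still provides this pipeline, and it reuses the rats task and the "delete" transformation from the Examples section. The variable names graph and glrn are illustrative only.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3proba)

task = tsk("rats")

# graph_learner = FALSE (default): the pipeline is returned as a Graph
graph = ppl(
  "survtoregr",
  method = 1,
  survregr_params = list(method = "delete")
)
class(graph)

# graph_learner = TRUE: the same pipeline wrapped as a GraphLearner
glrn = ppl(
  "survtoregr",
  method = 1,
  survregr_params = list(method = "delete"),
  graph_learner = TRUE
)
class(glrn)

# A GraphLearner is trained and used for prediction like any other learner
glrn$train(task)
pred = glrn$predict(task)
```

A GraphLearner can be passed to functions that expect a Learner, such as resample() or benchmark(), whereas a Graph is usually trained and predicted directly.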

Details

Three reduction strategies are implemented, listed in the order selected by the method argument:

1. Survival to Deterministic Regression A
   1. PipeOpTaskSurvRegr converts TaskSurv to TaskRegr.
   2. A LearnerRegr is fit and predicted on the new TaskRegr.
   3. PipeOpPredRegrSurv transforms the resulting PredictionRegr to PredictionSurv.
   4. Optionally: PipeOpDistrCompositor is used to compose a distr predict_type from the predicted response predict_type.
2. Survival to Probabilistic Regression
   1. PipeOpTaskSurvRegr converts TaskSurv to TaskRegr.
   2. A LearnerRegr is fit on the new TaskRegr to predict response; optionally, a second LearnerRegr can be fit to predict se.
   3. PipeOpProbregrCompositor composes a distr prediction from the learner(s).
   4. PipeOpPredRegrSurv transforms the resulting PredictionRegr to PredictionSurv.
3. Survival to Deterministic Regression B
   1. PipeOpLearnerCV cross-validates and makes predictions from a linear LearnerSurv with lp predict type on the original TaskSurv.
   2. PipeOpTaskSurvRegr transforms the lp predictions into the target of a TaskRegr with the same features as the original TaskSurv.
   3. A LearnerRegr is fit and predicted on the new TaskRegr.
   4. PipeOpPredRegrSurv transforms the resulting PredictionRegr to PredictionSurv.
   5. Optionally: PipeOpDistrCompositor is used to compose a distr predict_type from the predicted lp predict_type.
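
To see how the steps above map onto an actual pipeline, the following sketch constructs one graph per method and prints the ids of the PipeOps it contains. It is only an inspection aid and assumes an installed mlr3proba version that still provides this pipeline. The exact ids printed depend on that version, so none are shown here, and the names g1, g2 and g3 are illustrative.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3proba)

# Method 1: deterministic regression, with distribution composition by default
g1 = ppl("survtoregr", method = 1, regr_learner = lrn("regr.featureless"))

# Method 2: probabilistic regression; with no regr_se_learner supplied,
# the single regression learner must itself predict se
g2 = ppl(
  "survtoregr",
  method = 2,
  regr_learner = lrn("regr.featureless", predict_type = "se")
)

# Method 3: regression on the linear predictor of a survival learner
g3 = ppl(
  "survtoregr",
  method = 3,
  regr_learner = lrn("regr.featureless"),
  surv_learner = lrn("surv.coxph")
)

# Print the PipeOp ids of each graph in topological order
for (g in list(g1, g2, g3)) {
  print(g$ids(sorted = TRUE))
}
```

Nothing is trained here; the graphs are only constructed, so this is a cheap way to check which PipeOps a given combination of arguments produces.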

Interpretation:

1. Once a dataset has censoring removed (by a given method), a regression learner can predict the survival time as the response.
2. This reduction is very similar to the first method; the main difference is the distribution composition. In the first method the distribution is composed in a survival framework by assuming a linear model form and a baseline hazard estimator, whereas in the second it is composed in a regression framework. The latter could result in problematic negative predictions and should therefore be interpreted with caution; however, a wider choice of distributions makes it a more flexible composition.
3. This is a rarer use case that bypasses censoring not by removing it but by first predicting the linear predictor from a survival model and then fitting a regression model on these predictions. The resulting regression predictions can then be viewed as the linear predictors of the new data, which can ultimately be composed to a distribution.

Examples

# NOT RUN {
if (requireNamespace("mlr3pipelines", quietly = TRUE)) {
  library("mlr3")
  library("mlr3pipelines")
  # mlr3proba provides the "rats" task, the survival learners and this pipeline
  library("mlr3proba")

  task = tsk("rats")

  # method 1 with censoring deletion, compose to distribution
  pipe = ppl(
    "survtoregr",
    method = 1,
    regr_learner = lrn("regr.featureless"),
    distrcompose = TRUE,
    survregr_params = list(method = "delete")
  )
  pipe$train(task)
  pipe$predict(task)

  # method 2 with censoring imputation (mrl), one regr learner
  pipe = ppl(
    "survtoregr",
    method = 2,
    regr_learner = lrn("regr.featureless", predict_type = "se"),
    survregr_params = list(method = "mrl")
  )
  pipe$train(task)
  pipe$predict(task)

  # method 3 with censoring omission and no composition, insample resampling
  pipe = ppl(
    "survtoregr",
    method = 3,
    regr_learner = lrn("regr.featureless"),
    distrcompose = FALSE,
    surv_learner = lrn("surv.coxph"),
    survregr_params = list(method = "omission")
  )
  pipe$train(task)
  pipe$predict(task)
}
# }
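
The calls to pipe$predict(task) above return the graph's outputs rather than a bare prediction object. The sketch below shows one way to pull the survival prediction out of that result and score it. It assumes an installed mlr3proba version that still provides this pipeline, and it assumes the graph has a single output, so the first list element is taken. The concordance measure is just one possible choice.

```r
library(mlr3)
library(mlr3pipelines)
library(mlr3proba)

task = tsk("rats")

# Same configuration as the first example: method 1 with censoring deletion
pipe = ppl(
  "survtoregr",
  method = 1,
  regr_learner = lrn("regr.featureless"),
  distrcompose = TRUE,
  survregr_params = list(method = "delete")
)
pipe$train(task)

# Graph$predict() returns a named list with one entry per output channel
out = pipe$predict(task)
names(out)

# Assumption: a single output channel, so the first entry is the prediction
pred = out[[1]]
class(pred)

# Score the survival prediction, here with the concordance index
pred$score(msr("surv.cindex"))
```

With a featureless regression learner every observation receives the same predicted response, so the score is not informative; swapping in a real regression learner is the natural next step.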
