
mlr3proba

Package website: release | dev

Probabilistic Supervised Learning for mlr3.

What is mlr3proba?

mlr3proba is a machine learning toolkit for making probabilistic predictions within the mlr3 ecosystem. It currently supports the following tasks:

  • Probabilistic supervised regression - Supervised regression with a predictive distribution as the return type.
  • Predictive survival analysis - Survival analysis where individual predictive hazards can be queried. This is equivalent to probabilistic supervised regression with censored observations.
  • Unconditional distribution estimation - The estimated distribution itself is returned. Sub-cases are density estimation and unconditional survival estimation.
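
As a rough illustration of the survival task type listed above, the following sketch builds a survival task from a data frame with as_task_surv(). The data are simulated purely for illustration, and the column names (time, status, age) and the task id "toy" are assumptions, not part of the package.

```r
library(mlr3)
library(mlr3proba)

# Simulated right-censored data (hypothetical, for illustration only).
set.seed(42)
toy = data.frame(
  time   = rexp(100, rate = 0.1),          # observed follow-up time
  status = rbinom(100, 1, 0.7),            # 1 = event observed, 0 = censored
  age    = rnorm(100, mean = 60, sd = 8)   # a single numeric feature
)

# Declare which columns hold the survival time and the event indicator.
task = as_task_surv(toy, time = "time", event = "status", id = "toy")

print(task)
task$target_names   # the time and event columns
task$feature_names  # the remaining columns are treated as features
```

Every column not named as time or event becomes a feature, so any identifier columns should be dropped before the task is created.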

Key features of mlr3proba are:

  • A unified fit/predict model interface to any probabilistic predictive model (frequentist, Bayesian, or other)
  • Pipeline/model composition
  • Task reduction strategies
  • Domain-agnostic evaluation workflows using task-specific algorithmic performance measures

mlr3proba makes use of the distr6 probability distribution interface as its probabilistic predictive return type.
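
A minimal sketch of the fit/predict interface and the distr6 return type might look as follows. It assumes the bundled rats task and the Cox proportional hazards learner; the time point 100 is an arbitrary choice, and the exact shape of the object returned by the distribution query depends on the installed distr6 version.

```r
library(mlr3)
library(mlr3proba)

task = tsk("rats")             # survival task shipped with mlr3proba
learner = lrn("surv.coxph")    # Cox proportional hazards learner

learner$train(task)
pred = learner$predict(task)

# Continuous ranking prediction: one value per observation.
head(pred$crank)

# Distribution prediction: a distr6 object holding the predicted
# survival distributions for all observations.
class(pred$distr)

# Query the predicted survival probabilities at t = 100
# (arbitrary time point chosen for illustration).
surv_at_100 = pred$distr$survival(100)
```

Training and predicting on the same data is done here only to keep the sketch short; it does not give an honest estimate of performance.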

Feature Overview

The current mlr3proba release focuses on survival analysis, and contains:

  • Task frameworks for survival analysis (TaskSurv)
  • A comprehensive selection of 17 predictive survival learners
  • A comprehensive selection of 21 performance measures for predictive survival learners, with respect to prognostic index (continuous rank) prediction, and probabilistic (distribution) prediction
  • PipeOps integrated with mlr3pipelines, for basic pipeline building, and reduction/composition strategies using linear predictors and baseline hazards.
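
The composition strategy mentioned in the last item can be sketched with the distrcompositor pipeline, which adds a distribution prediction to a learner that only returns a ranking. The choice of surv.rpart, the Kaplan-Meier baseline estimator, and the proportional hazards form are assumptions made for this example.

```r
library(mlr3)
library(mlr3proba)
library(mlr3pipelines)

task = tsk("rats")

# surv.rpart predicts only a continuous ranking (crank). The pipeline
# combines it with a Kaplan-Meier baseline to compose a distribution,
# here assuming a proportional hazards ("ph") model form.
graph = ppl("distrcompositor",
  learner = lrn("surv.rpart"),
  estimator = "kaplan",
  form = "ph"
)

graph$train(task)

# A Graph returns a list of outputs; the first element is the prediction.
pred = graph$predict(task)[[1]]
class(pred$distr)
```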

Roadmap

The vision of mlr3proba is to provide comprehensive machine learning functionality to the mlr3 ecosystem for continuous probabilistic return types.

The survival task and its features are considered to be maturing, and major changes are unlikely.

The density and probabilistic supervised regression tasks are currently in the early stages of development. Task frameworks have been drawn up, but may not be stable; learners need to be interfaced, and contributions are very welcome (see issues).

Installation

Install the latest release from CRAN:

install.packages("mlr3proba")

Install the development version from GitHub (this requires the remotes package):

remotes::install_github("mlr-org/mlr3proba")

Learners

Core learners are implemented in mlr3proba, recommended common learners are implemented in mlr3learners, and many more are implemented in mlr3extralearners. Use the interactive search table to search for available learners and see the learner status page for their live status.
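
Besides the search table, the learners registered in the current session can be listed from the mlr3 learner dictionary. The sketch below assumes that only mlr3 and mlr3proba are loaded, so only the core learners appear; loading mlr3learners or mlr3extralearners would extend the list.

```r
library(mlr3)
library(mlr3proba)

# Keys of all registered survival and density learners.
surv_keys = mlr_learners$keys("^surv\\.")
dens_keys = mlr_learners$keys("^dens\\.")
print(surv_keys)
print(dens_keys)

# Construct a learner by key and inspect what it can predict
# and which packages it depends on.
learner = lrn("surv.kaplan")
learner$predict_types
learner$packages
```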

Measures

For density estimation, only the log-loss is currently implemented. For survival analysis, the following measures are implemented:

ID | Measure | Package
surv.calib_alpha | van Houwelingen’s Alpha Calibration | mlr3proba
surv.calib_beta | van Houwelingen’s Beta Calibration | mlr3proba
surv.chambless_auc | Chambless and Diao’s AUC | survAUC
surv.cindex | Concordance Index | mlr3proba
surv.graf | Integrated Graf Score | mlr3proba
surv.hung_auc | Hung and Chiang’s AUC | survAUC
surv.intlogloss | Integrated Log Loss | mlr3proba
surv.logloss | Log Loss | mlr3proba
surv.nagelk_r2 | Nagelkerke’s R2 | survAUC
surv.oquigley_r2 | O’Quigley, Xu, and Stare’s R2 | survAUC
surv.song_auc | Song and Zhou’s AUC | survAUC
surv.song_tnr | Song and Zhou’s TNR | survAUC
surv.song_tpr | Song and Zhou’s TPR | survAUC
surv.uno_auc | Uno’s AUC | survAUC
surv.uno_tnr | Uno’s TNR | survAUC
surv.uno_tpr | Uno’s TPR | survAUC
surv.xu_r2 | Xu and O’Quigley’s R2 | survAUC
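
As a sketch of how these measures are used, the example below scores a Cox model on a held-out part of the rats task with one ranking measure (surv.cindex) and one distribution measure (surv.graf). The 80/20 split and the seed are arbitrary choices made for illustration.

```r
library(mlr3)
library(mlr3proba)

task = tsk("rats")

# Simple holdout split (80% train, 20% test), fixed seed for repeatability.
set.seed(1)
train_ids = sample(task$nrow, 0.8 * task$nrow)
test_ids = setdiff(seq_len(task$nrow), train_ids)

learner = lrn("surv.coxph")
learner$train(task, row_ids = train_ids)
pred = learner$predict(task, row_ids = test_ids)

# surv.cindex evaluates the ranking (crank) prediction,
# surv.graf evaluates the distribution (distr) prediction.
measures = msrs(c("surv.cindex", "surv.graf"))
scores = pred$score(measures)
print(scores)
```

Measures implemented via survAUC additionally need the survAUC package to be installed, and some of them require the task and training set to be passed to the scoring call.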

Near-Future Plans

  • Add prob predict type to TaskRegr, and associated learners/measures
  • Allow MeasureSurv to return measures at multiple time-points simultaneously
  • Continue to add survival measures and learners

Bugs, Questions, Feedback

mlr3proba is a free and open source software project that encourages participation and feedback. If you have any issues, questions, suggestions or feedback, please do not hesitate to open an “issue” about it on the GitHub page!

In case of problems / bugs, it is often helpful if you provide a “minimum working example” that showcases the behaviour (but don’t worry about this if the bug is obvious).

Similar Projects

Predecessors to this package are previous instances of survival modelling in mlr. The skpro package in the Python/scikit-learn ecosystem follows a similar interface for probabilistic supervised learning and is an architectural predecessor. Several packages, such as rjags and stan, allow probabilistic predictive modelling through a general interface that is specific to Bayesian models. For implementations of a few survival models and measures, a central package is survival. There does not appear to be a package that provides an architectural framework for distribution/density estimation; see this list for a review of density estimation packages in R.

Acknowledgements

Several people contributed to the building of mlr3proba. Firstly, thanks to Michel Lang for writing mlr3survival. Several learners and measures implemented in mlr3proba, as well as the prediction, task, and measure surv objects, were written initially in mlr3survival before being absorbed into mlr3proba. Secondly, thanks to Franz Kiraly for major contributions towards the design of the proba-specific parts of the package, including compositors and predict types, and for mathematical contributions towards the scoring rules implemented in the package. Finally, thanks to Bernd Bischl and the rest of the mlr core team for building mlr3 and for many conversations about the design of mlr3proba.

Citing mlr3proba

If you use mlr3proba, please cite our Bioinformatics article:

@Article{,
  title = {mlr3proba: An R Package for Machine Learning in Survival Analysis},
  author = {Raphael Sonabend and Franz J Király and Andreas Bender and Bernd Bischl and Michel Lang},
  journal = {Bioinformatics},
  month = {02},
  year = {2021},
  doi = {10.1093/bioinformatics/btab039},
  issn = {1367-4803},
}

Version: 0.4.9
License: LGPL-3
Maintainer: Michel Lang
Last Published: April 25th, 2022
Monthly Downloads: 57

Functions in mlr3proba (0.4.9)

Function | Description
MeasureDens | Density Measure
PredictionDens | Prediction Object for Density
PipeOpTransformer | PipeOpTransformer
MeasureSurv | Survival Measure
PipeOpPredTransformer | PipeOpPredTransformer
PredictionSurv | Prediction Object for Survival
MeasureSurvAUC | Abstract Class for survAUC Measures
PipeOpTaskTransformer | PipeOpTaskTransformer
LearnerDens | Density Learner
LearnerSurv | Survival Learner
actg | ACTG 320 Clinical Trial Dataset
as_task_dens | Convert to a Density Task
as_prediction_surv | Convert to a Survival Prediction
cpp | Cpp functions
.surv_return | Get Survival Predict Types
assert_surv | Assert survival object
mlr_graphs_survbagging | Survival Prediction Averaging Pipeline
mlr_graphs_survaverager | Survival Prediction Averaging Pipeline
as_task_surv | Convert to a Survival Task
grace | GRACE 1000 Dataset
gbcs | German Breast Cancer Study (GBCS) Dataset
mlr3proba-package | mlr3proba: Probabilistic Supervised Learning for 'mlr3'
as_prediction_dens | Convert to a Density Prediction
mlr_graphs_crankcompositor | Estimate Survival crank Predict Type Pipeline
TaskSurv | Survival Task
TaskDens | Density Task
mlr_graphs_survtoregr | Survival to Regression Reduction Pipeline
mlr_measures_dens.logloss | Log loss Density Measure
mlr_measures_surv.calib_alpha | Van Houwelingen's Alpha Survival Measure
mlr_learners_dens.hist | Histogram Density Estimator
mlr_measures_surv.chambless_auc | Chambless and Diao's AUC Survival Measure
mlr_measures_surv.calib_beta | Van Houwelingen's Beta Survival Measure
mlr_measures_surv.cindex | Concordance Statistics Survival Measure
mlr_measures_surv.dcalib | D-Calibration Survival Measure
mlr_learners_surv.rpart | Rpart Survival Trees Survival Learner
mlr_learners_surv.kaplan | Kaplan-Meier Estimator Survival Learner
mlr_graphs_distrcompositor | Estimate Survival distr Predict Type Pipeline
mlr_learners_dens.kde | Kernel Density Estimator
mlr_learners_surv.coxph | Cox Proportional Hazards Survival Learner
mlr_graphs_probregrcompositor | Estimate Regression distr Predict Type Pipeline
mlr_measures_surv.intlogloss | Integrated Log loss Survival Measure
mlr_measures_surv.rcll | Right-Censored Log loss Survival Measure
mlr_measures_surv.mae | Mean Absolute Error Survival Measure
mlr_measures_surv.graf | Integrated Graf Score Survival Measure
mlr_measures_surv.nagelk_r2 | Nagelkerke's R2 Survival Measure
mlr_measures_surv.oquigley_r2 | O'Quigley, Xu, and Stare's R2 Survival Measure
mlr_measures_surv.hung_auc | Hung and Chiang's AUC Survival Measure
mlr_measures_surv.mse | Mean Squared Error Survival Measure
mlr_measures_surv.logloss | Log loss Survival Measure
mlr_measures_surv.rmse | Root Mean Squared Error Survival Measure
mlr_measures_surv.song_tpr | Song and Zhou's TPR Survival Measure
mlr_pipeops_compose_distr | PipeOpDistrCompositor
mlr_measures_surv.xu_r2 | Xu and O'Quigley's R2 Survival Measure
mlr_measures_surv.song_auc | Song and Zhou's AUC Survival Measure
mlr_measures_surv.uno_tpr | Uno's TPR Survival Measure
mlr_pipeops_compose_crank | PipeOpCrankCompositor
mlr_measures_surv.song_tnr | Song and Zhou's TNR Survival Measure
mlr_measures_surv.schmid | Integrated Schmid Score Survival Measure
mlr_measures_surv.uno_tnr | Uno's TNR Survival Measure
mlr_measures_surv.uno_auc | Uno's AUC Survival Measure
mlr_pipeops_trafotask_regrsurv | PipeOpTaskRegrSurv
mlr_tasks_faithful | Old Faithful Eruptions Density Task
mlr_tasks_actg | ACTG 320 Survival Task
mlr_pipeops_trafopred_regrsurv | PipeOpPredRegrSurv
mlr_pipeops_trafopred_survregr | PipeOpPredSurvRegr
mlr_pipeops_trafotask_survregr | PipeOpTaskSurvRegr
mlr_task_generators_simsurv | Survival Task Generator for Package 'simsurv'
mlr_task_generators_simdens | Density Task Generator for Package 'distr6'
mlr_tasks_lung | Lung Cancer Survival Task
mlr_tasks_unemployment | Unemployment Duration Survival Task
whas | Worcester Heart Attack Study (WHAS) Dataset
mlr_pipeops_survavg | PipeOpSurvAvg
mlr_tasks_rats | Rats Survival Task
plot.LearnerSurv | Visualization of fitted LearnerSurv objects
mlr_tasks_gbcs | German Breast Cancer Study Survival Task
mlr_pipeops_compose_probregr | PipeOpProbregrCompositor
mlr_tasks_whas | Worcester Heart Attack Study (WHAS) Survival Task
pecs | Prediction Error Curves for PredictionSurv and LearnerSurv
mlr_tasks_precip | Annual Precipitation Density Task
mlr_tasks_grace | GRACE 1000 Survival Task