mlr3 v0.1.4

Machine Learning in R - Next Generation

Efficient, object-oriented programming on the building blocks of machine learning. Provides 'R6' objects for tasks, learners, resamplings, and measures. The package is geared towards scalability and larger datasets by supporting parallelization and out-of-memory data-backends like databases. While 'mlr3' focuses on the core computational operations, add-on packages provide additional functionality.

Readme

mlr3

Efficient, object-oriented programming on the building blocks of machine learning. Successor of mlr.

Installation

remotes::install_github("mlr-org/mlr3")

Example

Constructing Learners and Tasks

library(mlr3)
set.seed(1)

# create learning task
task_iris = TaskClassif$new(id = "iris", backend = iris, target = "Species")
task_iris
## <TaskClassif:iris> (150 x 5)
## * Target: Species
## * Properties: multiclass
## * Features (4):
##   - dbl (4): Petal.Length, Petal.Width, Sepal.Length, Sepal.Width
# load learner and set hyperparameter
learner = lrn("classif.rpart", cp = 0.01)

Basic train + predict

# train/test split
train_set = sample(task_iris$nrow, 0.8 * task_iris$nrow)
test_set = setdiff(seq_len(task_iris$nrow), train_set)

# train the model
learner$train(task_iris, row_ids = train_set)

# predict data
prediction = learner$predict(task_iris, row_ids = test_set)

# calculate performance
prediction$confusion
##             truth
## response     setosa versicolor virginica
##   setosa         11          0         0
##   versicolor      0         12         1
##   virginica       0          0         6
measure = msr("classif.acc")
prediction$score(measure)
## classif.acc 
##   0.9666667

Resample

# automatic resampling
resampling = rsmp("cv", folds = 3L)
rr = resample(task_iris, learner, resampling)
## INFO  [13:33:15.014] Applying learner 'classif.rpart' on task 'iris' (iter 1/3) 
## INFO  [13:33:15.039] Applying learner 'classif.rpart' on task 'iris' (iter 2/3) 
## INFO  [13:33:15.057] Applying learner 'classif.rpart' on task 'iris' (iter 3/3)
rr$score(measure)
##             task task_id               learner    learner_id
## 1: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart
## 2: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart
## 3: <TaskClassif>    iris <LearnerClassifRpart> classif.rpart
##        resampling resampling_id iteration prediction classif.acc
## 1: <ResamplingCV>            cv         1     <list>        0.92
## 2: <ResamplingCV>            cv         2     <list>        0.92
## 3: <ResamplingCV>            cv         3     <list>        0.94
rr$aggregate(measure)
## classif.acc 
##   0.9266667
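
Benchmark

The package index further down lists benchmark() and benchmark_grid() for comparing several learners at once. The following is a minimal sketch of how they might be combined with the objects used above; the choice of a featureless baseline and of 3-fold cross-validation is an assumption made for illustration, not part of the original README.

```r
library(mlr3)
set.seed(1)

# build an exhaustive design: every task x learner x resampling combination
design = benchmark_grid(
  tasks = list(tsk("iris")),
  learners = list(lrn("classif.rpart"), lrn("classif.featureless")),
  resamplings = list(rsmp("cv", folds = 3L))
)

# run all resampling experiments of the design
bmr = benchmark(design)

# aggregate accuracy per learner (one row per combination in the design)
agg = bmr$aggregate(msr("classif.acc"))
agg[, c("learner_id", "classif.acc")]
```

The featureless learner only predicts the majority class, so it serves as a sanity-check baseline for the tree.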

Why a rewrite?

mlr was first released to CRAN in 2013. Its core design and architecture date back even further. The addition of many features has led to feature creep, which makes mlr hard to maintain and hard to extend. We also think that while mlr was nicely extensible in some parts (learners, measures, etc.), other parts were less easy to extend from the outside. Also, many helpful R libraries did not exist at the time mlr was created, and their inclusion would result in non-trivial API changes.

Design principles

  • Only the basic building blocks for machine learning are implemented in this package.
  • Focus on computation here. Visualization and other non-computational functionality can go in extra packages.
  • Overcome the limitations of R’s S3 classes with the help of R6.
  • Embrace R6 for a clean OO-design, object state-changes and reference semantics. This might be less “traditional R”, but seems to fit mlr nicely.
  • Embrace data.table for fast and convenient data frame computations.
  • Combine data.table and R6; for this we make heavy use of list columns in data.tables.
  • Be light on dependencies. mlr3 requires the following packages at runtime:
    • backports: Ensures backward compatibility with older R releases. Developed by members of the mlr team. No recursive dependencies.
    • checkmate: Fast argument checks. Developed by members of the mlr team. No extra recursive dependencies.
    • mlr3misc: Miscellaneous functions used in multiple mlr3 extension packages. Developed by the mlr team. No extra recursive dependencies.
    • paradox: Descriptions for parameters and parameter sets. Developed by the mlr team. No extra recursive dependencies.
    • R6: Reference class objects. No recursive dependencies.
    • data.table: Extension of R’s data.frame. No recursive dependencies.
    • digest: Hash digests. No recursive dependencies.
    • uuid: Create unique string identifiers. No recursive dependencies.
    • lgr: Logging facility. No extra recursive dependencies.
    • Metrics: Package which implements performance measures. No recursive dependencies.
    • mlbench: A collection of machine learning data sets. No dependencies.
  • Reflections: Objects are queryable for properties and capabilities, allowing you to program on them.
  • Additional functionality that comes with extra dependencies:
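
The reflection principle in the list above can be illustrated with a short sketch. It assumes the iris task and the rpart learner from the example section, and uses the fields `properties`, `predict_types` and `feature_types` to decide programmatically whether a learner fits a task; the compatibility check itself is a hypothetical illustration, not a function provided by the package.

```r
library(mlr3)

task = tsk("iris")
learner = lrn("classif.rpart")

# capabilities are stored as plain character vectors and can be inspected
learner$properties
learner$predict_types

# they can also drive ordinary control flow:
# does the learner support every feature type that occurs in the task?
needed = unique(task$feature_types$type)
supported = all(needed %in% learner$feature_types)

# does the learner cover the task's properties (e.g. "multiclass")?
covered = all(task$properties %in% learner$properties)

if (!supported || !covered) {
  stop("learner is not applicable to this task")
}
```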

Talks, Workshops, etc.

mlr-outreach holds all outreach activities related to mlr and mlr3.

mlr3 talk at useR! 2019 conference in Toulouse, France:

Watch the video

Functions in mlr3

Name                       Description
LearnerRegr                Regression Learner
LearnerClassifDebug        Classification Learner for Debugging
BenchmarkResult            Container for Results of benchmark()
DataBackend                DataBackend
Learner                    Learner Class
DataBackendDataTable       DataBackend for data.table
LearnerClassifFeatureless  Featureless Classification Learner
DataBackendMatrix          DataBackend for Matrix
LearnerClassifRpart        Classification Tree Learner
LearnerClassif             Classification Learner
MeasureRegrMSE             Mean Squared Error Regression Measure
MeasureRegrMAE             Absolute Errors Regression Measure
MeasureClassifCE           Classification Error Measure
MeasureClassifConfusion    Binary Classification Measures Derived from a Confusion Matrix
MeasureRegrRMSE            Root Mean Squared Error Regression Measure
Resampling                 Resampling Class
MeasureSelectedFeatures    Selected Features Measure
ResamplingRepeatedCV       Repeated Cross Validation Resampling
ResamplingHoldout          Holdout Resampling
TaskGeneratorSmiley        Smiley Classification Task Generator
TaskClassif                Classification Task
ResamplingBootstrap        Bootstrap Resampling
benchmark_grid             Generate a Benchmark Grid Design
TaskGenerator              TaskGenerator Class
confusion_measures         Calculate Confusion Measures
TaskGenerator2DNormals     2d Normals Classification Task Generator
mlr_measures               Dictionary of Performance Measures
LearnerRegrRpart           Regression Tree Learner
MeasureElapsedTime         Elapsed Time Measure
mlr_learners               Dictionary of Learners
LearnerRegrFeatureless     Featureless Regression Learner
TaskGeneratorFriedman1     Friedman1 Regression Task Generator
MeasureDebug               Debug Measure
TaskGeneratorXor           XOR Classification Task Generator
Measure                    Measure Class
MeasureClassifFScore       F-score Classification Measure
MeasureClassifCosts        Cost-sensitive Classification Measure
MeasureOOBError            Out-of-bag Error Measure
MeasureClassifACC          Accuracy Classification Measure
MeasureClassifAUC          Area Under the Curve Classification Measure
ResamplingSubsampling      Subsampling Resampling
Task                       Task Class
mlr_tasks                  Dictionary of Tasks
as.data.table              Re-export of as.data.table. See data.table::as.data.table.
as_benchmark_result        Convert to BenchmarkResult
default_measures           Get a Default Measure
MeasureClassif             Classification Measure
ResamplingCV               Cross Validation Resampling
ResamplingCustom           Custom Resampling
MeasureRegr                Regression Measure
mlr_tasks_mtcars           "Motor Trend" Car Road Tests Task
Prediction                 Abstract Prediction Object
as_data_backend            Create a Data Backend
predict.Learner            Predict Method for Learners
mlr_tasks_pima             Pima Indian Diabetes Classification Task
resample                   Resample a Learner on a Task
mlr3-package               mlr3: Machine Learning in R - Next Generation
mlr_tasks_sonar            Sonar Classification Task
benchmark                  Benchmark Multiple Learners on Multiple Tasks
mlr_reflections            Reflections for mlr3
mlr_tasks_spam             Spam Classification Task
mlr_resamplings            Dictionary of Resampling Strategies
PredictionClassif          Prediction Object for Classification
mlr_tasks_boston_housing   Boston Housing Regression Task
mlr_tasks_german_credit    German Credit Classification Task
ResampleResult             Container for Results of resample()
PredictionRegr             Prediction Object for Regression
TaskRegr                   Regression Task
mlr_tasks_iris             Iris Classification Task
mlr_sugar                  Syntactic Sugar for Object Construction
mlr_assertions             Assertion for mlr3 Objects
as_task.character          Object Coercion
TaskSupervised             Supervised Task
mlr_task_generators        Dictionary of Task Generators
mlr_tasks_wine             Wine Classification Task
mlr_tasks_zoo              Zoo Classification Task

Details

License LGPL-3
URL https://mlr3.mlr-org.com, https://github.com/mlr-org/mlr3
BugReports https://github.com/mlr-org/mlr3/issues
RdMacros mlr3misc
Encoding UTF-8
LazyData true
NeedsCompilation no
RoxygenNote 6.1.1
Collate 'mlr_reflections.R' 'BenchmarkResult.R' 'DataBackend.R' 'DataBackendCbind.R' 'DataBackendDataTable.R' 'DataBackendMatrix.R' 'DataBackendRbind.R' 'DataBackendRename.R' 'Learner.R' 'LearnerClassif.R' 'mlr_learners.R' 'LearnerClassifDebug.R' 'LearnerClassifFeatureless.R' 'LearnerClassifRpart.R' 'LearnerRegr.R' 'LearnerRegrFeatureless.R' 'LearnerRegrRpart.R' 'Measure.R' 'MeasureClassif.R' 'mlr_measures.R' 'MeasureClassifACC.R' 'MeasureClassifAUC.R' 'MeasureClassifCE.R' 'MeasureClassifConfusion.R' 'MeasureClassifCosts.R' 'MeasureClassifFScore.R' 'MeasureDebug.R' 'MeasureElapsedTime.R' 'MeasureOOBError.R' 'MeasureRegr.R' 'MeasureRegrMAE.R' 'MeasureRegrMSE.R' 'MeasureRegrRMSE.R' 'MeasureSelectedFeatures.R' 'Prediction.R' 'PredictionClassif.R' 'PredictionRegr.R' 'ResampleResult.R' 'Resampling.R' 'mlr_resamplings.R' 'ResamplingBootstrap.R' 'ResamplingCV.R' 'ResamplingCustom.R' 'ResamplingHoldout.R' 'ResamplingRepeatedCV.R' 'ResamplingSubsampling.R' 'Task.R' 'TaskSupervised.R' 'TaskClassif.R' 'mlr_tasks.R' 'TaskClassif_german_credit.R' 'TaskClassif_iris.R' 'TaskClassif_pima.R' 'TaskClassif_sonar.R' 'TaskClassif_spam.R' 'TaskClassif_wine.R' 'TaskClassif_zoo.R' 'TaskGenerator.R' 'mlr_task_generators.R' 'TaskGenerator2DNormals.R' 'TaskGeneratorFriedman1.R' 'TaskGeneratorSmiley.R' 'TaskGeneratorXor.R' 'TaskRegr.R' 'TaskRegr_boston_housing.R' 'TaskRegr_mtcars.R' 'Task_mutators.R' 'as_data_backend.R' 'assertions.R' 'benchmark.R' 'benchmark_grid.R' 'default_measures.R' 'deprecated.R' 'helper.R' 'mlr_coercions.R' 'mlr_sugar.R' 'predict.R' 'reexports.R' 'resample.R' 'worker.R' 'zzz.R'
Packaged 2019-10-28 18:41:16 UTC; michel
Repository CRAN
Date/Publication 2019-10-28 23:40:06 UTC
