Learner: Learner Class

Description

This is the abstract base class for learner objects like LearnerClassif and LearnerRegr.

Learners consist of the following parts:

Methods train() and predict() to perform the respective steps.
The fitted model, after calling train().
A paradox::ParamSet which stores meta-information about available hyperparameters, and also stores hyperparameter settings.
Meta-information about the requirements and capabilities of the learner.

Predefined learners are stored in the Dictionary mlr_learners, e.g. classif.rpart or regr.rpart.
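
As a brief sketch, assuming the mlr3 package is installed and that the dictionary behaves as described above, a predefined learner can be looked up by its key:

```r
library(mlr3)

# List the keys of all learners currently registered in the dictionary
keys = mlr_learners$keys()

# Retrieve a predefined classification learner by its key
lrn = mlr_learners$get("classif.rpart")

lrn$id         # identifier of the learner
lrn$task_type  # type of task the learner operates on
```

The set of registered keys depends on which extension packages are loaded, so the contents of keys will differ between installations.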

Format

R6::R6Class object.

Construction

Note: This object is typically constructed via a derived class, e.g. LearnerClassif or LearnerRegr.

l = Learner$new(id, task_type, param_set = ParamSet$new(), param_vals = list(), predict_types = character(),
  feature_types = character(), properties = character(), data_formats = "data.table", packages = character())

id :: character(1) Identifier for the learner.
task_type :: character(1) Type of the task the learner can operate on, e.g. "classif" or "regr".
param_set :: paradox::ParamSet Set of hyperparameters.
param_vals :: named list() List of hyperparameter settings.
predict_types :: character() Supported predict types. Must be a subset of mlr_reflections$learner_predict_types.
feature_types :: character() Feature types the learner operates on. Must be a subset of mlr_reflections$task_feature_types.
properties :: character() Set of properties of the learner. Must be a subset of mlr_reflections$learner_properties.
data_formats :: character() Vector of supported data formats which can be processed during $train() and $predict(). Defaults to "data.table".
packages :: character() Set of required packages. Note that these packages will be loaded via requireNamespace(), and are not attached.

Fields

id :: character(1) Identifier of the learner.
task_type :: character(1) Stores the type of task this learner can operate on, e.g. "classif" or "regr". A complete list of task types is stored in mlr_reflections$task_types.
param_set :: paradox::ParamSet Description of available hyperparameters and hyperparameter settings.
predict_types :: character() Stores the possible predict types the learner is capable of. A complete list of candidate predict types, grouped by task type, is stored in mlr_reflections$learner_predict_types.
predict_type :: character(1) Stores the currently selected predict type. Must be an element of l$predict_types.
feature_types :: character() Stores the feature types the learner can handle, e.g. "logical", "numeric", or "factor". A complete list of candidate feature types, grouped by task type, is stored in mlr_reflections$task_feature_types.
properties :: character() Stores a set of properties/capabilities the learner has. A complete list of candidate properties, grouped by task type, is stored in mlr_reflections$learner_properties.
packages :: character() Stores the names of required packages.
hash :: character(1) Hash (unique identifier) for this object.
model :: any The fitted model. Only available after $train() has been called.
timings :: numeric(2) Elapsed time in seconds for the steps "train" and "predict".
log :: data.table::data.table() Returns the output (including warnings and errors) as a table with columns "stage" (train or predict), "class" (output, warning, error) and "msg" (character()).
warnings :: character() Returns the logged warnings as vector.
errors :: character() Returns the logged errors as vector.
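
The fields above can be inspected directly on any learner object. The following is a minimal sketch, assuming mlr3 is installed and using the predefined classif.rpart learner purely as an illustration:

```r
library(mlr3)

lrn = mlr_learners$get("classif.rpart")

# Capabilities and requirements declared by the learner
lrn$predict_types  # all predict types the learner supports
lrn$predict_type   # the currently selected predict type
lrn$feature_types  # feature types the learner can handle
lrn$properties     # properties/capabilities of the learner
lrn$packages       # names of required packages

# No model is stored before $train() has been called
is.null(lrn$model)
```

The exact values returned depend on the installed version of mlr3.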

Methods

train(task, row_ids = NULL, ctrl = list()) (Task, integer() | character(), mlr_control()) -> Learner Trains the learner on the row ids of the provided Task. Mutates the learner by reference, e.g. stores the fitted model in field $model.
predict(task, row_ids = NULL, ctrl = list()) (Task, integer() | character(), mlr_control()) -> Prediction Uses the data stored during $train() to create a new Prediction based on the provided row_ids of the task.
predict_newdata(task, newdata, ctrl = list()) (Task, data.frame(), mlr_control()) -> Prediction Uses the data stored during $train() to create a new Prediction based on the new data in newdata. Object task is the task used during $train() and required for conversions of newdata.
new_prediction(row_ids, truth, ...) (integer() | character(), any, ...) -> Prediction Used internally to create a Prediction object. The arguments are described in the respective specialization of Prediction, e.g. in PredictionClassif for classification.
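
A typical train/predict cycle can be sketched as follows. This assumes mlr3 is installed, uses the "iris" example task from mlr_tasks, and picks an arbitrary 120/30 split of the row ids for illustration only. The ctrl argument is left at its default.

```r
library(mlr3)

task = mlr_tasks$get("iris")             # example task shipped with mlr3
lrn = mlr_learners$get("classif.rpart")

# Illustrative split: 120 randomly drawn row ids for training, the rest for prediction
set.seed(1)
train_ids = sample(task$row_ids, 120)
test_ids = setdiff(task$row_ids, train_ids)

# $train() mutates the learner by reference and stores the fitted model
lrn$train(task, row_ids = train_ids)

# $predict() uses the stored model to create a Prediction for the given rows
p = lrn$predict(task, row_ids = test_ids)

p$response  # predicted labels for the held-out rows
```

Because the learner is modified in place, there is no need to reassign the result of $train().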

Optional Extractors

Specific learner implementations are free to implement additional getters to ease the access of certain parts of the model in the inherited subclasses.

For the following operations, extractors are standardized:

importance(...): Returns the feature importance score as numeric vector. The higher the score, the more important the variable. The returned vector is named with feature names and sorted in decreasing order. Note that the model might omit features it has not used at all. The learner must be tagged with property "importance".
selected_features(...): Returns a subset of selected features as character(). The learner must be tagged with property "selected_features".
oob_error(...): Returns the out-of-bag error of the model as numeric(1). The learner must be tagged with property "oob_error".
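
Before calling a standardized extractor, it is reasonable to check that the learner is tagged with the matching property. The sketch below assumes that classif.rpart carries the "importance" and "selected_features" properties, which may not hold for every version of mlr3:

```r
library(mlr3)

task = mlr_tasks$get("iris")
lrn = mlr_learners$get("classif.rpart")
lrn$train(task)

# Only call an extractor if the learner advertises the corresponding property
imp = if ("importance" %in% lrn$properties) lrn$importance() else NULL
sel = if ("selected_features" %in% lrn$properties) lrn$selected_features() else NULL

imp  # named numeric vector, sorted in decreasing order
sel  # character vector of selected feature names
```

Features the model did not use may be missing from imp, so its length can be smaller than the number of features in the task.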

Setting Hyperparameters

All information about hyperparameters is stored in the slot param_set, which is a paradox::ParamSet. The printer gives an overview of the ids of available hyperparameters, their storage type, lower and upper bounds, possible levels (for factors), default values and assigned values. To set hyperparameters, assign a named list to the subslot values:

lrn = mlr_learners$get("classif.rpart")
lrn$param_set$values = list(minsplit = 3, cp = 0.01)

Note that this operation replaces all previously set hyperparameter values. If you only intend to change one specific hyperparameter value and leave the others as-is, you can use the helper function mlr3misc::insert_named():

lrn$param_set$values = mlr3misc::insert_named(lrn$param_set$values, list(cp = 0.001))

If the learner has additional hyperparameters which are not encoded in the ParamSet, you can easily extend the learner. Here, we add a hyperparameter with id "foo" and possible levels "a" and "b":

lrn$param_set$add(paradox::ParamFct$new("foo", levels = c("a", "b")))