TaskClassif

Classification Task

This task specializes Task and TaskSupervised for classification problems. The target column is assumed to be a factor. Predefined tasks are stored in mlr_tasks.

The task_type is set to "classif".

Keywords

datasets

Format

R6::R6Class object inheriting from Task/TaskSupervised.

Construction

t = TaskClassif$new(id, backend, target, positive = NULL)
  • id :: character(1) Name of the task.

  • backend :: DataBackend

  • target :: character(1) Name of the target column.

  • positive :: character(1) Only for binary classification: Name of the positive class.
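The relationship between positive and the other class of a binary task can be sketched in base R. The helper resolve_classes() below is hypothetical and not part of mlr3; it only mirrors the behaviour described on this page (positive and negative are set for binary targets and are NA otherwise). The fallback to the first factor level when positive is NULL is an assumption made for this illustration.

```r
# Hypothetical helper, not part of mlr3: derive positive/negative class labels
# from a factor target, following the behaviour described on this page.
resolve_classes = function(target, positive = NULL) {
  stopifnot(is.factor(target))
  lvls = levels(target)
  if (length(lvls) != 2L) {
    # Non-binary target: there is no positive or negative class.
    return(list(positive = NA_character_, negative = NA_character_))
  }
  if (is.null(positive)) {
    # Assumption for this sketch: fall back to the first factor level.
    positive = lvls[1L]
  }
  stopifnot(positive %in% lvls)
  # The negative class is whichever level is not the positive one.
  list(positive = positive, negative = setdiff(lvls, positive))
}

binary = factor(c("M", "R", "M", "R"))
res_binary = resolve_classes(binary, positive = "R")

# iris$Species has three levels, so both entries are NA.
res_multi = resolve_classes(iris$Species)
```

With a two-level target, naming one level as positive fixes the other as negative; with more than two levels neither is defined.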

Fields

  • all_classes :: character() Returns all class labels of the task, regardless of the number of active rows.

  • class_names :: character() Returns all class labels of the task w.r.t. the active rows.

  • class_n :: integer(1) Returns the number of classes.

  • negative :: character(1) Stores the negative class for binary classification tasks, and NA for multiclass tasks.

  • positive :: character(1) Stores the positive class for binary classification tasks, and NA for multiclass tasks.

  • backend :: DataBackend.

  • col_info :: data.table::data.table() Table with 3 columns: Column names of the DataBackend are stored in column id. Column type holds the storage type of the variables, e.g. integer, numeric or character. Column levels keeps a list of possible levels for factor and character variables.

  • col_roles :: named list() Each column (feature) can have an arbitrary number of roles in the learning task:

    • "target": Labels to predict.

    • "feature": Regular feature.

    • "order": Data returned by data() is ordered by this column (or these columns).

    • "group": During resampling, observations with the same value of the variable with role "group" are marked as "belonging together". They will be exclusively assigned to be either in the training set or the test set for each resampling iteration.

    • "weights": Observation weights.

    col_roles keeps track of the roles with a named list of vectors of feature names. To alter the roles, use t$set_col_role().

  • row_roles :: named list() Each row (observation) can have an arbitrary number of roles in the learning task:

    • "use": Use in train / predict / resampling.

    • "validation": Hold the observations back unless explicitly requested.

    row_roles keeps track of the roles with a named list of vectors of row ids. To alter the roles, use t$set_row_role().

  • feature_names :: character() Returns all column names with role == "feature".

  • feature_types :: data.table::data.table() Returns a table with columns id and type where id are the column names of "active" features of the task and type is the storage type.

  • formula :: formula() Constructs a stats::formula, e.g. [target] ~ [feature_1] + [feature_2] + ... + [feature_k], using the active features of the task.

  • group :: data.table::data.table() Returns a table with columns row_id and group where row_id are the row ids and group is the value of the grouping variable. Returns NULL if there is no grouping.

  • hash :: character(1) Hash (unique identifier) for this object.

  • id :: character(1) Stores the identifier of the Task.

  • measures :: list() of Measure Stores the measures to use for this task.

  • ncol :: integer(1) Returns the total number of columns with role "target" or "feature".

  • nrow :: integer(1) Returns the total number of rows with role "use".

  • row_ids :: (integer() | character()) Returns the row ids of the DataBackend for observations with role "use".

  • target_names :: character() Returns all column names with role "target".

  • task_type :: character(1) Stores the type of the Task.
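The effect of the "group" column role on resampling can be illustrated with a small base R sketch. This is not mlr3 code and not how mlr3 implements resampling; the toy table only assumes the shape documented for the group field (columns row_id and group), and the 3-of-5 split is an arbitrary choice for the example.

```r
# Base R sketch (not mlr3 code): a holdout split that keeps all rows of a
# group on the same side, as described for the "group" column role.
set.seed(1)

# Toy table shaped like the documented `group` field: row_id and group.
groups = data.frame(
  row_id = 1:10,
  group = rep(c("a", "b", "c", "d", "e"), each = 2)
)

# Sample whole groups (not single rows) for the training set.
group_ids = unique(groups$group)
train_groups = sample(group_ids, size = 3)

train_rows = groups$row_id[groups$group %in% train_groups]
test_rows = setdiff(groups$row_id, train_rows)

# Groups that occur on both sides of the split; expected to be empty.
shared = intersect(
  groups$group[groups$row_id %in% train_rows],
  groups$group[groups$row_id %in% test_rows]
)
```

Because whole groups are sampled rather than single rows, observations that belong together never end up split between the training and the test set.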

Methods

  • data(rows = NULL, cols = NULL, format = NULL) (integer() | character(), character(), character(1)) -> any Returns a slice of the data from the DataBackend in the format specified by format (depending on the DataBackend, but usually a data.table::data.table()). Rows are subsetted to only contain observations with role "use". Columns are filtered to only contain features with roles "target" and "feature". If invalid rows or cols are specified, an exception is raised.

  • cbind(data) data.frame() -> self Extends the DataBackend with additional columns. The row ids must be provided as a column in data (with a column name matching the primary key name of the DataBackend). If this column is missing, it is assumed that the rows are exactly in the order of t$row_ids.

  • rbind(data) data.frame() -> self Extends the DataBackend with additional rows. The new row ids must be provided as a column in data. If this column is missing, new row ids are constructed automatically.

  • filter(rows) (integer() | character()) -> self Subsets the task, reducing it to only keep the rows specified.

  • select(cols) character() -> self Subsets the task, reducing it to only keep the columns specified.

  • levels(col) character() -> named list() Returns the distinct levels of the column col. Only applicable for features with type "character", "factor" or "ordered". This function ignores the row roles and returns all levels available in the DataBackend.

  • head(n = 6) integer() -> data.table::data.table() Get the first n observations with role "use".

  • replace_features(data) data.frame() -> self Replaces some features of the task by constructing a completely new DataBackendDataTable. This operation is similar to calling select() and cbind(), but explicitly copies the data.

  • set_col_role(cols, new_roles, exclusive = TRUE) (character(), character(), logical(1)) -> self Adds the roles new_roles to columns referred to by cols. If exclusive is TRUE, the referenced columns will be removed from all other roles.

  • set_row_role(rows, new_roles, exclusive = TRUE) (integer() | character(), character(), logical(1)) -> self Adds the roles new_roles to rows referred to by rows. If exclusive is TRUE, the referenced rows will be removed from all other roles.
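The meaning of the exclusive argument described for set_col_role() and set_row_role() can be sketched on a plain named list in base R. The function set_role() below is hypothetical and is not mlr3's implementation; it assumes roles are stored as a named list of character vectors, as described for col_roles. The last line uses stats::reformulate() to build a formula from the remaining features, analogous to what the formula field is documented to return.

```r
# Hypothetical base R helper (not mlr3's implementation): mimic adding roles
# to columns in a named list of character vectors, with an `exclusive` switch.
set_role = function(roles, cols, new_roles, exclusive = TRUE) {
  for (role in names(roles)) {
    if (role %in% new_roles) {
      # Add the columns to every requested role.
      roles[[role]] = union(roles[[role]], cols)
    } else if (exclusive) {
      # Remove the columns from all roles that were not requested.
      roles[[role]] = setdiff(roles[[role]], cols)
    }
  }
  roles
}

col_roles = list(
  target = "Species",
  feature = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  order = character(0),
  group = character(0),
  weights = character(0)
)

# Exclusive: Petal.Width stops being a feature and becomes the weight column.
exclusive_roles = set_role(col_roles, "Petal.Width", "weights")

# Non-exclusive: Petal.Width keeps its "feature" role and gains "order".
shared_roles = set_role(col_roles, "Petal.Width", "order", exclusive = FALSE)

# Formula built from the remaining features, analogous to the `formula` field.
f = stats::reformulate(exclusive_roles$feature, response = exclusive_roles$target)
```

With exclusive = TRUE a column ends up in exactly the requested roles; with exclusive = FALSE the new roles are added on top of the existing ones.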

See Also

Other Task: TaskRegr, TaskSupervised, Task, mlr_generators, mlr_tasks

Aliases

  • TaskClassif

Examples

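# NOT RUN {
library(mlr3)

# Multiclass task: predict the iris species
b = as_data_backend(iris)
task = TaskClassif$new("iris", backend = b, target = "Species")
task$task_type    # "classif"
task$formula      # formula built from the active features
task$truth()      # true labels of the target column
task$all_classes  # all class labels, regardless of active rows
task$class_names  # class labels w.r.t. the active rows

# Binary task: "M" is declared as the positive class
data("Sonar", package = "mlbench")  # requires the mlbench package
b = as_data_backend(Sonar)
task = TaskClassif$new("sonar", backend = b, target = "Class", positive = "M")
task$positive
task$negative
# }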
Documentation reproduced from package mlr3, version 0.1.0-9000, License: MIT + file LICENSE
