mlr (version 2.17.0)

plotLearnerPrediction: Visualizes a learning algorithm on a 1D or 2D data set.

Description

Trains the model for 1 or 2 selected features, then displays it via ggplot2::ggplot. Good for teaching or exploring models.

For classification and clustering, only 2D plots are supported. The plot shows the data points, the predicted classes and, optionally, the posterior probabilities via alpha blending of the background color.

For regression, 1D and 2D plots are supported. 1D shows the data, the estimated mean and potentially the estimated standard error. 2D does not show estimated standard error, but only the estimated mean via background color.

The plot title displays the model id, its parameters, the training performance and the cross-validation performance.
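As a minimal sketch of the two cases described above, the following calls plot a 2D classification and a 1D regression. It assumes the mlr package is attached and uses the example tasks iris.task and bh.task that ship with mlr. The learners "classif.rpart" and "regr.lm" are illustrative choices only, and cv = 0L is set to skip cross-validation and keep the calls fast.

```r
library(mlr)

# 2D classification: two features, background shows the predicted class.
p.classif = plotLearnerPrediction(
  "classif.rpart", iris.task,
  features = c("Petal.Length", "Petal.Width"),
  cv = 0L
)

# 1D regression: one feature, fitted mean plus a standard error band.
p.regr = plotLearnerPrediction(
  "regr.lm", bh.task,
  features = "lstat",
  cv = 0L
)

print(p.classif)
print(p.regr)
```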

Usage

plotLearnerPrediction(
  learner,
  task,
  features = NULL,
  measures,
  cv = 10L,
  ...,
  gridsize,
  pointsize = 2,
  prob.alpha = TRUE,
  se.band = TRUE,
  err.mark = "train",
  bg.cols = c("darkblue", "green", "darkred"),
  err.col = "white",
  err.size = pointsize,
  greyscale = FALSE,
  pretty.names = TRUE
)

Arguments

learner

(Learner | character(1)) The learner. If you pass a string the learner will be created via makeLearner.

task

(Task) The task.

features

(character) Features used to fit the model. By default, the first 2 features of the task are used.

measures

(Measure | list of Measure) Performance measure(s) to evaluate. Default is the default measure for the task, see getDefaultMeasure.

cv

(integer(1)) Do cross-validation and display in plot title? Number of folds. 0 means no CV. Default is 10.

...

(any) Parameters for learner.

gridsize

(integer(1)) Grid resolution per axis for background predictions. Default is 500 for 1D and 100 for 2D.

pointsize

(numeric(1)) Point size passed to ggplot2::geom_point for the data points. Default is 2.

prob.alpha

(logical(1)) For classification: Set alpha value of background to probability for predicted class? Allows visualization of “confidence” for prediction. If not, only a constant color is displayed in the background for the predicted label. Default is TRUE.

se.band

(logical(1)) For regression in 1D: Show band for standard error estimation? Default is TRUE.

err.mark

(character(1)) For classification: Mark the errors of the model on the training data (“train”), the errors made during cross-validation (“cv”), or no errors at all (“none”). Default is “train”.

bg.cols

(character(3)) Background colors for classification and regression, ordered from low through medium to high. Default is c("darkblue", "green", "darkred").

err.col

(character(1)) For classification: Color of misclassified data points. Default is “white”.

err.size

(integer(1)) For classification: Size of misclassified data points. Default is pointsize.

greyscale

(logical(1)) Should the plot be greyscale completely? Default is FALSE.

pretty.names

(logical(1)) Whether to use the short name of the learner instead of its ID in labels. Default is TRUE.
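The arguments above can be combined as in the following sketch. It assumes mlr is attached. The learner "classif.rpart" and its hyperparameter maxdepth are illustrative choices, and the specific values are arbitrary. maxdepth is not an argument of plotLearnerPrediction itself. It is forwarded to the learner through `...`.

```r
library(mlr)

p = plotLearnerPrediction(
  "classif.rpart", iris.task,
  features = c("Petal.Length", "Petal.Width"),
  cv = 0L,             # no cross-validation
  maxdepth = 2L,       # learner parameter, passed via ...
  gridsize = 50L,      # coarser background grid than the 2D default
  pointsize = 3,
  prob.alpha = FALSE,  # constant background color per predicted class
  err.mark = "train",  # mark errors on the training data
  err.col = "black"
)

print(p)
```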

Value

The ggplot2 object.
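
Because the return value is a ggplot2 object, it can be modified with the usual ggplot2 layers before printing. A minimal sketch, assuming mlr and ggplot2 are installed; the learner, the subtitle text and the theme are illustrative choices only:

```r
library(mlr)

p = plotLearnerPrediction(
  "classif.rpart", iris.task,
  features = c("Petal.Length", "Petal.Width"),
  cv = 0L
)

# Add a subtitle and switch the theme; the result is still a ggplot object.
p2 = p +
  ggplot2::labs(subtitle = "Decision tree on two iris features") +
  ggplot2::theme_minimal()

print(p2)
```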