pairplot: Create partial dependence plot for a pair of predictor variables in a prediction rule ensemble (pre)

Description

pairplot creates a partial dependence plot to assess the effects of a pair of predictor variables on the predictions of the ensemble. Note that plotting partial dependence is computationally intensive: computation time increases rapidly with the number of observations and variables. For large datasets, package `plotmo` (Milborrow, 2019) provides more efficient functions for plotting partial dependence and also supports `pre` models.

Usage

pairplot(
  object,
  varnames,
  type = "both",
  gamma = NULL,
  penalty.par.val = "lambda.1se",
  response = NULL,
  nvals = c(20L, 20L),
  pred.type = "response",
  newdata = NULL,
  xlab = NULL,
  ylab = NULL,
  main = NULL,
  ...
)

Arguments

object: an object of class pre
varnames: character vector of length two. Currently, pairplots can only be requested for non-nominal variables. If varnames specifies the name(s) of variables of class "factor", an error will be printed.
type: character string. Type of plot to be generated. type = "heatmap" yields a heatmap plot, type = "contour" yields a contour plot, type = "both" yields a heatmap plot with added contours, type = "perspective" yields a three dimensional plot.
gamma: Mixing parameter for relaxed fits. See coef.cv.glmnet.
penalty.par.val: character or numeric. Value of the penalty parameter $\lambda$ to be employed for selecting the final ensemble. The default "lambda.1se" employs the $\lambda$ value within 1 standard error of the minimum cross-validated error. Alternatively, "lambda.min" may be specified, to employ the $\lambda$ value with minimum cross-validated error, or a numeric value $>0$ may be specified, with higher values yielding a sparser ensemble. To evaluate the trade-off between accuracy and sparsity of the final ensemble, inspect pre_object$glmnet.fit and plot(pre_object$glmnet.fit).
response: numeric vector of length 1. Only relevant for multivariate gaussian and multinomial responses. If NULL (default), PDPs for all response variables or categories will be produced. A single integer can be specified, indicating for which response variable or category PDPs should be produced.
nvals: optional numeric vector of length 2. For how many values of x1 and x2 should partial dependence be plotted? If NULL, a grid of all possible combinations of the observed values of the two predictor variables specified will be used (see details).
pred.type: character string. Type of prediction to be plotted on z-axis. pred.type = "response" gives fitted values for continuous outputs and fitted probabilities for nominal outputs. pred.type = "link" gives fitted values for continuous outputs and linear predictor values for nominal outputs.
newdata: Optional data.frame in which to look for variables with which to predict. If NULL, the data.frame used to fit the original ensemble will be used.
xlab: character. Label to be printed on the x-axis. If NULL, the first element of the supplied varnames will be printed on the x-axis.
ylab: character. Label to be printed on the y-axis. If NULL, the second element of the supplied varnames will be printed on the y-axis.
main: Title for the plot. If NULL, the name of the response will be printed.
...: Further arguments to be passed to image, contour or persp (depending on whether type is specified to be "heatmap", "contour", "both" or "perspective").

Details

Partial dependence functions are described in section 8.1 of Friedman & Popescu (2008).
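
As a rough illustration of what a pairwise partial dependence computation involves, the sketch below fixes two predictors at each grid point, predicts for every observation, and averages the predictions. It is not the code used by pairplot: the model is an ordinary linear model standing in for a pre ensemble (an assumption made so the example runs with base R only), and the object names (fit, grid, pd_mat) are hypothetical.

```r
# Stand-in model: a linear model, NOT a pre ensemble (assumption for illustration)
airq <- airquality[complete.cases(airquality), ]
fit <- lm(Ozone ~ Temp * Wind + Solar.R, data = airq)

# Grid of values for the two predictors of interest (5 values each)
temp_vals <- seq(min(airq$Temp), max(airq$Temp), length.out = 5)
wind_vals <- seq(min(airq$Wind), max(airq$Wind), length.out = 5)
grid <- expand.grid(Temp = temp_vals, Wind = wind_vals)

# For each grid point: set both predictors to the grid values for all
# observations, predict, and average the predictions over the observations
grid$pd <- vapply(seq_len(nrow(grid)), function(i) {
  tmp <- airq
  tmp$Temp <- grid$Temp[i]
  tmp$Wind <- grid$Wind[i]
  mean(predict(fit, newdata = tmp))
}, numeric(1))

# expand.grid varies Temp fastest, so rows index Temp and columns index Wind
pd_mat <- matrix(grid$pd, nrow = length(temp_vals))

# Heatmap with added contours, similar in spirit to type = "both"
image(temp_vals, wind_vals, pd_mat, xlab = "Temp", ylab = "Wind")
contour(temp_vals, wind_vals, pd_mat, add = TRUE)
```

In this sketch every grid point requires one prediction per observation, which suggests why larger grids and larger datasets quickly become expensive.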

By default, partial dependence will be plotted for each combination of 20 values of the specified predictor variables. When nvals = NULL is specified, a partial dependence plot will be created for every combination of the unique observed values of the two specified predictor variables. If NA instead of a numeric value is specified for one of the predictor variables, all observed values of that variable will be used. Specifying nvals = NULL and nvals = c(NA, NA) will yield the exact same result.

High values, NA or NULL for nvals result in long computation times and possibly memory problems. Also, pre ensembles derived from training datasets that are very wide or long may result in long computation times and/or memory allocation errors. In such cases, reducing the values supplied to nvals will reduce computation time and/or memory allocation errors.

When numeric value(s) are specified for nvals, values for the minimum, maximum, and nvals - 2 intermediate values of the predictor variable will be plotted.
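
The way nvals could translate into plotting values can be sketched as follows. The helper grid_values is hypothetical and not part of pre, and the even spacing of the intermediate values is an assumption; the text above only states that the minimum, the maximum and nvals - 2 intermediate values are used.

```r
airq <- airquality[complete.cases(airquality), ]

# Hypothetical helper: NA means "use all unique observed values",
# a number n means "minimum, maximum and n - 2 values in between"
# (evenly spaced here, which is an assumption)
grid_values <- function(x, n) {
  if (is.na(n)) sort(unique(x)) else seq(min(x), max(x), length.out = n)
}

nvals <- c(20L, NA)
x1_vals <- grid_values(airq$Temp, nvals[1])  # 20 values from min to max of Temp
x2_vals <- grid_values(airq$Wind, nvals[2])  # all unique observed Wind values

# Number of grid points at which partial dependence would be evaluated
length(x1_vals) * length(x2_vals)
```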

Alternatively, newdata can be specified to provide a different (smaller) set of observations to compute partial dependence over. If mi_pre was used to derive the original rule ensemble, newdata = "mean.mi" can be specified. This will result in an average dataset being computed over the imputed datasets, which is then used to compute partial dependence functions. This greatly reduces the number of observations and thereby computation time.
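
One way to obtain a smaller set of observations for newdata is to draw a random subsample of the training data, as sketched below. The subsample size of 50 and the object name airq_small are arbitrary choices for illustration, and the cost comparison assumes that the work scales with the number of observations times the number of grid points.

```r
airq <- airquality[complete.cases(airquality), ]

# Random subsample of 50 training observations (size is an arbitrary choice)
set.seed(42)
airq_small <- airq[sample(nrow(airq), size = 50), ]

# Rough number of predictions for the default 20 x 20 grid,
# assuming one prediction per observation per grid point
n_grid <- prod(c(20L, 20L))
c(full = nrow(airq) * n_grid, reduced = nrow(airq_small) * n_grid)

# With a fitted ensemble such as airq.ens from the Examples section:
# pairplot(airq.ens, c("Temp", "Wind"), newdata = airq_small)
```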

If none of the variables specified with argument varnames was selected for the final prediction rule ensemble, an error will be returned.

References

Friedman, J. H., & Popescu, B. E. (2008). Predictive learning via rule ensembles. The Annals of Applied Statistics, 2(3), 916-954.

Milborrow, S. (2019). plotmo: Plot a model's residuals, response, and partial dependence plots. https://CRAN.R-project.org/package=plotmo

Examples

library("pre")
airq <- airquality[complete.cases(airquality), ]
set.seed(42)
airq.ens <- pre(Ozone ~ ., data = airq)
pairplot(airq.ens, c("Temp", "Wind"))

## For multinomial and mgaussian families, one PDP is created per category or outcome
set.seed(42)
airq.ens3 <- pre(Ozone + Wind ~ ., data = airq, family = "mgaussian")
pairplot(airq.ens3, varnames = c("Day", "Month"))

set.seed(42)
iris.ens <- pre(Species ~ ., data = iris, family = "multinomial")
pairplot(iris.ens, varnames = c("Petal.Width", "Petal.Length"))