plotmo

0th

Percentile

Plot a model's response over a range of predictor values

Plot a model's response when varying one or two predictors while holding the other predictors constant. A poor man's partial dependence plot. See the Details section for an overview.

Keywords
regression, partial dependence
Usage
plotmo(object = stop("no 'object' arg"),
       type=NULL, nresponse = NA, clip = TRUE, ylim = NULL,
       center = FALSE, ndiscrete = 5,
       degree1 = TRUE, all1=FALSE, degree2 = TRUE, all2=FALSE, int.only.ok = TRUE,
       grid.func = median, grid.levels = NULL,
       col.response = 0, cex.response = NULL, pch.response = 20,
       jitter.response=0, npoints = -1, inverse.func = NULL,
       level = 0, shade.pints = "mistyrose2", shade2.pints = "mistyrose4",
       trace = 0,
       nrug = 0, col.degree1 = 1, lty.degree1 = 1, lwd.degree1 = 1,
       col.smooth = 0, lty.smooth = 1, lwd.smooth = 1, smooth.f = .5,
       func = NULL, col.func = "lightblue", lty.func = 1, lwd.func = 1,
       ngrid1 = 50,  grid = FALSE,
       type2 = "persp", ngrid2 = 20,
       col.image = gray(0:10/10), col.persp = "lightblue",
       theta = NA, phi = 30, dvalue = 1, shade = 0.5,
       do.par = TRUE, caption = NULL, main = NULL,
       xlim = NULL, xlab = "", ylab = "", cex = NULL, cex.lab = 1,
       xflip = FALSE, yflip = FALSE, swapxy = FALSE, se = 0, ...)
Arguments
object
Model object.
type
Type parameter passed to predict. For allowed values see the predict method for your object (such as predict.earth
nresponse
Which column to use when predict returns multiple columns. This can be a column index or column name (which may be abbreviated, partial matching is used). Ignored when predict returns a single column.
clip
Default is TRUE, meaning plot only predicted values that are in the expected range. Use FALSE to plot all values. See The clip argument section below for details.
ylim
Three possibilities: (i) NULL (default) all y axes have same limits (where y is actually z on degree2 plots). The limits are the min and max values of the predicted response across all plots (a
center
Center the plotted response. Default is FALSE. (This is an initial implementation of centering and will change.)
ndiscrete
Default 5 (a somewhat arbitrary value). Variables with no more than ndiscrete unique values are plotted as quantized in plots (a staircase rather than a curve). Factors are always considered discrete.
degree1
Index vector specifying which subset of main effect plots to include. Default is TRUE, meaning all degree1 plots (the TRUE gets recycled). Use FALSE (or 0) for no degree1 plots. Note that this
all1
Default is FALSE. Use TRUE to plot all predictors, not just those usually selected by plotmo. See Which variables are plotted? below. The all1 argument increases the
degree2
Index vector specifying which subset of interaction plots to include. Default is TRUE, meaning all degree2 plots. New in version 1.3-0: degree2 may be a character vector specifying which variables to plot (<
all2
Default is FALSE. Use TRUE to plot all pairs of predictors, not just those usually selected by plotmo. The all2 argument increases the number of plots; the degree2 argume
int.only.ok
New in version 2.0.0. Plot the model even if it is an intercept-only model. Do this by plotting a single degree1 plot for the first predictor. Use int.only.ok=FALSE to instead issue an error message for intercept-only mo
grid.func
Function applied to columns of the x matrix to fix the values of variables not on the axes. Default is median. (This argument is ignored for factors. The first level of fact
grid.levels
Default is NULL. Else a list of variables and their fixed value to be used when the variable is not on the axis. Supersedes grid.func for variables in the list. Names and values can be abbreviated, partial mat
col.response
Color of response points (or response sites in degree2 plots). This refers to the response y in the data used to build the model. Default is 0, don't plot the response. Can be a vector, for example, col.response=as.n
cex.response
Relative size of response points. Default is NULL, meaning automatic. Applies only if col.response!=0.
pch.response
Plot character for response points. Default is 20. Applies only if col.response!=0.
jitter.response
Amount to jitter the response points. Applies only if col.response!=0. Default 0, no jitter. A typical useful value is .3, but it depends on the data. Points are jittered horizontally and vertical
npoints
Number of response points to be plotted. Applies only if col.response!=0. Default is the special value -1 meaning all. Otherwise a sample of npoints points is taken.
inverse.func
Default is NULL. Else a function applied to the predicted response before plotting. For example, you could use inverse.func=exp if your model formula is log(y)~x.
level
New in version 2.0.0. Draw estimated confidence or prediction interval bands at the given level. Default is 0, bands not plotted. Else a fraction, something like level=.95. See Prediction interv
shade.pints
Color of level bands. Default is "mistyrose2", a pale pink. Applies only if level > 0.
shade2.pints
Color of additional level bands. Default is "mistyrose4". Applies only if level > 0 and currently only for lm models.
trace
Default is 0. Use 1 (or TRUE) to trace operation. Use values greater than 1 for more detailed tracing. Use -1 to inhibit the grid: message usually issued by plotmo.
nrug
Number of points in (jittered) rug. Default is 0, no rug. Special value -1 for all. Otherwise a sample of nrug points is taken.
col.degree1
Color of degree1 lines. Default is 1.
lty.degree1
Line type of degree1 lines. Default is 1.
lwd.degree1
Line width of degree1 lines. Default is 1.
col.smooth
Color of smoothed line through the response points. (The points themselves will not be plotted unless col.response is set.) This refers to the response y in the data used to build the model. Default is 0
lty.smooth
Default is 1. Applies only if col.smooth!=0.
lwd.smooth
Default is 1. Applies only if col.smooth!=0.
smooth.f
f parameter in internal call to lowess. Default is .5. Applies only if col.smooth!=0.
func
Superimpose func(x) if func is not NULL. Default is NULL. This is useful if you are comparing the model to a known function. The func is called for each plot with a single arg
col.func
Color of func line. Default is "lightblue".
lwd.func
Line width of func line. Default is 1.
lty.func
Line type of func line. Default is 1.
ngrid1
Number of points in degree1 plots. Default is 50.
grid
Default FALSE. Use TRUE to add a grid to the degree1 plots. You can also specify a color here, e.g. grid="darkgray". The following arguments are for degree
type2
Degree2 plot type. One of "persp" (default), "contour", or "image".
ngrid2
Grid size for degree2 plots (ngrid2 x ngrid2 points are plotted, but less for factors and variables with less than ngrid2 discrete values). Default is 20. Note 1: the default will often be too small for
col.image
Colors of image plot. Default is gray(0:10/10), a range of grays. Clipped values will be displayed in blue (only applies if clip=TRUE).
col.persp
Color of persp surface. Default is "lightblue". Use 0 for no color.
theta
Rotation parameter for persp. Default is NA, meaning automatically rotate each graph so the highest corner is furthest away. Use trace=1 to see the calculated value fo
phi
Passed to persp. Default is 30. Lower values to view from the side; higher to view from above.
dvalue
Passed to persp as d. Default is 1. The name was changed from d to avoid partial matching problems.
shade
Passed to persp. Default is 0.5. The following are related to par and other graphical settings.
do.par
Default is TRUE, meaning start a new page and call par as appropriate (this adjusts mfrow, cex, mar, and mgp). Use FALSE to use
caption
Overall caption. By default create captions automatically from the type, response name, and call.
main
A vector of titles, one for each plot. By default generate titles automatically from the variable names. See also caption, for the overall title.
xlim
Default is NULL, meaning automatically calculate xlim for each graph. Else used the given xlim for all degree1 graphs.
xlab
Horizontal axis label on degree1 plots (for degree2 plots the labels are always the predictor names). Default is "", no label, which gives more plottable area. The special value NULL means use the current variable
ylab
Vertical axis label. Values as for xlab.
cex
Character expansion.
cex.lab
Relative size of axis labels and text. Default 1.
xflip
Default FALSE. Use TRUE to flip the direction of the x axis. This argument (and yflip and swapxy) is useful when comparing to a plot from another source and you want the axes
yflip
Default FALSE. Use TRUE to flip the direction of the y axis of the degree2 graphs.
swapxy
Default FALSE. Use TRUE to swap the x and y axes on the degree2 graphs.
se
Deprecated. Please use the level argument instead. So instead of se=2 use level=.95.
...
Extra arguments are passed on to the plotting functions. (For persp plots, ticktype="d", nticks=2 is useful.)
Details

Plotmo can be used on a wide variety of regression models. It plots a degree1 (main effect) plot by calling predict to predict the response when changing one variable while holding all other variables at their median values. For degree2 (interaction) plots, two variables are changed while holding others at their medians. The first level is used instead of the median for factors. You can change this with the grid.func and grid.levels arguments.

Each graph shows only a thin slice of the data because most variables are fixed. Please be aware of that when interpreting the graph --- over-interpretation is a temptation.

Plotmo was originally part of the earth package and a few connections to that package still remain.

concept

partial dependence plot

Limitations

NAs are not yet supported. To prevent confusing error messages from functions called by plotmo, it is safest to remove NAs before building your model. (However, rpart models are treated specially by plotmo. For these, plotmo predicts with na.pass so plotmo can be used with rpart's default NA handling.)

Keep the variable names in the original model formula simple. Use temporary variables or attach rather than using $ and similar in formulas.

Plotmo evaluates the model data in the environment used when the model was built, if that environment was saved with the model (typically this is the case if the formula interface was used to the model function). If the environment was not saved with the model (typically if the x,y interface was used), the model data is evaluated in the environment in which plotmo is called.

Alternatives

An alternative approach is to use partial-dependence plots (e.g. The Elements of Statistical Learning 10.13.2). Plotmo sets the other variables to their median value, whereas in a partial-dependence plot at each plotted point the effect of the other variables is averaged. In general, partial-dependence plots and plotmo plots will differ, but for additive models the shape of the curves will match identically. Eventually plotmo will be enhanced to draw partial-dependence plots. Termplot is effective but can be used only on models with a predict method that supports type="terms", and it does not generate degree2 plots.

Some other possibilites for plotting the response on a per-predictor basis are partial-residual plots, partial-regression variable plots, and marginal-model plots (e.g. crPlots, avPlots, and marginalModelPlot in the car-package). These plots are orientated towards linear models. The effects-package package is also of interest.

The clip argument

With the default clip=TRUE, predicted values out of the expected range are not displayed.

Generally, the expected range is the range of the response y used when building the model. But that depends on the type of model, and plotmo knows about some special cases. For example, it knows that for some models we are predicting a probability, and it scales the axes accordingly, 0 to 1. However, plotmo cannot know about every possible model and prediction type, and will sometimes determine the expected response range incorrectly. In that case use clip=FALSE.

The default clip is TRUE because it is a useful sanity check to test that the predicted values are in the expected range. While not necessarily an error, predictions outside the expected range are usually something we want to know about. Also, with clip=FALSE, a few errant predictions can expand the entire y-axis, making it difficult to see the shape of the other predictions.

Using plotmo on various models

Here are some examples which illustrate plotmo on various objects. (The models here are just for illustrating plotmo and shouldn't be taken too seriously.) # use a small set of variables for illustration library(earth) # for ozone1 data data(ozone1) oz <- ozone1[, c("O3", "humidity", "temp", "ibt")]

lm.model <- lm(O3 ~ humidity + temp*ibt, data=oz) # linear model plotmo(lm.model, col.response="gray", nrug=-1)

library(rpart) # rpart rpart.model <- rpart(O3 ~ ., data=oz) plotmo(rpart.model, all2=TRUE)

library(randomForest) # randomForest rf.model <- randomForest(O3~., data=oz) plotmo(rf.model) # partialPlot(rf.model, oz, temp) # compare to partial-dependence plot

library(gbm) # gbm gbm.model <- gbm(O3~., data=oz, dist="gaussian", inter=2, n.trees=1000) plotmo(gbm.model) # plot(gbm.model, i.var=2) # compare to partial-dependence plots # plot(gbm.model, i.var=c(2,3))

library(mgcv) # gam gam.model <- gam(O3 ~ s(humidity)+s(temp)+s(ibt)+s(temp,ibt), data=oz) plotmo(gam.model, level=.95, all2=TRUE)

library(nnet) # nnet set.seed(4) nnet.model <- nnet(O3~., data=scale(oz), size=2, decay=0.01, trace=FALSE) plotmo(nnet.model, type="raw", all2=T)

library(MASS) # qda lcush <- data.frame(Type=as.numeric(Cushings$Type),log(Cushings[,1:2])) lcush <- lcush[1:21,] qda.model <- qda(Type~., data=lcush) plotmo(qda.model, type="class", all2=TRUE, type2="contour", ngrid2=100, nlevels=2, drawlabels=FALSE, col.response=as.numeric(lcush$Type)+1, pch.response=as.character(lcush$Type))

Prediction intervals

Use the level argument to plot pointwise confidence or prediction intervals. The predict method of the object must support this. Examples: par(mfrow=c(2,3)) log.trees <- log(trees) # make the resids more homoscedastic # (necessary for lm)

# lm lm.model <- lm(Volume~Height, data=log.trees) plot(lm.model, which=1, main="lm") # Residual vs Fitted graph plotmo(lm.model, level=.90, col.response=2, main="lm\n(conf and pred intervals)", do.par=F)

# earth (requires earth 3.3) library(earth) earth.model <- earth(Volume~Height, data=log.trees, nfold=5, ncross=30, varmod.method="lm") plotmo(earth.model, level=.90, col.response=2, main="earth", do.par=F)

# quantreg library(quantreg) rq.model <- rq(Volume~Height, data=log.trees, tau=c(.05, .5, .95)) plotmo(rq.model, level=.90, col.response=2, main="rq", do.par=F)

# quantregForest # quantregForest is a layer on randomForest that allows prediction intervals library(quantregForest) x <- data.frame(Height=log.trees$Height) qrf <- quantregForest(x, log.trees$Volume) plotmo(qrf, level=.90, col.response=2, main="qrf", do.par=F)

# gam library(mgcv) gam.model <- gam(Volume~s(Height), data=log.trees) plotmo(gam.model, level=.90, col.response=2, main="gam\n(conf not pred intervals)", do.par=F)

These are pointwise limits. They can only be interpreted in a pointwise fashion. So for non-parametric models they shouldn't be used to infer bumps or dips that are dependent on a range of the curve. For that you need simultaneous confidence bands, which none of the models above support.

Common error messages

oError in match.arg(type): 'arg' should be one of ...

The message is probably issued by the predict method for your model object. Set type to a legal value as described on the help page for the predict method for your object.oError: predicted values are out of ylim, try clip=FALSE

Probably plotmo has incorrectly determined the expected range of the response, and hence also ylim. Re-invoke plotmo with clip=FALSE. See the section The clip argument.oError: predict.lm(xgrid, type="response") returned the wrong length

oWarning: 'newdata' had 100 rows but variable(s) found have 30 rows

oError: variable 'x' was fitted with type "nmatrix.2" but type "numeric" was supplied

oError in model.frame: invalid type (list) for variable 'x[,3]'

These and similar messages usually mean that predict is misinterpreting the new data generated by plotmo.

The underlying issue is that many predict methods, including predict.lm, seem to reject any reasonably constructed new data if the function used to create the model was called in an unconventional way. The work-around is to simplify or standardize the way the model function is called. Use a formula and a data frame, or at least explicitly name the variables rather than passing a matrix. Use simple variable names (so x1 rather than dat$x1, for example).

If the symptoms persist after changing the way the model is called, and the model is not one of those listed in Which variables are plotted, it is possible that the model class is not supported by plotmo. See Extending plotmo.oError: get.plotmo.x.default cannot get the x matrix

This and similar messages mean that plotmo cannot get the data it needs from the model object.

You can try simplifying and standardizing the way the model function is called, as described above. Perhaps you need to use keepxy or similar in the call to the model function, so the data is attached to the object and available for plotmo. Is a variable that was used to build the model no longer available in the environment when plotmo is called?oError: this object is not supported by plotmo

Plotmo's default methods are insufficient for your model object. See Extending plotmo above (and contact the author --- this is often easy to fix).

FAQ

o I am not seeing any interaction plots. How can I change that?

Use all2=TRUE. By default, degree2 plots are drawn only for some types of model. See the section Which variables are plotted?.

o The persp display is unnaturally jagged. How can I change that?

Use clip=FALSE. The jaggedness is probably an artifact of the way persp works at the boundaries. You can also try increasing ngrid2.

o The image display has blue holes in it. What gives?

The holes are areas where the predicted response is out-of-range. Try using clip=FALSE.

o I want to add lines or points to a plot created by plotmo. and am having trouble getting my axis scaling right. Help?

Use do.par=FALSE or do.par=2. With the default do.par=TRUE, plotmo restores the par parameters and axis scales to their values before plotmo was called.

See Also

There is section on plotmo in the rpart.plot vignette http://www.milbo.org/rpart-plot/prp.pdf{Plotting rpart trees with prp}.

Aliases
  • plotmo
Examples
if (require(rpart)) {
    data(kyphosis)
    rpart.model <- rpart(Kyphosis~., data=kyphosis)
    plotmo(rpart.model, type="prob", nresponse="present")
}
if (require(earth)) {
    data(ozone1)
    earth.model <- earth(O3 ~ ., data=ozone1, degree=2, nk=21)
    plotmo(earth.model)
}
Documentation reproduced from package plotmo, version 2.0.0, License: GPL-3

Community examples

Looks like there are no examples yet.