plotmo: Plot a model's response over a range of predictor values

Description

Plot a model's response when varying one or two predictors while holding the other predictors constant. A poor man's partial dependence plot. See the Details section for an overview.

Usage

plotmo(object = stop("no 'object' arg"),
       type=NULL, nresponse = NA, clip = TRUE, ylim = NULL,
       center = FALSE, ndiscrete = 5,
       degree1 = TRUE, all1=FALSE, degree2 = TRUE, all2=FALSE, int.only.ok = TRUE,
       grid.func = median, grid.levels = NULL,
       col.response = 0, cex.response = NULL, pch.response = 20,
       jitter.response=0, npoints = -1, inverse.func = NULL,
       level = 0, shade.pints = "mistyrose2", shade2.pints = "mistyrose4",
       trace = 0,
       nrug = 0, col.degree1 = 1, lty.degree1 = 1, lwd.degree1 = 1,
       col.smooth = 0, lty.smooth = 1, lwd.smooth = 1, smooth.f = .5,
       func = NULL, col.func = "lightblue", lty.func = 1, lwd.func = 1,
       ngrid1 = 50,  grid = FALSE,
       type2 = "persp", ngrid2 = 20,
       col.image = gray(0:10/10), col.persp = "lightblue",
       theta = NA, phi = 30, dvalue = 1, shade = 0.5,
       do.par = TRUE, caption = NULL, main = NULL,
       xlim = NULL, xlab = "", ylab = "", cex = NULL, cex.lab = 1,
       xflip = FALSE, yflip = FALSE, swapxy = FALSE, se = 0, ...)

concept

partial dependence plot

Limitations

NAs are not yet supported. To prevent confusing error messages from functions called by plotmo, it is safest to remove NAs before building your model. (However, rpart models are treated specially by plotmo. For these, plotmo predicts with na.pass so plotmo can be used with rpart's default NA handling.)

Keep the variable names in the original model formula simple. Use temporary variables or attach rather than using $ and similar in formulas.

Plotmo evaluates the model data in the environment used when the model was built, if that environment was saved with the model (typically this is the case if the formula interface was used to the model function). If the environment was not saved with the model (typically if the x,y interface was used), the model data is evaluated in the environment in which plotmo is called.

Alternatives

An alternative approach is to use partial-dependence plots (e.g. The Elements of Statistical Learning 10.13.2). Plotmo sets the other variables to their median value, whereas in a partial-dependence plot at each plotted point the effect of the other variables is averaged. In general, partial-dependence plots and plotmo plots will differ, but for additive models the shape of the curves will match identically. Eventually plotmo will be enhanced to draw partial-dependence plots. Termplot is effective but can be used only on models with a predict method that supports type="terms", and it does not generate degree2 plots.

Some other possibilites for plotting the response on a per-predictor basis are partial-residual plots, partial-regression variable plots, and marginal-model plots (e.g. crPlots, avPlots, and marginalModelPlot in the car-package). These plots are orientated towards linear models. The effects-package package is also of interest.

The <code>clip</code> argument

With the default clip=TRUE, predicted values out of the expected range are not displayed.

Generally, the expected range is the range of the response y used when building the model. But that depends on the type of model, and plotmo knows about some special cases. For example, it knows that for some models we are predicting a probability, and it scales the axes accordingly, 0 to 1. However, plotmo cannot know about every possible model and prediction type, and will sometimes determine the expected response range incorrectly. In that case use clip=FALSE.

The default clip is TRUE because it is a useful sanity check to test that the predicted values are in the expected range. While not necessarily an error, predictions outside the expected range are usually something we want to know about. Also, with clip=FALSE, a few errant predictions can expand the entire y-axis, making it difficult to see the shape of the other predictions.

Using <code>plotmo</code> on various models

Here are some examples which illustrate plotmo on various objects. (The models here are just for illustrating plotmo and shouldn't be taken too seriously.) # use a small set of variables for illustration library(earth) # for ozone1 data data(ozone1) oz <- ozone1[, c("O3", "humidity", "temp", "ibt")]

lm.model <- lm(O3 ~ humidity + temp*ibt, data=oz) # linear model plotmo(lm.model, col.response="gray", nrug=-1)

library(rpart) # rpart rpart.model <- rpart(O3 ~ ., data=oz) plotmo(rpart.model, all2=TRUE)

library(randomForest) # randomForest rf.model <- randomForest(O3~., data=oz) plotmo(rf.model) # partialPlot(rf.model, oz, temp) # compare to partial-dependence plot

library(gbm) # gbm gbm.model <- gbm(O3~., data=oz, dist="gaussian", inter=2, n.trees=1000) plotmo(gbm.model) # plot(gbm.model, i.var=2) # compare to partial-dependence plots # plot(gbm.model, i.var=c(2,3))

library(mgcv) # gam gam.model <- gam(O3 ~ s(humidity)+s(temp)+s(ibt)+s(temp,ibt), data=oz) plotmo(gam.model, level=.95, all2=TRUE)

library(nnet) # nnet set.seed(4) nnet.model <- nnet(O3~., data=scale(oz), size=2, decay=0.01, trace=FALSE) plotmo(nnet.model, type="raw", all2=T)

library(MASS) # qda lcush <- data.frame(Type=as.numeric(Cushings$Type),log(Cushings[,1:2])) lcush <- lcush[1:21,] qda.model <- qda(Type~., data=lcush) plotmo(qda.model, type="class", all2=TRUE, type2="contour", ngrid2=100, nlevels=2, drawlabels=FALSE, col.response=as.numeric(lcush$Type)+1, pch.response=as.character(lcush$Type))

Prediction intervals

Use the level argument to plot pointwise confidence or prediction intervals. The predict method of the object must support this. Examples: par(mfrow=c(2,3)) log.trees <- log(trees) # make the resids more homoscedastic # (necessary for lm)

# lm lm.model <- lm(Volume~Height, data=log.trees) plot(lm.model, which=1, main="lm") # Residual vs Fitted graph plotmo(lm.model, level=.90, col.response=2, main="lm\n(conf and pred intervals)", do.par=F)

# earth (requires earth 3.3) library(earth) earth.model <- earth(Volume~Height, data=log.trees, nfold=5, ncross=30, varmod.method="lm") plotmo(earth.model, level=.90, col.response=2, main="earth", do.par=F)

# quantreg library(quantreg) rq.model <- rq(Volume~Height, data=log.trees, tau=c(.05, .5, .95)) plotmo(rq.model, level=.90, col.response=2, main="rq", do.par=F)

# quantregForest # quantregForest is a layer on randomForest that allows prediction intervals library(quantregForest) x <- data.frame(Height=log.trees$Height) qrf <- quantregForest(x, log.trees$Volume) plotmo(qrf, level=.90, col.response=2, main="qrf", do.par=F)

# gam library(mgcv) gam.model <- gam(Volume~s(Height), data=log.trees) plotmo(gam.model, level=.90, col.response=2, main="gam\n(conf not pred intervals)", do.par=F)

These are pointwise limits. They can only be interpreted in a pointwise fashion. So for non-parametric models they shouldn't be used to infer bumps or dips that are dependent on a range of the curve. For that you need simultaneous confidence bands, which none of the models above support.

Common error messages

oError in match.arg(type): 'arg' should be one of ...

The message is probably issued by the predict method for your model object. Set type to a legal value as described on the help page for the predict method for your object.oError: predicted values are out of ylim, try clip=FALSE

Probably plotmo has incorrectly determined the expected range of the response, and hence also ylim. Re-invoke plotmo with clip=FALSE. See the section The clip argument.oError: predict.lm(xgrid, type="response") returned the wrong length

oWarning: 'newdata' had 100 rows but variable(s) found have 30 rows

oError: variable 'x' was fitted with type "nmatrix.2" but type "numeric" was supplied

oError in model.frame: invalid type (list) for variable 'x[,3]'

These and similar messages usually mean that predict is misinterpreting the new data generated by plotmo.

The underlying issue is that many predict methods, including predict.lm, seem to reject any reasonably constructed new data if the function used to create the model was called in an unconventional way. The work-around is to simplify or standardize the way the model function is called. Use a formula and a data frame, or at least explicitly name the variables rather than passing a matrix. Use simple variable names (so x1 rather than dat$x1, for example).

If the symptoms persist after changing the way the model is called, and the model is not one of those listed in Which variables are plotted, it is possible that the model class is not supported by plotmo. See Extending plotmo.oError: get.plotmo.x.default cannot get the x matrix

This and similar messages mean that plotmo cannot get the data it needs from the model object.

You can try simplifying and standardizing the way the model function is called, as described above. Perhaps you need to use keepxy or similar in the call to the model function, so the data is attached to the object and available for plotmo. Is a variable that was used to build the model no longer available in the environment when plotmo is called?oError: this object is not supported by plotmo

Plotmo's default methods are insufficient for your model object. See Extending plotmo above (and contact the author --- this is often easy to fix).

FAQ

o I am not seeing any interaction plots. How can I change that?

Use all2=TRUE. By default, degree2 plots are drawn only for some types of model. See the section Which variables are plotted?.

o The persp display is unnaturally jagged. How can I change that?

Use clip=FALSE. The jaggedness is probably an artifact of the way persp works at the boundaries. You can also try increasing ngrid2.

o The image display has blue holes in it. What gives?

The holes are areas where the predicted response is out-of-range. Try using clip=FALSE.

o I want to add lines or points to a plot created by plotmo. and am having trouble getting my axis scaling right. Help?

Use do.par=FALSE or do.par=2. With the default do.par=TRUE, plotmo restores the par parameters and axis scales to their values before plotmo was called.

Details

Plotmo can be used on a wide variety of regression models. It plots a degree1 (main effect) plot by calling predict to predict the response when changing one variable while holding all other variables at their median values. For degree2 (interaction) plots, two variables are changed while holding others at their medians. The first level is used instead of the median for factors. You can change this with the grid.func and grid.levels arguments.

Each graph shows only a thin slice of the data because most variables are fixed. Please be aware of that when interpreting the graph --- over-interpretation is a temptation.

Plotmo was originally part of the earth package and a few connections to that package still remain.

Examples

Run this code

if (require(rpart)) {
    data(kyphosis)
    rpart.model <- rpart(Kyphosis~., data=kyphosis)
    plotmo(rpart.model, type="prob", nresponse="present")
}
if (require(earth)) {
    data(ozone1)
    earth.model <- earth(O3 ~ ., data=ozone1, degree=2, nk=21)
    plotmo(earth.model)
}

Run the code above in your browser using DataLab