scatterplotMatrix: Scatterplot Matrices

Description

Enhanced scatterplot matrices with univariate displays down the diagonal; spm is an abbreviation for scatterplotMatrix. This function just sets up a call to pairs with custom panel functions.

Usage

scatterplotMatrix(x, ...)

## S3 method for class 'formula':
scatterplotMatrix(formula, data=NULL, subset, labels, ...)

## S3 method for class 'default':
scatterplotMatrix(x, var.labels=colnames(x), 
    diagonal=c("density", "boxplot", "histogram", "oned", "qqplot", "none"), 
    adjust=1, nclass,
    plot.points=TRUE, smoother=loessLine, smoother.args=list(), smooth, span,
    spread = !by.groups, reg.line=lm,
    transform=FALSE, family=c("bcPower", "yjPower"),
    ellipse=FALSE, levels=c(.5, .95), robust=TRUE,
    groups=NULL, by.groups=FALSE, 
    use=c("complete.obs", "pairwise.complete.obs"),
    labels, id.method="mahal", id.n=0, id.cex=1, id.col=palette()[1],
    col=if (n.groups == 1) palette()[3:1] else rep(palette(), length=n.groups),
    pch=1:n.groups, lwd=1, lty=1, 
    cex=par("cex"), cex.axis=par("cex.axis"), cex.labels=NULL, 
    cex.main=par("cex.main"), 
    legend.plot=length(levels(groups)) > 1, legend.pos=NULL, row1attop=TRUE, ...)

spm(x, ...)

Arguments

x

a data matrix or a numeric data frame.

formula

a one-sided model formula, of the form ~ x1 + x2 + ... + xk or ~ x1 + x2 + ... + xk | z where z evaluates to a factor or other variable to divide the data into groups.

data

for scatterplotMatrix.formula, a data frame within which to evaluate the formula.

subset

expression defining a subset of observations.

labels,id.method,id.n,id.cex,id.col

Arguments for the labelling of points. The default is id.n=0 for labeling no points. See showLabels for details of these arguments. If the plot uses different colors for gr

var.labels

variable labels (for the diagonal of the plot).

diagonal

contents of the diagonal panels of the plot. If plotting by groups, a different univariate display (with the exception of "histogram") will be drawn for each group.

adjust

relative bandwidth for the density estimate, passed to the density function.

nclass

number of bins for the histogram, passed to the hist function.

plot.points

if TRUE the points are plotted in each off-diagonal panel.

smoother

a function to draw a nonparametric-regression smooth; the default, as shown in Usage, is loessLine. An alternative is gamLine, which uses the gam function in the mgcv package. For this and o

smoother.args

a list of named values to be passed to the smoother function; the specified elements of the list depend upon the smoother (see ScatterplotSmoothers).

smooth, span

these arguments are included for backwards compatibility: if smooth=TRUE then smoother is set to loessLine, and if span is specified, it is added to smoother.args.

spread

if TRUE, estimate the square root of the variance function. For loessLine and for gamLine, this is done by separately smoothing the squares of the positive and negative residuals from the mean fit, and then adding the

reg.line

if not FALSE a line is plotted using the function given by this argument; e.g., using rlm in package MASS plots a robust-regression line.

transform

if TRUE, multivariate normalizing power transformations are computed with powerTransform, rounding the estimated powers to `nice' values for plotting; if a vector of powers, o

family

family of transformations to estimate: "bcPower" for the Box-Cox family or "yjPower" for the Yeo-Johnson family (see powerTransform).

ellipse

if TRUE data-concentration ellipses are plotted in the off-diagonal panels.

levels

level or levels at which concentration ellipses are plotted; the default is c(.5, .95).

robust

if TRUE use the cov.trob function in the MASS package to calculate the center and covariance matrix for the data ellipses.

groups

a factor or other variable dividing the data into groups; groups are plotted with different colors and plotting characters.

by.groups

if TRUE, regression lines are fit by groups.

use

if "complete.obs" (the default), cases with missing data are omitted; if "pairwise.complete.obs", all valid cases are used in each panel of the plot.

pch

plotting characters for points; default is the plotting characters in order (see par).

col

colors for lines and points; the default is taken from the color palette, with palette()[3] for linear regression lines, palette()[2] for nonparametric regression lines, and palette()[1] for points if there

lwd

width of linear-regression lines (default 1).

lty

type of linear-regression lines (default 1, solid line).

cex, cex.axis, cex.labels, cex.main

set sizes of various graphical elements (see par).

legend.plot

if TRUE then a legend for the groups is plotted in the first diagonal cell.

legend.pos

position for the legend, specified as one of the keywords accepted by legend. If NULL, the default, the position will vary by the diagonal argument --- e.g., "topri

row1attop

If TRUE (the default) the first row is at the top, as in a matrix, as opposed to at the bottom, as in a graph (argument suggested by Richard Heiberger).

...

arguments to pass down.
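
The difference between the two settings of use can be illustrated with base R alone. The following is a minimal sketch of the row-selection idea, not the function's internal code; the data frame dat and its columns are made up for this illustration.

```r
# Hypothetical data with missing values in two of three columns
dat <- data.frame(a = c(1, 2, NA, 4, 5),
                  b = c(2, NA, 6, 8, 10),
                  c = c(5, 3, 4, 1, 2))

# "complete.obs": a row is kept only if every variable is observed,
# so every panel shows the same cases
complete_rows <- dat[complete.cases(dat), ]
nrow(complete_rows)   # 3

# "pairwise.complete.obs": each panel keeps the rows that are complete
# for its own two variables, here the (a, c) panel
pairwise_ac <- dat[complete.cases(dat[, c("a", "c")]), c("a", "c")]
nrow(pairwise_ac)     # 4
```

With pairwise deletion the panels can be based on different numbers of cases, which is worth keeping in mind when comparing them.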

Value

NULL. This function is used for its side effect: producing a plot.

References

Fox, J. and Weisberg, S. (2011) An R Companion to Applied Regression, Second Edition, Sage.

Examples


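library(car)  # provides scatterplotMatrix; the Duncan data come with it

# separate plotting symbols and colors for each level of type
scatterplotMatrix(~ income + education + prestige | type, data=Duncan)
# transform the variables toward multivariate normality before plotting
scatterplotMatrix(~ income + education + prestige, 
    transform=TRUE, data=Duncan, smoother=loessLine)
# no smoother; regression lines fit separately within each group
scatterplotMatrix(~ income + education + prestige | type, smoother=FALSE, 
    by.groups=TRUE, transform=TRUE, data=Duncan)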
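
As the Description notes, the function sets up a call to pairs with custom panel functions. The sketch below shows that general mechanism using base R only. It is not the car implementation: panel.density and panel.fit are hypothetical names chosen for this illustration, and the built-in mtcars data stand in for any numeric data frame.

```r
# Diagonal panel: a kernel density estimate with a rug of the data
panel.density <- function(x, ...) {
    usr <- par("usr")
    on.exit(par(usr = usr))          # restore the coordinates pairs() set up
    d <- density(x, na.rm = TRUE)
    # keep the x-range of the panel, rescale the y-range to fit the density
    par(usr = c(usr[1:2], 0, 1.5 * max(d$y)))
    lines(d)
    rug(x)
}

# Off-diagonal panel: points, a least-squares line, and a lowess smooth
panel.fit <- function(x, y, ...) {
    points(x, y, ...)
    ok <- is.finite(x) & is.finite(y)
    abline(lm(y[ok] ~ x[ok]), col = "green")
    lines(lowess(x[ok], y[ok]), col = "red")
}

dat <- mtcars[, c("mpg", "disp", "hp", "wt")]
pairs(dat, panel = panel.fit, diag.panel = panel.density)
```

Each panel function draws into a plot region that pairs has already set up, which is why the diagonal panel temporarily changes par("usr") and restores it on exit.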
