gpairs (version 1.3.3)

gpairs: Generalized Pairs Plots

Description

Produces a matrix of plots showing pairwise relationships between quantitative and categorical variables in a complex data set.

Usage

gpairs(x,
       upper.pars = list(scatter = "points",
                         conditional = "barcode",
                         mosaic = "mosaic"),
       lower.pars = list(scatter = "points",
                         conditional = "boxplot",
                         mosaic = "mosaic"),
       diagonal = "default",
       outer.margins = list(bottom = unit(2, "lines"), 
                            left = unit(2, "lines"), 
                            top = unit(2, "lines"), 
                            right = unit(2, "lines")), 
       xylim = NULL,
       outer.labels = NULL, outer.rot = c(0, 90), gap = 0.05, 
       buffer = 0.02, reorder = NULL, cluster.pars = NULL, 
       stat.pars = NULL, scatter.pars = NULL, 
       bwplot.pars = NULL, stripplot.pars = NULL, barcode.pars=NULL,
       mosaic.pars = NULL, axis.pars = NULL, diag.pars = NULL, 
       whatis = FALSE)

corrgram(x)

Arguments

x

a data frame (or matrix) whose columns are to be examined for pairwise relationships. Any combination of quantitative and categorical variables is acceptable.

upper.pars

see Details

lower.pars

see Details

diagonal

by default, the diagonal from the top left to the bottom right is used for displaying the variable names (and, in our version, the marginal distributions of the variables); diagonal="other" will place the diagonal running from the top right down to the bottom left.

outer.margins

a list of length 4 with units as components named bottom, left, top, and right, giving the outer margins; the default uses two lines of text. A vector of length 4 with units (ordered properly) will work, as will a vector of length 4 with numeric values (interpreted as lines).
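
The three accepted forms of outer.margins can be sketched as follows. This is a minimal illustration, assuming that "ordered properly" means the order bottom, left, top, right used in the default; the variable names are hypothetical, and the gpairs call runs only if the package is installed.

```r
library(grid)  # provides unit()

# Form 1: named list of units (the documented default).
margins_list <- list(bottom = unit(2, "lines"), left = unit(2, "lines"),
                     top = unit(2, "lines"), right = unit(2, "lines"))

# Form 2: a unit vector of length 4.
# Assumed order: bottom, left, top, right.
margins_units <- unit(c(2, 2, 2, 2), "lines")

# Form 3: a plain numeric vector of length 4, interpreted as lines.
margins_numeric <- c(2, 2, 2, 2)

if (requireNamespace("gpairs", quietly = TRUE)) {
  gpairs::gpairs(iris, outer.margins = margins_list)
}
```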

xylim

optionally specify a single range to be used as xlim and ylim where appropriate. Note that if this option causes cropping, it will fail to work in barcode panels.
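
One way to choose a shared range that avoids cropping is to take the range over all quantitative columns. The sketch below assumes that xylim accepts a length-2 numeric vector c(min, max); the names num_cols and common are illustrative only, and the gpairs call runs only if the package is installed.

```r
# Identify the quantitative columns of the data frame.
num_cols <- vapply(iris, is.numeric, logical(1))

# A single range covering every numeric value, so no panel is cropped.
common <- range(unlist(iris[num_cols]))

if (requireNamespace("gpairs", quietly = TRUE)) {
  # Assumption: xylim takes a length-2 numeric vector.
  gpairs::gpairs(iris[num_cols], xylim = common)
}
```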

outer.labels

the default is NULL, for alternating axis labels around the perimeter. If "all", all labels are printed, and if "none" no labels are printed.

outer.rot

a 2-vector (x, y) rotating the top/bottom outer labels x degrees and the left/right outer labels y degrees. Only works for categorical labels of boxplot and mosaic panels.

gap

the gap between the tiles; defaults to 0.05 of the width of a tile.

buffer

the fraction by which to expand the range of quantitative variables so that plotting symbols are not truncated. Defaults to 0.02 (2 percent of the range), as shown in Usage.
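
The effect of buffer can be pictured with a small helper. This is only a sketch: expand_range is a hypothetical function, not part of gpairs, and it assumes the range is widened by the given fraction on each side, which the documentation does not state explicitly.

```r
# Hypothetical helper: widen the range of x by `buffer` times its width
# on each side (assumed behaviour, not taken from the gpairs source).
expand_range <- function(x, buffer = 0.02) {
  r <- range(x, na.rm = TRUE)
  r + c(-1, 1) * buffer * diff(r)
}

expand_range(c(0, 10))  # -0.2 10.2
```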

reorder

currently supports only the string "cluster", which reorders the columns according to the output of hclust. Note that factors are coerced to numbers by replacing them with integers, which implicitly assumes what is probably an arbitrary ordering.

cluster.pars

a list with two elements named dist.method and hclust.method. These are passed respectively to dist and hclust. NULL is equivalent to list(dist.method = "euclidean", hclust.method = "complete").
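
The reordering described for reorder = "cluster" can be approximated in base R. The sketch below is not the gpairs implementation: it assumes that columns are clustered by computing distances between the transposed (column-wise) data, and the names cluster_pars, num, and col_order are illustrative.

```r
# Defaults documented for cluster.pars.
cluster_pars <- list(dist.method = "euclidean", hclust.method = "complete")

# Coerce factors to their integer codes, as the reorder argument describes.
num <- as.data.frame(lapply(iris, function(col) {
  if (is.factor(col)) as.integer(col) else col
}))

# Assumption: distances are taken between columns, hence the transpose.
d <- dist(t(as.matrix(num)), method = cluster_pars$dist.method)
h <- hclust(d, method = cluster_pars$hclust.method)

# Column names in the order given by the dendrogram.
col_order <- names(num)[h$order]
print(col_order)
```

Because Species is replaced by the codes 1, 2, 3, its position in the ordering depends on an essentially arbitrary numbering of the levels.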

stat.pars

NULL is equivalent to list(fontsize = 7, signif = 0.05, verbose = FALSE, use.color = TRUE, missing = 'missing', just = 'centre'). stat.pars$verbose can be TRUE (providing 5 statistics), FALSE (providing 2 statistics), or NA (nothing). The string missing is used in summaries where there are missing values; fontsize and just control the size and justification of the text summaries (see grid.text and gpar). The use.color = FALSE option provides an alternative summary of the strength of the correlation (see Green and Hickey (2006)). stat.pars is used only with scatter = "stats" in upper.pars and lower.pars.
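
A full stat.pars list can be built by overriding selected defaults. This is a hedged sketch: stat_defaults and my_stat_pars are illustrative names, modifyList comes from the utils package, and the gpairs call runs only if the package is installed.

```r
# The documented defaults, written out explicitly.
stat_defaults <- list(fontsize = 7, signif = 0.05, verbose = FALSE,
                      use.color = TRUE, missing = "missing", just = "centre")

# Override only the elements of interest; the rest keep their defaults.
my_stat_pars <- modifyList(stat_defaults, list(verbose = TRUE, fontsize = 8))

if (requireNamespace("gpairs", quietly = TRUE)) {
  # stat.pars only takes effect in panels drawn with scatter = "stats".
  gpairs::gpairs(iris, upper.pars = list(scatter = "stats"),
                 stat.pars = my_stat_pars)
}
```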

scatter.pars

NULL is equivalent to list(pch = 1, size = unit(0.25, "char"), col = "black", frame.fill = NULL, border.col = "black").

bwplot.pars

NULL by default; passed to bwplot for producing boxplots.

stripplot.pars

NULL is equivalent to list(pch = 1, size = unit(0.5, 'char'), col = 'black', jitter = FALSE).

barcode.pars

NULL is equivalent to list(nint = 0, ptsize = unit(0.25, "char"), ptpch = 1, bcspace = NULL, use.points = FALSE).

mosaic.pars

NULL. Currently shade, gp_labels, gp, and gp_args are passed through to strucplot for producing mosaic tiles.

axis.pars

NULL is equivalent to list(n.ticks = 5, fontsize = 9).

diag.pars

NULL is equivalent to list(fontsize = 9, show.hist = TRUE, hist.color = 'black').

whatis

default is FALSE; TRUE returns whatis(x).

Value

If whatis=TRUE, the value is a data frame containing variable names, types, numbers of missing values, numbers of distinct values, precisions, maxima and minima.

Details

In some cases, the graphics device cannot be resized after the plot is produced because of the way rotation of barcodes is performed.

upper.pars and lower.pars are lists that may contain the named elements 'scatter', 'conditional', and 'mosaic'. Each element is a single string chosen from the following options: 'scatter' is exactly one of ('points', 'lm', 'ci', 'symlm', 'loess', 'corrgram', 'stats', 'qqplot'); 'conditional' is exactly one of ('boxplot', 'stripplot', 'barcode'); 'mosaic' is 'mosaic' (the only option currently implemented).
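
The allowed values can be checked before calling gpairs. The helper below, check_panel_pars, is hypothetical and not part of the package; it simply encodes the options listed above.

```r
# Options listed in Details for each panel type.
panel_options <- list(
  scatter     = c("points", "lm", "ci", "symlm", "loess",
                  "corrgram", "stats", "qqplot"),
  conditional = c("boxplot", "stripplot", "barcode"),
  mosaic      = "mosaic"
)

# Hypothetical validator for an upper.pars or lower.pars list.
check_panel_pars <- function(pars) {
  unknown <- setdiff(names(pars), names(panel_options))
  if (length(unknown) > 0) {
    stop("unknown element(s): ", paste(unknown, collapse = ", "))
  }
  for (nm in names(pars)) {
    value <- pars[[nm]]
    ok <- is.character(value) && length(value) == 1 &&
      value %in% panel_options[[nm]]
    if (!ok) {
      stop(nm, " must be exactly one of: ",
           paste(panel_options[[nm]], collapse = ", "))
    }
  }
  invisible(pars)
}

check_panel_pars(list(scatter = "loess", conditional = "boxplot"))
```

A call such as check_panel_pars(list(scatter = "spline")) stops with an error, since 'spline' is not among the listed scatter options.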

corrgram() is a wrapper around gpairs() that produces a 'corrgram' in the style of Michael Friendly.

References

Emerson, John W. (1998) "Mosaic Displays in S-PLUS: A General Implementation and a Case Study." Statistical Computing and Graphics Newsletter Vol. 9, No. 1.

Basford, K. E. and J. W. Tukey (1999) Graphical Analysis of Multiresponse Data: Illustrated with a Plant Breeding Trial.

Friendly, M. (2000). Visualizing Categorical Data. SAS Press.

Friendly, M. (2002) "Corrgrams: Exploratory displays for correlation matrices." American Statistician 56(4), 316-324.

Green, W. A. (2006) "Loosening the CLAMP: An exploratory graphical approach to the Climate Leaf Analysis Multivariate Program." Palaeontologia Electronica 9(2):9A.

See Also

pairs, splom, mosaicplot, strucplot, bwplot, barcode, stripplot.

Examples

# NOT RUN {
allexamples <- FALSE

y <- data.frame(A=c(rep("red", 100), rep("blue", 100)),
                B=c(rnorm(100),round(rnorm(100,5,1),1)), C=runif(200),
                D=c(rep("big", 150), rep("small", 50)),
                E=rnorm(200), stringsAsFactors=TRUE)
gpairs(y)

data(iris)
gpairs(iris)
if (allexamples) {
  gpairs(iris, upper.pars = list(scatter = 'stats'),
         scatter.pars = list(pch = substr(as.character(iris$Species), 1, 1),
                             col = as.numeric(iris$Species)),
         stat.pars = list(verbose = FALSE))
  gpairs(iris, lower.pars = list(scatter = 'corrgram'),
         upper.pars = list(conditional = 'boxplot', scatter = 'loess'),
         scatter.pars = list(pch = 20))
}

data(Leaves)
gpairs(Leaves[1:10], lower.pars = list(scatter = 'loess'))
if (allexamples) {
  gpairs(Leaves[1:10], upper.pars = list(scatter = 'stats'),
         lower.pars = list(scatter = 'corrgram'),
         stat.pars = list(verbose = FALSE), gap = 0)
  corrgram(Leaves[,-33])
}

runexample <- FALSE
if (runexample) {
  data(NewHavenResidential)
  gpairs(NewHavenResidential)
}

# }
