tabplot (version 0.10-1)

tableplot: Visualization of large multivariate datasets.

Description

Visualization of large multivariate datasets.

Usage

tableplot(dat, colNames=names(dat), sortCol=1, decreasing=TRUE, scales="auto",
    pals=list(1, 9, 3, 10), nBins=100, from=0, to=100,
    bias_brokenX=0.8, IQR_bias=5, plot=TRUE, ...)

Arguments

dat
a data.frame, data.table, or an ffdf object (required)
colNames
character vector containing the names of the columns of dat that are visualized in the tableplot. If omitted, all columns are visualized. All selected columns should be of class numeric, integer, factor, or logical.
sortCol
columns that are sorted. sortCol is either a vector of column names or a vector of indices of colNames.
decreasing
determines whether the columns are sorted decreasingly (TRUE) or increasingly (FALSE). decreasing can be either a single value that applies to all sorted columns, or a vector of the same length as sortCol.
scales
determines the horizontal axes of the numeric variables in colNames, options: "lin", "log", and "auto" for automatic detection. If necessary, scales is recycled.
pals
list of color palettes. Each list item is one of the following:
  • an index number between 1 and 16. In this case, the default palette is used, with the index number being the first color that is used.
  • a palette name in
nBins
number of row bins
from
percentage from which the data is shown
to
percentage to which the data is shown
bias_brokenX
parameter between 0 and 1 that determines when the x-axis of a numeric variable is broken. If the minimum value is at least bias_brokenX times the maximum value, then the x-axis is broken. To turn off broken x-axes, set bias_brokenX=1.
IQR_bias
parameter that determines when a logarithmic scale is used when scales is set to "auto". The argument IQR_bias is multiplied by the interquartile range as a test.
plot
logical; whether the tableplot is plotted
...
arguments passed to plot.tabplot
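
The bias_brokenX rule described above can be sketched in a few lines of base R. This is only an illustration of the documented rule, not the code tabplot uses: the helper name is_broken_x is hypothetical, and the sketch assumes positive values and applies the test to whatever vector it is given.

```r
# Hypothetical helper, not part of tabplot: applies the documented rule
# "minimum >= bias_brokenX * maximum" to a vector of positive values.
is_broken_x <- function(values, bias_brokenX = 0.8) {
  rng <- range(values, na.rm = TRUE)   # c(minimum, maximum)
  rng[1] >= bias_brokenX * rng[2]
}

narrow <- is_broken_x(c(90, 95, 100))  # 90 >= 0.8 * 100
wide   <- is_broken_x(c(10, 50, 100))  # 10 <  0.8 * 100
off    <- is_broken_x(c(90, 95, 100), bias_brokenX = 1)

print(c(narrow = narrow, wide = wide, off = off))
```

With bias_brokenX=1 the test only succeeds when all values are equal, which is why that setting effectively turns broken x-axes off.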

Value

Details

A tableplot is a visualisation of a (large) multivariate dataset. Each column represents a variable and each row bin is an aggregate of a certain number of records. For numeric variables, a bar chart of the mean values is depicted. For categorical variables, a stacked bar chart is depicted of the proportions of categories. Missing values are taken into account. Also supports large ffdf datasets from the ff package. Use tableGUI to customize this function with a GUI.
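
The aggregation described above can be approximated in base R. The following is a minimal sketch under stated assumptions, not the tabplot implementation: the data are simulated, n_bins is a hypothetical stand-in for nBins, and missing values are simply placed last and ignored when averaging, which may differ from how tableplot itself handles them.

```r
# Illustrative only: base R approximation of row binning.
set.seed(1)
dat <- data.frame(
  x = c(rnorm(995, mean = 50, sd = 10), rep(NA, 5)),
  g = factor(sample(c("a", "b", "c"), 1000, replace = TRUE))
)
n_bins <- 10  # hypothetical; plays the role of nBins

# Sort decreasingly on x (missing values last), then assign
# consecutive rows to equally sized row bins.
ord    <- order(dat$x, decreasing = TRUE, na.last = TRUE)
sorted <- dat[ord, ]
bin    <- ceiling(seq_len(nrow(sorted)) * n_bins / nrow(sorted))

# Numeric variable: mean value per row bin.
bin_means <- tapply(sorted$x, bin, mean, na.rm = TRUE)

# Categorical variable: proportion of each category per row bin.
bin_props <- prop.table(table(bin, sorted$g), margin = 1)

print(round(bin_means, 1))
print(round(bin_props, 2))
```

Each entry of bin_means corresponds to one bar of a numeric column, and each row of bin_props to one stacked bar of a categorical column.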

Examples

# load tabplot and the diamonds dataset from ggplot2
library(tabplot)
library(ggplot2)
data(diamonds)

# default tableplot
tableplot(diamonds)

# customized tableplot
tableplot(diamonds, colNames=c("carat", "cut", "color", "clarity", "price"), sortCol="price", from=0, to=5)