tabplot (version 0.12)

tableplot: Create a tableplot

Description

A tableplot is a visualisation of (large) multivariate datasets. Each column represents a variable and each row bin is an aggregate of a certain number of records. For numeric variables, a bar chart of the mean values is depicted. For categorical variables, a stacked bar chart is depicted of the proportions of categories. Missing values are taken into account. Also supports large ffdf datasets from the ff package.
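
The aggregation described above can be illustrated in base R. This is a minimal sketch of the idea only, not the tabplot implementation: the toy data, the variable names, and the bin count of 10 are assumptions chosen for illustration, and the way tabplot itself treats missing values may differ.

```r
# Base-R sketch of row binning; not the tabplot implementation.
set.seed(1)
n <- 1000
dat <- data.frame(
  price   = round(rexp(n, rate = 1 / 4000)),
  quality = factor(sample(c("Fair", "Good", "Ideal"), n, replace = TRUE))
)
dat$price[sample(n, 20)] <- NA  # a few missing values

n_bins <- 10  # illustrative; tableplot's nBins defaults to 100

# Sort by the numeric column (decreasing), with missing values last
sorted <- dat[order(dat$price, decreasing = TRUE, na.last = TRUE), ]

# Assign consecutive sorted rows to equally sized row bins
bin <- ceiling(seq_len(n) * n_bins / n)

# Numeric variable: one mean per row bin (the bar lengths)
bin_means <- tapply(sorted$price, bin, mean, na.rm = TRUE)

# Share of missing values per row bin
bin_missing <- tapply(is.na(sorted$price), bin, mean)

# Categorical variable: proportion of each category per row bin
bin_props <- prop.table(table(bin, sorted$quality), margin = 1)

print(round(bin_means))
print(round(bin_props, 2))
```

Each row of `bin_props` sums to 1, which corresponds to one stacked bar per row bin.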

Usage

tableplot(dat, select, subset = NULL, sortCol = 1,
    decreasing = TRUE, nBins = 100, from = 0, to = 100,
    nCols = ncol(dat), scales = "auto",
    pals = list("Set1", "Set2", "Set3", "Set4"),
    colorNA = "#FF1414", numPals = "Blues",
    bias_brokenX = 0.8, IQR_bias = 5, select_string = NULL,
    subset_string = NULL, colNames = NULL, filter = NULL,
    plot = TRUE, ...)

Arguments

dat
a data.frame, data.table, or an ffdf object (required).
select
expression indicating the columns of dat that are visualized in the tableplot. Column indices are also supported. By default, all columns are visualized. Use select_string for character strings instead of expressions.
subset
logical expression indicating which rows to select in dat (as in subset). It is also possible to provide the name of a categorical variable: then, a tableplot for each category is generated.
sortCol
expression indicating the column(s) to sort by. Indices are also supported. Character strings can be used as well, but this is discouraged for programming purposes (use indices instead).
decreasing
determines whether the columns are sorted decreasingly (TRUE) or increasingly (FALSE). decreasing can be either a single value that applies to all sorted columns, or a vector with one value per sorted column.
nBins
number of row bins
from
percentage from which the data is shown
to
percentage to which the data is shown
nCols
the maximum number of columns per tableplot. If this number is smaller than the number of selected columns, multiple tableplots are generated, each of which contains the sorted column(s).
scales
determines the horizontal axes of the numeric variables in colNames, options: "lin", "log", and "auto" for automatic detection. If necessary, scales is recycled.
pals
list of color palettes. Each list item is one of the following:
  • a palette name in tablePalettes, optionally with the starting color between brackets.
  • a palette vector
The
colorNA
color for missing values
numPals
name(s) of the palette(s) that is(are) used for numeric variables ("Blues", "Greys", or "Greens"). Recycled if necessary.
bias_brokenX
parameter between 0 and 1 that determines when the x-axis of a numeric variable is broken. If the minimum value is at least bias_brokenX times the maximum value, then the x-axis is broken. To turn off broken x-axes, set bias_brokenX=
IQR_bias
parameter that determines when a logarithmic scale is used when scales is set to "auto". The argument IQR_bias is multiplied by the interquartile range as a test.
select_string
character equivalent of the select argument (particularly useful for programming purposes)
subset_string
character equivalent of the subset argument (particularly useful for programming purposes)
colNames
deprecated; used in older versions of tabplot (prior to 0.12): use select_string instead
filter
deprecated; used in older versions of tabplot (prior to 0.12): use subset_string instead
plot
logical; whether to plot the tableplot
...
arguments passed to plot.tabplot
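
The bias_brokenX rule stated above (the x-axis is broken when the minimum is at least bias_brokenX times the maximum) can be written out as a small check. The helper name `is_broken_x` is hypothetical and not part of tabplot; the sketch takes the rule literally, assumes positive values, and does not claim to know whether the package compares raw values or bin means.

```r
# Hypothetical helper, not a tabplot function: literal reading of the
# documented rule "minimum >= bias_brokenX * maximum".
is_broken_x <- function(x, bias_brokenX = 0.8) {
  rng <- range(x, na.rm = TRUE)  # c(minimum, maximum)
  rng[1] >= bias_brokenX * rng[2]
}

is_broken_x(c(95, 98, 100))  # TRUE:  95 >= 0.8 * 100
is_broken_x(c(10, 50, 100))  # FALSE: 10 <  0.8 * 100
```

Under this reading, values that cluster far from zero would be drawn on a broken axis, so differences between bins remain visible.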

Value

  • tabplot-object (silent output). If multiple tableplots are generated (which can be done by either setting subset to a categorical column name, or by restricting the number of columns with nCols), then a list of tabplot-objects is silently returned.
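
As a sketch of working with the return value, the result can be captured with plot = FALSE and plotted later. The example assumes the tabplot package is installed (it is skipped otherwise), uses the built-in iris data and an arbitrary column choice, and assumes that a single result carries the class "tabplot", which is inferred from the plot.tabplot method name rather than stated in this page.

```r
# Sketch only: requires the tabplot package; skipped if it is not installed.
cols <- c("Sepal.Length", "Sepal.Width", "Species")  # illustrative choice

if (requireNamespace("tabplot", quietly = TRUE)) {
  # plot = FALSE returns the object(s) without drawing;
  # nCols = 2 is smaller than the 3 selected columns, so several
  # tableplots are expected, each containing the sorted column
  res <- tabplot::tableplot(iris, select_string = cols,
                            nCols = 2, plot = FALSE)

  # Assumption: a single result has class "tabplot"; otherwise res is a list
  tabs <- if (inherits(res, "tabplot")) list(res) else res

  # Draw each tableplot; plot() dispatches to plot.tabplot
  for (tab in tabs) plot(tab)
}
```

Passing column names through select_string rather than select keeps the call usable inside functions, where the names are held in a character vector.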

Examples

# load tabplot and the diamonds dataset from ggplot2
library(tabplot)
library(ggplot2)
data(diamonds)

# default tableplot
tableplot(diamonds)

# customized tableplot: five columns, sorted by price, showing the top 5% of rows
tableplot(diamonds, select = c(carat, cut, color, clarity, price),
          sortCol = price, from = 0, to = 5)

# subset rows with a logical expression
tableplot(diamonds, subset = price < 5000 & cut == "Premium")

# one tableplot per category of cut
tableplot(diamonds, subset = cut)
