toaster (version 0.5.5)

createBoxplot: Create box plot.

Description

Create a box plot visualization from quartiles calculated with computePercentiles. In the simplest case, without x, a single box plot is drawn from a single set of percentiles. To draw several box plots side by side, or to split one or more box plots into facets, use the parameters x and/or facet.

Usage

createBoxplot(data, x = NULL, fill = x, value = "value", useIQR = FALSE,
  facet = NULL, ncol = 1, facetScales = "fixed",
  paletteValues = NULL, palette = "Set1",
  title = paste("Boxplots", ifelse(is.null(x), NULL, paste("by", x))),
  subtitle = NULL, xlab = x, ylab = NULL,
  legendPosition = "right", fillGuide = "legend", coordFlip = FALSE,
  baseSize = 12, baseFamily = "sans",
  defaultTheme = theme_tufte(base_size = baseSize, base_family = baseFamily),
  themeExtra = NULL)

Arguments

data
quartiles precomputed with computePercentiles
x
name of the column holding the primary grouping variable. Multiple box plots are placed along the x-axis, one per value of x. Each value of x must have its own set of percentiles calculated.
fill
name of a column with values to colour box plots
value
column name with percentile value. Usually default 'value' with exception of temporal percentiles that should use 'epoch' value.
useIQR
logical; if TRUE, use the interquartile range (IQR = Q3 - Q1) to compute the lower and upper cutoff bounds: [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]. If FALSE, use the minimum and maximum as bounds (all values).
facet
vector of 1 or 2 column names used to split the data into subsets that are plotted as facets. With a single name, the subset plots are placed next to each other, wrapping after ncol columns (uses facet_wrap). With two names, the subset plots vary in both the horizontal and vertical directions (a grid) based on the column values (uses facet_grid).
ncol
number of facet columns (applies only when a single facet column is supplied - see parameter facet).
facetScales
whether scales are shared across all subset plots (facets): "fixed" - all are the same (default), "free_x" - vary across rows (x axis), "free_y" - vary across columns (y axis), "free" - both rows and columns (see parameter scales in facet_wrap).
paletteValues
actual palette colours for use with scale_fill_manual (if specified then parameter palette is ignored)
palette
Brewer palette name - see display.brewer.all in RColorBrewer package for names
title
plot title.
subtitle
plot subtitle.
xlab
a label for the x axis, defaults to a description of x.
ylab
a label for the y axis, defaults to a description of y.
legendPosition
the position of legends. ("left", "right", "bottom", "top", or two-element numeric vector). "none" is no legend.
fillGuide
Name of guide object, or object itself for the fill (when present). Typically "legend" name or object guide_legend.
coordFlip
logical; if TRUE, flip the Cartesian coordinates so that horizontal becomes vertical and vertical becomes horizontal (see coord_flip).
baseSize
theme base font size
baseFamily
theme base font family
defaultTheme
plot theme settings with default value theme_tufte. More themes are available in ggtheme (by ggplot2) and in the ggthemes package.
themeExtra
any additional theme settings that override default theme.
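
The cutoff bounds described for useIQR can be worked through by hand. The following is a minimal base R sketch with made-up quartiles; the variable names are illustrative and this is not toaster's internal code. How createBoxplot maps these cutoffs onto the drawn whiskers is not stated on this page.

```r
# Hypothetical five-number summary for one set of percentiles
# (illustrative values, not output of computePercentiles).
q <- c(min = 2, q1 = 10, median = 14, q3 = 20, max = 60)

# Interquartile range: IQR = Q3 - Q1
iqr <- q[["q3"]] - q[["q1"]]

# Bounds in the spirit of useIQR = TRUE: [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR]
lower_cut <- q[["q1"]] - 1.5 * iqr
upper_cut <- q[["q3"]] + 1.5 * iqr

# Bounds in the spirit of useIQR = FALSE: the full range of values
full_range <- c(lower = q[["min"]], upper = q[["max"]])

print(c(lower_cut = lower_cut, upper_cut = upper_cut))
print(full_range)
```

With these numbers the maximum lies above the upper cutoff, which is the situation where the two settings would produce visibly different plots.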

Value

ggplot object

Details

Multiple box plots: x is the name of a variable in which each value corresponds to one set of percentiles. The box plots are placed along the x-axis. To produce suitable data, call computePercentiles with parameter by set to the same column name that is then passed as x.
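
To illustrate what "one set of percentiles per value of x" means, here is a base R sketch that computes percentiles per group on a small in-memory sample. The data and the column names of the result (lgid, percentile, value) are assumptions for illustration; the exact layout returned by computePercentiles may differ.

```r
# Hypothetical sample standing in for a database table (not the Lahman data).
pitching <- data.frame(
  lgid   = rep(c("AL", "NL"), each = 5),
  ipouts = c(10, 20, 30, 40, 50, 5, 15, 25, 35, 45)
)
probs <- c(0, 0.25, 0.5, 0.75, 1)

# One set of percentiles for each value of the grouping column.
per_group <- lapply(split(pitching$ipouts, pitching$lgid),
                    quantile, probs = probs, names = FALSE)

# Long format: one row per (group, percentile) pair.
long <- data.frame(
  lgid       = rep(names(per_group), each = length(probs)),
  percentile = rep(probs * 100, times = length(per_group)),
  value      = unlist(per_group, use.names = FALSE)
)
print(long)
```

Every league in the result has all five percentiles, which is the condition the x parameter relies on.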

Facets: the facet vector contains one or two names of variables in which each combination of values corresponds to one set of percentiles. The box plot(s) are placed inside separate sections of the plot (facets). Facets are supported both for a single box plot (without variable x) and for multiple box plots (with x).

Usually, when multiple percentile sets vary along a single variable, pass that variable as x and add facets on top for any further variables. The exception is when the scale of the percentile values differs from one box plot to the next. In that case omit x and use facet with facetScales='free_y'.
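
The free-scale case can be sketched directly in ggplot2, which draws box plots from precomputed statistics via geom_boxplot(stat = "identity"). This is only an illustration of the idea under assumed data and column names; it is not how createBoxplot is implemented, and it requires the ggplot2 package.

```r
library(ggplot2)

# Hypothetical precomputed five-number summaries for two measures on very
# different scales (illustrative values and column names).
stats <- data.frame(
  measure = c("ipouts", "era"),
  ymin    = c(0, 0),
  lower   = c(50, 3.2),
  middle  = c(170, 4.1),
  upper   = c(400, 5.6),
  ymax    = c(1100, 27)
)

# No grouping on the x-axis: each measure gets its own facet with its own
# y scale, so the small-valued box plot is not flattened by the large one.
p <- ggplot(stats, aes(x = "", ymin = ymin, lower = lower, middle = middle,
                       upper = upper, ymax = ymax)) +
  geom_boxplot(stat = "identity") +
  facet_wrap(~ measure, ncol = 2, scales = "free_y") +
  labs(x = NULL, y = NULL)
```

Printing p draws the plot. Changing scales to "fixed" shows why a shared y-axis is a poor fit when the value ranges differ this much.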

See Also

computePercentiles for computing boxplot quartiles

Examples

if(interactive()){
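# load RODBC (odbcDriverConnect) and toaster (computePercentiles, createBoxplot)
library(RODBC)
library(toaster)

# initialize connection to Lahman baseball database in Aster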
conn = odbcDriverConnect(connection="driver={Aster ODBC Driver};
                         server=<dbhost>;port=2406;database=<dbname>;uid=<user>;pwd=<pw>")

# single boxplot of pitching ipouts (no filter applied, so all leagues and years)
ipop = computePercentiles(conn, "pitching", columns="ipouts")
createBoxplot(ipop)
                          
# boxplots by the league of pitching ipouts
ipopLg = computePercentiles(conn, "pitching", columns="ipouts", by="lgid")
createBoxplot(ipopLg, x="lgid")

# boxplots by the league with facet yearid of pitching ipouts in 2010s
ipopLgYear = computePercentiles(conn, "pitching", columns="ipouts", by=c("lgid", "yearid"),
                                where = "yearid >= 2010")
createBoxplot(ipopLgYear, x="lgid", facet="yearid", ncol=3)

# boxplot with facets only
bapLgDec = computePercentiles(conn, "pitching_enh", columns="era", by=c("lgid", "decadeid"),
                              where = "lgid in ('AL','NL')")
createBoxplot(bapLgDec, facet=c("lgid", "decadeid"))
}