# geom_boxplot

##### Box and whiskers plot.

The upper and lower "hinges" correspond to the first and
third quartiles (the 25th and 75th percentiles). This
differs slightly from the method used by the `boxplot`
function, and may be apparent with small samples. See
`boxplot.stats` for more information on how hinge
positions are calculated for `boxplot`.
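The difference between the two definitions can be seen on a small sample. The sketch below is illustrative only: it assumes the quartiles are obtained the way base R's `quantile()` computes them by default, and compares them with the hinges reported by `boxplot.stats`. The sample vector `x` is made up for the example.

```r
# Small made-up sample with an even number of observations,
# where the two definitions disagree
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 20)

# First and third quartiles from quantile() (default type = 7);
# assumed here to stand in for the hinges described above
quartiles <- unname(quantile(x, c(0.25, 0.75)))

# Hinges as used by boxplot(): elements 2 and 4 of the $stats vector
hinges <- boxplot.stats(x)$stats[c(2, 4)]

print(quartiles)
print(hinges)
```

With larger samples the two pairs of values tend to move closer together, which is why the discrepancy is mostly visible for small groups.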

##### Usage

```
geom_boxplot(mapping = NULL, data = NULL,
stat = "boxplot", position = "dodge",
outlier.colour = "black", outlier.shape = 16,
outlier.size = 2, notch = FALSE, notchwidth = 0.5, ...)
```

##### Arguments

- outlier.colour
- colour for outlying points
- outlier.shape
- shape of outlying points
- outlier.size
- size of outlying points
- notch
- if `FALSE` (default) make a standard box plot. If `TRUE`, make a notched box plot. Notches are used to compare groups; if the notches of two boxes do not overlap, this is strong evidence that the medians differ.
- notchwidth
- for a notched box plot, width of the notch relative to the body (default 0.5)
- mapping
- The aesthetic mapping, usually constructed with `aes` or `aes_string`. Only needs to be set at the layer level if you are overriding the plot defaults.
- data
- A layer-specific dataset; only needed if you want to override the plot defaults.
- stat
- The statistical transformation to use on the data for this layer.
- position
- The position adjustment to use for overlapping points on this layer.
- ...
- other arguments passed on to `layer`. This can include aesthetics whose values you want to set, not map. See `layer` for more details.

##### Details

The upper whisker extends from the hinge to the highest value that is within 1.5 * IQR of the hinge, where IQR is the inter-quartile range, or distance between the first and third quartiles. The lower whisker extends from the hinge to the lowest value within 1.5 * IQR of the hinge. Data beyond the end of the whiskers are outliers and plotted as points (as specified by Tukey).
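The whisker rule can be traced by hand in base R. This is a minimal sketch, not the package's own code: it assumes the hinges are taken from `quantile()` with its default settings, and the sample vector `x` is invented for the illustration.

```r
# Made-up sample with one value far above the rest
x <- c(2, 4, 4, 5, 6, 7, 8, 9, 10, 30)

# Hinges taken as the first and third quartiles (assumption: default quantile())
q <- unname(quantile(x, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Limits lying 1.5 * IQR beyond each hinge
lower_limit <- q[1] - 1.5 * iqr
upper_limit <- q[2] + 1.5 * iqr

# Whiskers end at the most extreme observations inside those limits
inside <- x >= lower_limit & x <= upper_limit
whisker_low  <- min(x[inside])
whisker_high <- max(x[inside])

# Observations beyond the whiskers are the ones plotted as points
outliers <- x[!inside]

print(c(whisker_low, whisker_high))
print(outliers)
```

Note that the whiskers stop at actual data values, not at the limits themselves, so a whisker is usually shorter than 1.5 * IQR.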

In a notched box plot, the notches extend `1.58 * IQR / sqrt(n)`. This gives a roughly 95% interval for comparing medians. See McGill et al. (1978) for more details.
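As a rough illustration of the notch formula, the sketch below computes the notch limits for two made-up groups and checks whether they overlap. The helper `notch_interval()` is hypothetical and not part of ggplot2; it assumes the notch is centred on the median and that the IQR comes from `quantile()` with default settings.

```r
# Hypothetical helper: notch limits as median -/+ 1.58 * IQR / sqrt(n)
notch_interval <- function(x) {
  q <- unname(quantile(x, c(0.25, 0.5, 0.75)))
  half_width <- 1.58 * (q[3] - q[1]) / sqrt(length(x))
  c(lower = q[2] - half_width, upper = q[2] + half_width)
}

# Two made-up groups; b is a shifted by 10
a <- 1:16
b <- 11:26

notch_a <- notch_interval(a)
notch_b <- notch_interval(b)

# Notches overlap when each interval starts before the other one ends
overlap <- notch_a[["lower"]] <= notch_b[["upper"]] &&
  notch_b[["lower"]] <= notch_a[["upper"]]

print(notch_a)
print(notch_b)
print(overlap)
```

Here the two intervals do not overlap, which is the situation the `notch` argument describes as strong evidence that the medians differ.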

##### Aesthetics

##### Examples

```
library(ggplot2)

# Base plot assumed by the examples below, matching the qplot() equivalents
p <- ggplot(mtcars, aes(factor(cyl), mpg))

p + geom_boxplot()
qplot(factor(cyl), mpg, data = mtcars, geom = "boxplot")

p + geom_boxplot() + geom_jitter()
p + geom_boxplot() + coord_flip()
qplot(factor(cyl), mpg, data = mtcars, geom = "boxplot") +
  coord_flip()

p + geom_boxplot(notch = TRUE)
p + geom_boxplot(notch = TRUE, notchwidth = .3)

p + geom_boxplot(outlier.colour = "green", outlier.size = 3)

# Add aesthetic mappings
# Note that boxplots are automatically dodged when any aesthetic is
# a factor
p + geom_boxplot(aes(fill = cyl))
p + geom_boxplot(aes(fill = factor(cyl)))
p + geom_boxplot(aes(fill = factor(vs)))
p + geom_boxplot(aes(fill = factor(am)))

# Set aesthetics to fixed value
p + geom_boxplot(fill = "grey80", colour = "#3366FF")
qplot(factor(cyl), mpg, data = mtcars, geom = "boxplot",
  colour = I("#3366FF"))

# Scales vs. coordinate transforms -------
# Scale transformations occur before the boxplot statistics are computed.
# Coordinate transformations occur afterwards. Observe the effect on the
# number of outliers.
library(plyr) # to access round_any
m <- ggplot(movies, aes(y = votes, x = rating,
  group = round_any(rating, 0.5)))
m + geom_boxplot()
m + geom_boxplot() + scale_y_log10()
m + geom_boxplot() + coord_trans(y = "log10")
m + geom_boxplot() + scale_y_log10() + coord_trans(y = "log10")

# Boxplots with continuous x:
# Use the group aesthetic to group observations in boxplots
qplot(year, budget, data = movies, geom = "boxplot")
qplot(year, budget, data = movies, geom = "boxplot",
  group = round_any(year, 10, floor))

# Using precomputed statistics
# generate sample data
abc <- adply(matrix(rnorm(100), ncol = 5), 2, quantile,
  c(0, .25, .5, .75, 1))
b <- ggplot(abc, aes(x = X1, ymin = `0%`, lower = `25%`,
  middle = `50%`, upper = `75%`, ymax = `100%`))
b + geom_boxplot(stat = "identity")
b + geom_boxplot(stat = "identity") + coord_flip()
b + geom_boxplot(aes(fill = X1), stat = "identity")
```

##### See Also

`stat_quantile` to view quantiles conditioned on a continuous variable, `geom_jitter` for another way to look at conditional distributions

*Documentation reproduced from package ggplot2, version 0.9.3.1, License: GPL-2*