boxplot function, and may be apparent with small samples.
See boxplot.stats for more information on how hinge positions are calculated for boxplot.
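As a rough illustration of how hinge conventions can disagree on a small sample, the sketch below compares the hinges reported by boxplot.stats with the first and third quartiles returned by quantile() under its default settings. The six-value vector x is made up for illustration, and treating quantile()'s defaults as the alternative convention is an assumption of this sketch.

```r
# Illustrative data only: six evenly spaced values.
x <- c(1, 2, 3, 4, 5, 6)

# Hinges as used by base boxplot(): elements 2 and 4 of $stats.
hinges <- boxplot.stats(x)$stats[c(2, 4)]

# First and third quartiles from quantile() with its default type.
quartiles <- unname(quantile(x, c(0.25, 0.75)))

print(hinges)     # 2 5
print(quartiles)  # 2.25 4.75
```

With larger samples the two conventions usually land much closer together.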
geom_boxplot(mapping = NULL, data = NULL, stat = "boxplot", position = "dodge", ..., outlier.colour = NULL, outlier.color = NULL, outlier.shape = 19, outlier.size = 1.5, outlier.stroke = 0.5, notch = FALSE, notchwidth = 0.5, varwidth = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
stat_boxplot(mapping = NULL, data = NULL, geom = "boxplot", position = "dodge", ..., coef = 1.5, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
If NULL, the default, the data is inherited from the plot data as specified in the call to ggplot.
A data.frame, or other object, will override the plot data. All objects will be fortified to produce a data frame. See fortify for which variables will be created.
A function will be called with a single argument, the plot data. The return value must be a data.frame, and will be used as the layer data.
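The function form of the data argument can be sketched as follows. The helper drop_two_seaters is a hypothetical name introduced here for illustration; it receives the plot data (mpg) and returns the subset that the boxplot layer should use.

```r
library(ggplot2)

# Hypothetical helper, not part of ggplot2: takes the plot data and
# returns a data frame without the "2seater" class.
drop_two_seaters <- function(d) d[d$class != "2seater", ]

# The plot data is mpg; the layer gets whatever the function returns.
p <- ggplot(mpg, aes(class, hwy)) +
  geom_boxplot(data = drop_two_seaters)

p
```

Because the filtering happens per layer, other layers added to p would still see the full plot data unless they supply their own data argument.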
layer. These are often aesthetics, used to set an aesthetic to a fixed value, like color = "red" or size = 3. They may also be parameters to the paired geom/stat.
NULL to inherit from the aesthetics used for the box.
In the unlikely event you specify both US and UK spellings of colour, the US spelling will take precedence.
If FALSE (default), make a standard box plot. If TRUE, make a notched box plot. Notches are used to compare groups; if the notches of two boxes do not overlap, this suggests that the medians are significantly different.
If FALSE (default), make a standard box plot. If TRUE, boxes are drawn with widths proportional to the square roots of the number of observations in the groups (possibly weighted, using the weight aesthetic).
If FALSE (the default), removes missing values with a warning. If TRUE, silently removes missing values.
NA, the default, includes if any aesthetics are mapped. FALSE never includes, and TRUE always includes.
If FALSE, overrides the default aesthetics, rather than combining with them. This is most useful for helper functions that define both data and aesthetics and shouldn't inherit behaviour from the default plot specification, e.g. borders.
geom_boxplot and stat_boxplot.
geom_boxplot understands the following aesthetics (required aesthetics are in bold):
lower
middle
upper
x
ymax
ymin
alpha
colour
fill
linetype
shape
size
weight
In a notched box plot, the notches extend 1.58 * IQR / sqrt(n). This gives a roughly 95% confidence interval for comparing medians. See McGill et al. (1978) for more details.
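The notch formula can be checked by hand with base R. This is a minimal sketch on simulated data: it uses the hinge-based IQR from boxplot.stats, which is an assumption of the example, and a quartile-based IQR may give slightly different limits on small samples.

```r
# Illustrative sample; any numeric vector would do.
set.seed(1)
y <- rnorm(50)

stats <- boxplot.stats(y)
n <- length(y)

# Hinge-based IQR, taken from the five-number summary in $stats.
iqr <- stats$stats[4] - stats$stats[2]

# Notch limits: median -/+ 1.58 * IQR / sqrt(n).
notch <- stats$stats[3] + c(-1, 1) * 1.58 * iqr / sqrt(n)

print(notch)
print(stats$conf)  # boxplot.stats reports the same interval
```

Two groups whose notch intervals do not overlap are the case described above as suggesting different medians.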
stat_quantile to view quantiles conditioned on a continuous variable, geom_jitter for another way to look at conditional distributions.
p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot()
p + geom_boxplot() + geom_jitter(width = 0.2)
p + geom_boxplot() + coord_flip()

p + geom_boxplot(notch = TRUE)
p + geom_boxplot(varwidth = TRUE)
p + geom_boxplot(fill = "white", colour = "#3366FF")
# By default, outlier points match the colour of the box. Use
# outlier.colour to override
p + geom_boxplot(outlier.colour = "red", outlier.shape = 1)

# Boxplots are automatically dodged when any aesthetic is a factor
p + geom_boxplot(aes(colour = drv))

# You can also use boxplots with continuous x, as long as you supply
# a grouping variable. cut_width is particularly useful
ggplot(diamonds, aes(carat, price)) +
  geom_boxplot()
ggplot(diamonds, aes(carat, price)) +
  geom_boxplot(aes(group = cut_width(carat, 0.25)))

# It's possible to draw a boxplot with your own computations if you
# use stat = "identity":
y <- rnorm(100)
df <- data.frame(
  x = 1,
  y0 = min(y),
  y25 = quantile(y, 0.25),
  y50 = median(y),
  y75 = quantile(y, 0.75),
  y100 = max(y)
)
ggplot(df, aes(x)) +
  geom_boxplot(
    aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
    stat = "identity"
  )
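The stat = "identity" approach also extends to several groups. The sketch below is one possible way to do it, not the package's own method: it precomputes one row of summary statistics per class with base R's boxplot.stats and draws the boxes from those values. The column names y0 through y100 mirror the single-group example, and the object names groups, df_pre and p_pre are hypothetical. Outlier points are not drawn in this sketch.

```r
library(ggplot2)

# Split highway mileage by vehicle class.
groups <- split(mpg$hwy, mpg$class)

# One row per class. boxplot.stats()$stats returns, in order:
# lower whisker, lower hinge, median, upper hinge, upper whisker.
df_pre <- do.call(rbind, lapply(names(groups), function(g) {
  s <- boxplot.stats(groups[[g]])$stats
  data.frame(class = g, y0 = s[1], y25 = s[2], y50 = s[3],
             y75 = s[4], y100 = s[5])
}))

# Draw the boxes directly from the precomputed values.
p_pre <- ggplot(df_pre, aes(class)) +
  geom_boxplot(
    aes(ymin = y0, lower = y25, middle = y50, upper = y75, ymax = y100),
    stat = "identity"
  )
p_pre
```

Precomputing the statistics this way can be useful when the raw data is too large to ship with the plot, at the cost of losing the individual outlier points.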