A modification of the boxplot with information about the tails
geom_qqboxplot(
mapping = NULL,
data = NULL,
stat = "qqboxplot",
position = "dodge2",
...,
outlier.colour = NULL,
outlier.color = NULL,
outlier.fill = NULL,
outlier.shape = 19,
outlier.size = 1.5,
outlier.stroke = 0.5,
outlier.alpha = NULL,
notch = FALSE,
notchwidth = 0.5,
varwidth = FALSE,
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE
)
Returns an object of class GeomQqboxplot (inherits from Geom, ggproto)
that renders the data for the Q-Q boxplot.
Set of aesthetic mappings created by aes() or
aes_(). If specified and inherit.aes = TRUE (the
default), it is combined with the default mapping at the top level of the
plot. You must supply mapping if there is no plot mapping.
The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot
data as specified in the call to ggplot().
A data.frame, or other object, will override the plot
data. All objects will be fortified to produce a data frame. See
fortify() for which variables will be created.
A function will be called with a single argument,
the plot data. The return value must be a data.frame, and
will be used as the layer data. A function can be created
from a formula (e.g. ~ head(.x, 10)).
Specifies the stat function to use; defaults to "qqboxplot".
Position adjustment, either as a string, or the result of a call to a position adjustment function.
Other arguments passed on to layer(). These are
often aesthetics, used to set an aesthetic to a fixed value, like
colour = "red" or size = 3. They may also be parameters
to the paired geom/stat.
Default aesthetics for outliers. Set to NULL to inherit from the
aesthetics used for the box.
In the unlikely event you specify both US and UK spellings of colour, the US spelling will take precedence.
Sometimes it can be useful to hide the outliers, for example when overlaying
the raw data points on top of the boxplot. Hiding the outliers can be achieved
by setting outlier.shape = NA. Importantly, this does not remove the outliers,
it only hides them, so the range calculated for the y-axis will be the
same with outliers shown and outliers hidden.
If FALSE (default) make a standard box plot. If
TRUE, make a notched box plot. Notches are used to compare groups;
if the notches of two boxes do not overlap, this suggests that the medians
are significantly different.
For a notched box plot, width of the notch relative to
the body (defaults to notchwidth = 0.5).
If FALSE (default) make a standard box plot. If
TRUE, boxes are drawn with widths proportional to the
square-roots of the number of observations in the groups (possibly
weighted, using the weight aesthetic).
If FALSE, the default, missing values are removed with
a warning. If TRUE, missing values are silently removed.
logical. Should this layer be included in the legends?
NA, the default, includes if any aesthetics are mapped.
FALSE never includes, and TRUE always includes.
It can also be a named logical vector to finely select the aesthetics to
display.
If FALSE, overrides the default aesthetics,
rather than combining with them. This is most useful for helper functions
that define both data and aesthetics and shouldn't inherit behaviour from
the default plot specification, e.g. borders().
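The notch and varwidth arguments above follow geom_boxplot(). As a rough base R illustration, the sketch below assumes the notch rule documented for geom_boxplot() (median plus or minus 1.58 * IQR / sqrt(n)) and assumes that box widths are scaled relative to the largest group. The helper names notch_limits() and relative_widths() are hypothetical and are not part of qqboxplot or ggplot2.

```r
# Hypothetical helper: notch limits under the rule documented for
# geom_boxplot(), i.e. median +/- 1.58 * IQR / sqrt(n).
notch_limits <- function(x) {
  x <- x[!is.na(x)]
  half <- 1.58 * IQR(x) / sqrt(length(x))
  c(lower = median(x) - half, upper = median(x) + half)
}

# Hypothetical helper: box widths proportional to sqrt(n), scaled so the
# largest group has width 1 (the scaling is an assumption of this sketch).
relative_widths <- function(groups) {
  n <- table(groups)
  w <- sqrt(as.numeric(n))
  setNames(w / max(w), names(n))
}

set.seed(1)
a <- rnorm(100, mean = 0)
b <- rnorm(100, mean = 2)

lim_a <- notch_limits(a)
lim_b <- notch_limits(b)

# Two notch intervals overlap if each starts before the other ends.
notches_overlap <- lim_a[["lower"]] <= lim_b[["upper"]] &&
  lim_b[["lower"]] <= lim_a[["upper"]]

# A group with 25 observations versus one with 100 observations.
widths <- relative_widths(c(rep("small", 25), rep("large", 100)))
```

Here the two samples have well-separated medians, so notches_overlap is FALSE, which is the situation the notch description reads as evidence of different medians. The "small" group gets half the width of the "large" group because sqrt(25) / sqrt(100) = 0.5.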
The Q-Q boxplot inherits its summary statistics from the boxplot. See
geom_boxplot() for details. The Q-Q boxplot differs from the boxplot
by using more informative whiskers than the regular boxplot.
The vertical position of the whiskers is interpreted as in the regular boxplot, and the most extreme vertical value is chosen in the same way. The horizontal position of the whiskers indicates how the data set of interest deviates from a reference data set (specified as either a theoretical distribution or an actual data set). Taking the central vertical axis of the box as zero, deviations to the right indicate values larger than the corresponding data points in the reference data set, where two data points correspond if their quantiles match. Deviations to the left indicate values smaller than their corresponding data points.
For example, suppose your data set has fatter tails than the normal distribution and the reference distribution is the normal distribution. Then the whiskers below the box will lie to the left of the central axis (the left-tail values are smaller than they ought to be) and the whiskers above the box will lie to the right of the central axis (the right-tail values are larger than they ought to be).
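The fat-tails case can be checked numerically with a short base R sketch. It compares tail quantiles of a t-distributed sample with the matching normal quantiles. The median and IQR standardisation used here is an assumption made only for illustration; it is not the scaling that geom_qqboxplot() applies.

```r
set.seed(42)
y <- rt(5000, df = 4)               # heavy-tailed sample of interest

p <- c(0.01, 0.05, 0.95, 0.99)      # quantile levels in the two tails

# Put the sample on a standard-normal scale using the median and IQR.
# This is a stand-in for illustration, NOT the package's scaling method.
scale_est <- IQR(y) / (qnorm(0.75) - qnorm(0.25))
z <- (quantile(y, p, names = FALSE) - median(y)) / scale_est

# Deviation of each sample quantile from the matching reference quantile.
deviation <- z - qnorm(p)
data.frame(p = p, deviation = round(deviation, 2))
```

The deviations at the lower levels come out negative and those at the upper levels positive, which corresponds to lower whiskers drawn left of the central axis and upper whiskers drawn right of it.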
In order to compare the data set of interest to the reference data set, they must be on the same scale. The Q-Q boxplot uses Tukey's g-h distribution to determine the appropriate scaling factor.
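For orientation, the quantile function of Tukey's g-h distribution can be written in a few lines of base R. This is only the standard textbook form of the distribution, with location A, scale B, skewness g and tail weight h. How the package estimates these parameters or derives its scaling factor from them is not described here, and qgh() is a hypothetical name used for this sketch.

```r
# Quantile function of Tukey's g-h distribution (standard form):
#   Q(p) = A + B * ((exp(g * z) - 1) / g) * exp(h * z^2 / 2),  z = qnorm(p)
# In the limit g -> 0 the skewness factor reduces to z.
qgh <- function(p, A = 0, B = 1, g = 0, h = 0) {
  z <- qnorm(p)
  skew <- if (abs(g) < 1e-8) z else (exp(g * z) - 1) / g
  A + B * skew * exp(h * z^2 / 2)
}

p <- c(0.025, 0.5, 0.975)
q_normal <- qgh(p)                 # g = h = 0 reproduces the normal quantiles
q_heavy  <- qgh(p, h = 0.2)        # h > 0 stretches both tails
q_scaled <- qgh(p, A = 2, B = 3)   # A shifts the location, B rescales
```

With g = 0 and h = 0 the function returns the normal quantiles, so B plays the role of a scale parameter relative to a normal reference.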
Much of the code here is a modification of the geom_boxplot() code.
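# simulated_data and comparison_dataset are example data sets from qqboxplot
library(qqboxplot)
p <- ggplot2::ggplot(simulated_data, ggplot2::aes(factor(group,
  levels = c("normal, mean=2", "t distribution, df=32", "t distribution, df=16",
             "t distribution, df=8", "t distribution, df=4")), y = y))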
p + geom_qqboxplot()
p + geom_qqboxplot(reference_dist = "norm")
# \donttest{
p + geom_qqboxplot(compdata = comparison_dataset)
# }
# geom_qqboxplot inherits all arguments from geom_boxplot, e.g.:
p + geom_qqboxplot(notch = TRUE)
p + geom_qqboxplot(varwidth=TRUE)
# guides(color = FALSE) is deprecated in recent ggplot2 versions; use "none"
p + geom_qqboxplot(ggplot2::aes(color = group)) + ggplot2::guides(color = "none")