grouped_ggbetweenstats
Violin plots for group or condition comparisons in between-subjects designs repeated across all levels of a grouping variable.
Produces a combined plot made up of one comparison plot for each level of a grouping variable.
Usage
grouped_ggbetweenstats(data, x, y, grouping.var, title.prefix = NULL,
  plot.type = "boxviolin", type = "parametric",
  pairwise.comparisons = FALSE, pairwise.annotation = "asterisk",
  pairwise.display = "significant", p.adjust.method = "holm",
  effsize.type = "unbiased", partial = TRUE,
  effsize.noncentral = TRUE, bf.prior = 0.707, bf.message = TRUE,
  results.subtitle = TRUE, xlab = NULL, ylab = NULL,
  subtitle = NULL, stat.title = NULL, caption = NULL,
  sample.size.label = TRUE, k = 2, var.equal = FALSE,
  conf.level = 0.95, nboot = 100, tr = 0.1, sort = "none",
  sort.fun = mean, axes.range.restrict = FALSE, mean.label.size = 3,
  mean.label.fontface = "bold", mean.label.color = "black",
  notch = FALSE, notchwidth = 0.5, linetype = "solid",
  outlier.tagging = FALSE, outlier.label = NULL,
  outlier.label.color = "black", outlier.color = "black",
  outlier.shape = 19, outlier.coef = 1.5, mean.plotting = TRUE,
  mean.ci = FALSE, mean.size = 5, mean.color = "darkred",
  point.jitter.width = NULL, point.jitter.height = 0,
  point.dodge.width = 0.6, ggtheme = ggplot2::theme_bw(),
  ggstatsplot.layer = TRUE, package = "RColorBrewer",
  palette = "Dark2", direction = 1, ggplot.component = NULL,
  return = "plot", messages = TRUE, ...)
Arguments
- data
A dataframe (or a tibble) from which variables specified are to be taken. A matrix or tables will not be accepted.
- x
The grouping variable from the dataframe data.
- y
The response (a.k.a. outcome or dependent) variable from the dataframe data.
- grouping.var
A single grouping variable (can be entered either as a bare name x or as a string "x").
- title.prefix
Character string specifying the prefix text for the fixed plot title (name of each factor level) (Default: NULL). If NULL, the variable name entered for grouping.var will be used.
- plot.type
Character describing the type of plot. Currently supported plots are "box" (for pure boxplots), "violin" (for pure violin plots), and "boxviolin" (for a combination of box and violin plots; default).
- type
Type of statistic expected ("parametric", "nonparametric", "robust", or "bayes"). Corresponding abbreviations are also accepted: "p" (for parametric), "np" (nonparametric), "r" (robust), or "bf", respectively.
- pairwise.comparisons
Logical that decides whether pairwise comparisons are to be displayed (Default: FALSE). Only significant comparisons are shown by default; to change this behavior, select the appropriate option with the pairwise.display argument.
- pairwise.annotation
Character that decides the annotations to use for pairwise comparisons. Either "p.value" or "asterisk" (default).
- pairwise.display
Decides which pairwise comparisons to display. Available options are "significant" (abbreviation accepted: "s"), "non-significant" (abbreviation accepted: "ns"), or "everything"/"all". The default is "significant". Use this argument to keep the plot from becoming cluttered when many groups are compared and many pairwise comparisons would otherwise be displayed.
- p.adjust.method
Adjustment method for p-values for multiple comparisons. Possible methods are: "holm" (default), "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none".
- effsize.type
Type of effect size needed for parametric tests. The argument can be "biased" ("d" for Cohen's d for t-test; "partial_eta" for partial eta-squared for anova) or "unbiased" ("g" for Hedges' g for t-test; "partial_omega" for partial omega-squared for anova).
- partial
Logical that decides if partial eta-squared or omega-squared are returned (Default: TRUE). If FALSE, eta-squared or omega-squared will be returned. Valid only for objects of class lm, aov, anova, or aovlist.
- effsize.noncentral
Logical indicating whether to use non-central t-distributions for computing the confidence interval for Cohen's d or Hedges' g (Default: TRUE).
- bf.prior
A number between 0.5 and 2 (default 0.707), the prior width to use in calculating Bayes factors.
- bf.message
Logical that decides whether to display the Bayes Factor in favor of the null hypothesis. This argument is relevant only for parametric tests (Default: TRUE).
- results.subtitle
Decides whether the results of statistical tests are to be displayed as a subtitle (Default: TRUE). If set to FALSE, only the plot will be returned.
- xlab, ylab
Labels for x and y axis variables. If NULL (default), variable names for x and y will be used.
- subtitle
The text for the plot subtitle. Will work only if results.subtitle = FALSE.
- stat.title
A character describing the test being run, which will be added as a prefix in the subtitle. The default is NULL. An example of a stat.title argument is "Student's t-test: ".
- caption
The text for the plot caption.
- sample.size.label
Logical that decides whether sample size information should be displayed for each level of the grouping variable x (Default: TRUE).
- k
Number of digits after decimal point (should be an integer) (Default: k = 2).
- var.equal
A logical variable indicating whether to treat the variances in the samples as equal. If TRUE, then a simple F test for the equality of means in a one-way analysis of variance is performed. If FALSE, an approximate method of Welch (1951) is used, which generalizes the commonly known 2-sample Welch test to the case of arbitrarily many samples.
- conf.level
Scalar between 0 and 1. If unspecified, the default (0.95) returns 95% lower and upper confidence intervals.
- nboot
Number of bootstrap samples for computing the confidence interval for the effect size (Default: 100).
- tr
Trim level for the mean when carrying out robust tests. If you get an error stating "Standard error cannot be computed because of Winsorized variance of 0 (e.g., due to ties). Try to decrease the trimming level.", try adjusting the value of tr, which is set to 0.1 by default. Lowering the value might help.
- sort
If "ascending", x-axis variable factor levels will be sorted based on increasing values of the y-axis variable. If "descending", the opposite. If "none" (default), no sorting will happen.
- sort.fun
The function used to sort (default: mean).
- axes.range.restrict
Logical that decides whether to restrict the axes values ranges to min and max values of the axes variables (Default: FALSE), only relevant for functions where axes variables are of numeric type.
- mean.label.size, mean.label.fontface, mean.label.color
Aesthetics for the label displaying mean. Defaults: 3, "bold", "black", respectively.
- notch
A logical. If FALSE (default), a standard box plot will be displayed. If TRUE, a notched box plot will be used. Notches are used to compare groups; if the notches of two boxes do not overlap, this suggests that the medians are significantly different. In a notched box plot, the notches extend 1.58 * IQR / sqrt(n). This gives a roughly 95% confidence interval for comparing medians. IQR: Inter-Quartile Range.
- notchwidth
For a notched box plot, width of the notch relative to the body (default 0.5).
- linetype
Character strings ("blank", "solid", "dashed", "dotted", "dotdash", "longdash", and "twodash") specifying the type of line to draw box plots (Default: "solid"). Alternatively, the numbers 0 to 6 can be used (0 for "blank", 1 for "solid", etc.).
- outlier.tagging
Decides whether outliers should be tagged (Default: FALSE).
- outlier.label
Label to put on the outliers that have been tagged.
- outlier.label.color
Color for the label to put on the outliers that have been tagged (Default: "black").
- outlier.color
Default aesthetics for outliers (Default: "black").
- outlier.shape
Hiding the outliers can be achieved by setting outlier.shape = NA. Importantly, this does not remove the outliers, it only hides them, so the range calculated for the y-axis will be the same with outliers shown and outliers hidden.
- outlier.coef
Coefficient for outlier detection using Tukey's method. With Tukey's method, outliers are values that lie more than outlier.coef times the Inter-Quartile Range (IQR) below the 1st quartile or above the 3rd quartile (Default: 1.5).
- mean.plotting
Logical that decides whether mean is to be highlighted and its value to be displayed (Default: TRUE).
- mean.ci
Logical that decides whether the 95% confidence interval for the mean is to be displayed (Default: FALSE).
- mean.size
Point size for the data point corresponding to mean (Default: 5).
- mean.color
Color for the data point corresponding to mean (Default: "darkred").
- point.jitter.width
Numeric specifying the degree of jitter in x direction. Defaults to 40% of the resolution of the data.
- point.jitter.height
Numeric specifying the degree of jitter in y direction. Defaults to 0.
- point.dodge.width
Numeric specifying the amount to dodge in the x direction. Defaults to 0.60.
- ggtheme
A function, ggplot2 theme name. Default value is ggplot2::theme_bw(). Any of the ggplot2 themes, or themes from extension packages are allowed (e.g., ggthemes::theme_fivethirtyeight(), hrbrthemes::theme_ipsum_ps(), etc.).
- ggstatsplot.layer
Logical that decides whether theme_ggstatsplot theme elements are to be displayed along with the selected ggtheme (Default: TRUE).
- package
Name of package from which the palette is desired as string or symbol.
- palette
If a character string (e.g., "Set1"), will use that named palette. If a number, will index into the list of palettes of appropriate type. Default palette is "Dark2".
- direction
Either 1 or -1. If -1 the palette will be reversed.
- ggplot.component
A ggplot component to be added to the plot prepared by ggstatsplot. This argument is primarily helpful for the grouped_ variant of the current function. Default is NULL. The argument should be entered as a function. If the given function has an argument axes.range.restrict and if it has been set to TRUE, the added ggplot component might not work as expected.
- return
Character that describes what is to be returned: can be "plot" (default), "subtitle", or "caption". Setting this to "subtitle" will return the expression containing statistical results, which will be NULL if you set results.subtitle = FALSE. Setting this to "caption" will return the expression containing details about Bayes Factor analysis, but this is valid only when type = "p" and bf.message = TRUE; otherwise this will return NULL.
- messages
Decides whether messages (references, notes, and warnings) are to be displayed (Default: TRUE).
- ...
Arguments passed on to combine_plots
- title.text
String or plotmath expression to be drawn as title for the combined plot.
- title.color
Text color for title.
- title.size
Point size of title text.
- title.vjust
Vertical justification for title. Default = 0.5 (centered on y). 0 = baseline at y, 1 = ascender at y.
- title.hjust
Horizontal justification for title. Default = 0.5 (centered on x). 0 = flush-left at x, 1 = flush-right.
- title.fontface
The font face ("plain", "bold" (default), "italic", "bold.italic") for title.
- caption.text
String or plotmath expression to be drawn as the caption for the combined plot.
- caption.color
Text color for caption.
- caption.size
Point size of caption text.
- caption.vjust
Vertical justification for caption. Default = 0.5 (centered on y). 0 = baseline at y, 1 = ascender at y.
- caption.hjust
Horizontal justification for caption. Default = 0.5 (centered on x). 0 = flush-left at x, 1 = flush-right.
- caption.fontface
The font face ("plain" (default), "bold", "italic", "bold.italic") for caption.
- sub.text
The label with which the combined plot should be annotated. Can be a plotmath expression.
- sub.color
Text color for annotation label (Default: "black").
- sub.size
Point size of annotation text (Default: 12).
- sub.x
The x position of annotation label (Default: 0.5).
- sub.y
The y position of annotation label (Default: 0.5).
- sub.hjust
Horizontal justification for annotation label (Default: 0.5).
- sub.vjust
Vertical justification for annotation label (Default: 0.5).
- sub.vpadding
Vertical padding. The total vertical space added to the label, given in grid units. By default, this is added equally above and below the label. However, by changing the y and vjust parameters, this can be changed (Default: grid::unit(1, "lines")).
- sub.fontface
The font face ("plain" (default), "bold", "italic", "bold.italic") for the annotation label.
- sub.angle
Angle at which annotation label is to be drawn (Default: 0).
- sub.lineheight
Line height of annotation label.
- title.caption.rel.heights
Numerical vector of relative heights used while combining (title, plot, caption).
- title.rel.heights
Numerical vector of relative heights used while combining (title, plot).
- caption.rel.heights
Numerical vector of relative heights used while combining (plot, caption).
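To get a feel for what the p.adjust.method choices do, here is a minimal sketch using base R's stats::p.adjust(), which accepts the same method names listed for that argument. The p-values are made up for illustration, and it is an assumption (this page does not state it) that grouped_ggbetweenstats applies an equivalent adjustment to its pairwise comparisons.

```r
# Hypothetical raw p-values from four pairwise comparisons (illustration only)
p_raw <- c(0.01, 0.02, 0.03, 0.04)

# Holm (the default for p.adjust.method) versus the more conservative Bonferroni
p_holm <- stats::p.adjust(p_raw, method = "holm")
p_bonf <- stats::p.adjust(p_raw, method = "bonferroni")

# "none" leaves the p-values untouched
p_none <- stats::p.adjust(p_raw, method = "none")

# Side-by-side comparison of the raw and adjusted values
print(data.frame(raw = p_raw, holm = p_holm, bonferroni = p_bonf))
```

With pairwise.display = "significant", a stricter adjustment method can therefore reduce the number of comparisons drawn on the plot.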
Details
For parametric tests, Welch's ANOVA/t-test is used by default (i.e., var.equal = FALSE).
References:
ANOVA: Delacre, Leys, Mora, & Lakens, PsyArXiv, 2018
t-test: Delacre, Lakens, & Leys, International Review of Social Psychology, 2017
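The difference that var.equal makes can be sketched with base R's stats::oneway.test(), whose var.equal argument has the same meaning as the one documented here. The built-in iris data are used purely for illustration, and it is an assumption that the parametric ANOVA run by ggstatsplot behaves equivalently.

```r
# Welch's one-way ANOVA: group variances are not assumed equal
welch <- stats::oneway.test(Sepal.Length ~ Species, data = iris, var.equal = FALSE)

# Classic one-way F test: group variances are assumed equal
classic <- stats::oneway.test(Sepal.Length ~ Species, data = iris, var.equal = TRUE)

# The method label records which variant was run
print(welch$method)
print(classic$method)

# The two variants differ in their denominator degrees of freedom
print(rbind(welch = welch$parameter, classic = classic$parameter))
```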
If robust tests are selected, the following tests are used:
ANOVA: one-way ANOVA on trimmed means (see ?WRS2::t1way)
t-test: Yuen's test for trimmed means (see ?WRS2::yuen)
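Both robust tests compare trimmed means, with the trimming level set by tr. As a rough illustration of what trimming does, the sketch below uses base R's mean() with its trim argument on made-up data; it does not call the WRS2 functions themselves.

```r
# Made-up sample with one extreme value (illustration only)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# The ordinary mean is pulled toward the extreme value
plain_mean <- mean(x)

# 10% trimmed mean: drops the lowest and highest 10% of observations
# (here, one value from each end) before averaging
trimmed_mean <- mean(x, trim = 0.1)

print(c(plain = plain_mean, trimmed = trimmed_mean))
```

A lower tr discards fewer observations from each tail, which is why lowering it can help when the Winsorized variance would otherwise be 0.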
For more about how the effect size measures (for nonparametric tests) and their confidence intervals are computed, see ?rcompanion::wilcoxonR.
For repeated measures designs, use ggwithinstats.
References
https://indrajeetpatil.github.io/ggstatsplot/articles/web_only/ggbetweenstats.html
See Also
Examples
# NOT RUN {
# to get reproducible results from bootstrapping
set.seed(123)
# the most basic function call
ggstatsplot::grouped_ggbetweenstats(
  data = dplyr::filter(ggplot2::mpg, drv != "4"),
  x = year,
  y = hwy,
  grouping.var = drv,
  conf.level = 0.99
)
# }
# NOT RUN {
# modifying individual plots using `ggplot.component` argument
ggstatsplot::grouped_ggbetweenstats(
  data = dplyr::filter(
    ggstatsplot::movies_long,
    genre %in% c("Action", "Comedy"),
    mpaa %in% c("R", "PG")
  ),
  x = genre,
  y = rating,
  grouping.var = mpaa,
  results.subtitle = FALSE,
  ggplot.component = ggplot2::scale_y_continuous(breaks = seq(1, 9, 1)),
  messages = FALSE
)
# }