Compares groups by (1) creating a histogram by group; (2) summarizing descriptive statistics by group; and (3) conducting pairwise comparisons (t-tests and Mann-Whitney tests).
compare_groups(
data = NULL,
iv_name = NULL,
dv_name = NULL,
sigfigs = 3,
stats = "basic",
welch = TRUE,
cohen_d = TRUE,
cohen_d_w_ci = TRUE,
adjust_p = "holm",
bonferroni = NULL,
mann_whitney = TRUE,
t_test_stats = TRUE,
round_p = 3,
anova = FALSE,
round_f = 2,
round_t = 2,
round_t_test_df = 2,
save_as_png = FALSE,
png_name = NULL,
xlab = NULL,
ylab = NULL,
x_limits = NULL,
x_breaks = NULL,
x_labels = NULL,
width = 5000,
height = 3600,
units = "px",
res = 300,
layout_matrix = NULL,
col_names_nicer = TRUE,
convert_dv_to_numeric = TRUE
)
The output will be a list of (1) a ggplot object
(histogram by group); (2) a data.table with descriptive statistics by
group; and (3) a data.table with pairwise comparison results.
If save_as_png = TRUE, the plot and tables will also be saved
on the local drive as a PNG file.
a data object (a data frame or a data.table)
name of the independent variable (grouping variable)
name of the dependent variable (measure variable of interest)
number of significant digits to round to
statistics to calculate for each group.
If stats = "basic", group size, mean, standard deviation, median,
minimum, and maximum will be calculated. If stats = "all", in addition
to the aforementioned statistics, standard error, 95% confidence and
prediction intervals, skewness, and kurtosis will also be calculated.
The stats argument can also be a character vector with types of
statistics to calculate. For example, entering
stats = c("mean", "median") will calculate mean and median.
By default, stats = "basic"
Should Welch's t-tests be conducted?
By default, welch = TRUE
if cohen_d = TRUE, Cohen's d statistics will be
included in the pairwise comparison data.table.
if cohen_d_w_ci = TRUE, Cohen's d with 95% CI will be
included in the output data.table.
the name of the method to use to adjust p-values.
If adjust_p = "holm", the Holm method will be used;
if adjust_p = "bonferroni", the Bonferroni method will be used.
By default, adjust_p = "holm"
The use of this argument is deprecated.
Use the 'adjust_p' argument instead.
If bonferroni = TRUE, the Bonferroni correction will be
applied to the t-tests or Mann-Whitney tests.
if TRUE, Mann-Whitney test results will be
included in the pairwise comparison data.table.
If FALSE, Mann-Whitney tests will not be performed.
if t_test_stats = FALSE, the t-test statistic
and degrees of freedom will be excluded from the pairwise comparison
data.table. (default = TRUE)
number of decimal places to which to round p-values (default = 3)
Should a one-way ANOVA be conducted and reported?
By default, anova = FALSE, but when there are more than two
levels in the independent variable, the value will change such that
anova = TRUE.
number of decimal places to which to round the F statistic (default = 2)
number of decimal places to which to round the t statistic (default = 2)
number of decimal places to which to round the degrees of freedom for t tests (default = 2)
if save_as_png = "all" or if save_as_png = TRUE,
the histogram by group, descriptive statistics by group,
and pairwise comparison results will be saved as a PNG file.
name of the PNG file to be saved. By default, the name will be "compare_groups_results_" followed by a timestamp of the current time. The timestamp will be in the format, jan_01_2021_1300_10_000001, where "jan_01_2021" would indicate January 01, 2021; 1300 would indicate 13:00 (i.e., 1 PM); and 10_000001 would indicate 10.000001 seconds after the hour.
title of the x-axis for the histogram by group.
If xlab = FALSE, the title will be removed. By default
(i.e., if no input is given), dv_name will be used as
the title.
title of the y-axis for the histogram by group.
If ylab = FALSE, the title will be removed. By default
(i.e., if no input is given), iv_name will be used as
the title.
a numeric vector with values of the endpoints of the x axis.
a numeric vector indicating the points at which to place tick marks on the x axis.
a vector containing labels for the tick marks on the x axis.
width of the PNG file (default = 5000)
height of the PNG file (default = 3600)
the units for the width and height arguments.
Can be "px" (pixels), "in" (inches), "cm", or "mm".
By default, units = "px".
The nominal resolution in ppi which will be recorded
in the png file, if a positive integer. Used for units
other than the default. By default, res = 300
The layout argument for arranging plots and tables
using the grid.arrange function.
if col_names_nicer = TRUE, column names
will be converted from snake_case to an easier-to-read format.
logical. Should the values in the dependent variable be converted to numeric for plotting the histograms? (default = TRUE)
if holm = TRUE, the relevant p values will be
adjusted using the Holm method (also known as the Holm-Bonferroni or
Bonferroni-Holm method)
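The pairwise comparison options described above (welch, mann_whitney, adjust_p) can be illustrated with a minimal base R sketch. This is not the package's internal code; it only assumes the standard stats functions t.test(), wilcox.test(), and p.adjust(), and the column names in the result are made up for illustration.

```r
# Base R sketch (an illustration, not the internals of compare_groups):
# pairwise comparisons of Sepal.Length across the three iris species.
groups <- split(iris$Sepal.Length, iris$Species)
pairs <- combn(names(groups), 2, simplify = FALSE)

results <- do.call(rbind, lapply(pairs, function(p) {
  x <- groups[[p[1]]]
  y <- groups[[p[2]]]
  # var.equal = FALSE gives Welch's t-test; var.equal = TRUE would give
  # Student's t-test (compare the welch argument)
  welch <- t.test(x, y, var.equal = FALSE)
  # wilcox.test() on two independent samples is the Mann-Whitney test;
  # warnings about ties are suppressed for this illustration
  mw <- suppressWarnings(wilcox.test(x, y))
  data.frame(
    group_1 = p[1],
    group_2 = p[2],
    t = unname(welch$statistic),
    df = unname(welch$parameter),
    t_test_p = welch$p.value,
    mann_whitney_p = mw$p.value
  )
}))

# Adjust the t-test p-values for multiple comparisons
# (compare adjust_p = "holm" and adjust_p = "bonferroni")
results$holm_adjusted_p <- p.adjust(results$t_test_p, method = "holm")
results$bonferroni_adjusted_p <-
  p.adjust(results$t_test_p, method = "bonferroni")
print(results)
```

Holm-adjusted p-values are never larger than Bonferroni-adjusted ones, which is one reason a Holm default is a reasonable choice when all pairwise comparisons are reported.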
if (FALSE) {
  compare_groups(data = iris, iv_name = "Species", dv_name = "Sepal.Length")
  compare_groups(
    data = iris, iv_name = "Species", dv_name = "Sepal.Length",
    x_breaks = 4:8)
  # Welch's t-test
  compare_groups(
    data = mtcars, iv_name = "am", dv_name = "hp")
  # A Student's t-test
  compare_groups(
    data = mtcars, iv_name = "am", dv_name = "hp", welch = FALSE)
}
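For readers who want to see what the stats = "basic" summary amounts to, the following is a rough base R approximation. It is a sketch under assumptions: the helper name basic_stats is hypothetical, and the layout of the result will differ from the data.table that compare_groups returns.

```r
# Base R sketch (an illustration, not the internals of compare_groups):
# the "basic" statistics (group size, mean, standard deviation, median,
# minimum, and maximum) computed for each group.
basic_stats <- function(x) {  # hypothetical helper, not part of the package
  c(
    n = length(x), mean = mean(x), sd = sd(x),
    median = median(x), min = min(x), max = max(x)
  )
}

# One row per species, one column per statistic
by_group <- t(sapply(split(iris$Sepal.Length, iris$Species), basic_stats))

# Round to 3 significant digits, mirroring sigfigs = 3
print(signif(by_group, 3))
```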