compare_means: Comparison of Means

Description

Performs one or multiple mean comparisons.

Usage

compare_means(
  formula,
  data,
  method = "wilcox.test",
  paired = FALSE,
  group.by = NULL,
  ref.group = NULL,
  symnum.args = list(),
  p.adjust.method = "holm",
  ...
)

Value

return a data frame with the following columns:

.y.: the y variable used in the test.
group1,group2: the compared groups in the pairwise tests. Available only when method = "t.test" or method = "wilcox.test".
p: the p-value.
p.adj: the adjusted p-value. Default for p.adjust.method = "holm".
p.format: the formatted p-value.
p.signif: the significance level.
method: the statistical test used to compare groups.

Arguments

formula

a formula of the form x ~ group where x is a numeric variable giving the data values and group is a factor with one or multiple levels giving the corresponding groups. For example, formula = TP53 ~ cancer_group.

It's also possible to perform the test for multiple response variables at the same time. For example, formula = c(TP53, PTEN) ~ cancer_group.

data

a data.frame containing the variables in the formula.

method

the type of test. Default is wilcox.test. Allowed values include:

t.test (parametric) and wilcox.test (non-parametric). Perform comparison between two groups of samples. If the grouping variable contains more than two levels, then a pairwise comparison is performed.
anova (parametric) and kruskal.test (non-parametric). Perform one-way ANOVA test comparing multiple groups.

paired

a logical indicating whether you want a paired test. Used only in t.test and in wilcox.test.

group.by

a character vector containing the name of grouping variables.

ref.group

a character string specifying the reference group. If specified, for a given grouping variable, each of the group levels will be compared to the reference group (i.e. control group).

ref.group can be also ".all.". In this case, each of the grouping variable levels is compared to all (i.e. basemean).

symnum.args

a list of arguments to pass to the function symnum for symbolic number coding of p-values. For example, symnum.args <- list(cutpoints = c(0, 0.0001, 0.001, 0.01, 0.05, 1), symbols = c("****", "***", "**", "*", "ns")).

In other words, we use the following convention for symbols indicating statistical significance:

ns: p > 0.05
*: p <= 0.05
**: p <= 0.01
***: p <= 0.001
****: p <= 0.0001

p.adjust.method

method for adjusting p values (see p.adjust). Has impact only in a situation, where multiple pairwise tests are performed; or when there are multiple grouping variables. Allowed values include "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". If you don't want to adjust the p value (not recommended), use p.adjust.method = "none".

Note that, when the formula contains multiple variables, the p-value adjustment is done independently for each variable.

...

Other arguments to be passed to the test function.

Examples

Run this code

# Load data
#:::::::::::::::::::::::::::::::::::::::
data("ToothGrowth")
df <- ToothGrowth

# One-sample test
#:::::::::::::::::::::::::::::::::::::::::
compare_means(len ~ 1, df, mu = 0)

# Two-samples unpaired test
#:::::::::::::::::::::::::::::::::::::::::
compare_means(len ~ supp, df)

# Two-samples paired test
#:::::::::::::::::::::::::::::::::::::::::
compare_means(len ~ supp, df, paired = TRUE)

# Compare supp levels after grouping the data by "dose"
#::::::::::::::::::::::::::::::::::::::::
compare_means(len ~ supp, df, group.by = "dose")

# pairwise comparisons
#::::::::::::::::::::::::::::::::::::::::
# As dose contains more thant two levels ==>
# pairwise test is automatically performed.
compare_means(len ~ dose, df)

# Comparison against reference group
#::::::::::::::::::::::::::::::::::::::::
compare_means(len ~ dose, df, ref.group = "0.5")

# Comparison against all
#::::::::::::::::::::::::::::::::::::::::
compare_means(len ~ dose, df, ref.group = ".all.")

# Anova and kruskal.test
#::::::::::::::::::::::::::::::::::::::::
compare_means(len ~ dose, df, method = "anova")
compare_means(len ~ dose, df, method = "kruskal.test")

Run the code above in your browser using DataLab