dabest prepares a tidy dataset for analysis using estimation statistics.
dabest(.data, x, y, idx, paired = FALSE, id.column = NULL)
A dabest object with 8 elements.
data
The dataset passed to dabest, stored here as a tibble.
x and y
The columns in data used to plot the x and y axes, respectively, as supplied to dabest. These are quoted variables for tidy evaluation during the computation of effect sizes.
idx
The vector of control-test groupings. For each pair in idx, an effect size will be computed by the downstream dabestr functions that compute effect sizes (such as mean_diff()).
is.paired
Whether or not the experiment consists of paired (aka repeated) observations.
id.column
If is.paired is TRUE, the column in data that indicates the pairing of observations.
.data.name
The variable name of the dataset passed to dabest.
.all.groups
All groups as indicated in the idx argument.
.data
A data.frame or tibble.
x, y
Columns in .data.
idx
A vector containing factors or strings in the x column. These must be quoted (i.e., surrounded by quotation marks). The first element will be the control group, so all differences will be computed between every other group and this first group.
paired
Boolean, default FALSE. If TRUE, the two groups are treated as paired samples. The first group is treated as pre-intervention and the second group is considered post-intervention.
id.column
Default NULL. A column name indicating the identity of each data point if the data are paired. This must be supplied if paired is TRUE.
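To make the roles of idx, paired, and id.column more concrete, here is a minimal base R sketch that mimics the groupings by hand. It does not call dabestr and is not the package's implementation; the helper diff_vs_control and the toy data frame paired_toy are hypothetical names introduced only for illustration.

```r
# Hypothetical helper (not part of dabestr): mean difference of every
# non-control group against the first element of `idx`.
diff_vs_control <- function(data, x, y, idx) {
  control <- data[[y]][data[[x]] == idx[1]]
  tests <- idx[-1]
  vapply(tests, function(g) {
    mean(data[[y]][data[[x]] == g]) - mean(control)
  }, numeric(1))
}

# "setosa" is listed first, so it acts as the control group.
unpaired_diffs <- diff_vs_control(
  iris, "Species", "Petal.Width",
  idx = c("setosa", "versicolor", "virginica")
)
unpaired_diffs

# Toy paired data (made up for illustration): each ID is measured twice.
paired_toy <- data.frame(
  ID    = c(1, 2, 3, 1, 2, 3),
  Group = c("pre", "pre", "pre", "post", "post", "post"),
  Value = c(10, 12, 9, 13, 15, 10)
)

# Match observations on ID, then take post minus pre within each subject.
pre  <- paired_toy[paired_toy$Group == "pre", ]
post <- paired_toy[paired_toy$Group == "post", ]
paired_diffs <- post$Value[match(pre$ID, post$ID)] - pre$Value
mean(paired_diffs)
```

In the unpaired case the order of idx alone decides which group is the reference. In the paired case the ID column is what ties each pre-intervention value to its post-intervention partner, which is why an identifier column is required when paired is TRUE.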
Estimation statistics is a statistical framework that focuses on effect sizes and confidence intervals around them, rather than P values and associated dichotomous hypothesis testing.
dabest() collates the data in preparation for the computation of effect sizes. Bootstrap resampling is used to compute non-parametric assumption-free confidence intervals. Visualization of the effect sizes and their confidence intervals using estimation plots is then performed with a specialized plotting function.
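The idea of a bootstrap confidence interval around an effect size can be sketched in base R as follows. This is a simple percentile bootstrap written for illustration only; it is an assumption that it resembles what the package does, and the interval method dabestr actually uses may differ. The variable names (control, test, boot_diffs) are chosen here and do not come from the package.

```r
set.seed(1)  # make the resampling reproducible

control <- iris$Petal.Width[iris$Species == "setosa"]
test    <- iris$Petal.Width[iris$Species == "versicolor"]

# Observed effect size: test mean minus control mean.
observed <- mean(test) - mean(control)

# Resample each group with replacement and recompute the difference.
n_boot <- 5000
boot_diffs <- replicate(n_boot, {
  mean(sample(test, replace = TRUE)) - mean(sample(control, replace = TRUE))
})

# Percentile 95% confidence interval from the bootstrap distribution.
ci <- quantile(boot_diffs, probs = c(0.025, 0.975))
observed
ci
```

The interval is read directly from the spread of the resampled differences, so no normality assumption about the underlying measurements is needed.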
Effect size computation from the loaded data.
Generating estimation plots after effect size computation.
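# Load the packages used in these examples.
library(dabestr)
library(magrittr)  # provides the %>% pipe

# Performing unpaired (two independent groups) analysis.
unpaired_mean_diff <- dabest(iris, Species, Petal.Width,
                             idx = c("setosa", "versicolor"),
                             paired = FALSE)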
# Display the results in a user-friendly format.
unpaired_mean_diff
# Compute the mean difference.
mean_diff(unpaired_mean_diff)
# Plotting the mean differences.
mean_diff(unpaired_mean_diff) %>% plot()
# Performing paired analysis.
# First, we munge the `iris` dataset so we can perform a within-subject
# comparison of sepal length vs. sepal width.
new.iris <- iris
# One unique ID per row; length(new.iris) would count columns, not rows.
new.iris$ID <- seq_len(nrow(new.iris))
setosa.only <-
  new.iris %>%
  tidyr::gather(key = Metric, value = Value, -ID, -Species) %>%
  dplyr::filter(Species %in% c("setosa"))
paired_mean_diff <- dabest(setosa.only, Metric, Value,
                           idx = c("Sepal.Length", "Sepal.Width"),
                           paired = TRUE, id.column = ID) %>%
  mean_diff()
# Using pipes to munge your data and then passing to `dabest`.
# First, we generate some synthetic data.
set.seed(12345)
N <- 70
# Named `ctrl` rather than `c` to avoid masking base::c().
ctrl <- rnorm(N, mean = 50, sd = 20)
t1 <- rnorm(N, mean = 200, sd = 20)
t2 <- rnorm(N, mean = 100, sd = 70)
long.data <- tibble::tibble(Control = ctrl, Test1 = t1, Test2 = t2)
# Munge the data using `gather`, then pass it directly to `dabest`
meandiff <- long.data %>%
  tidyr::gather(key = Group, value = Measurement) %>%
  dabest(x = Group, y = Measurement,
         idx = c("Control", "Test1", "Test2"),
         paired = FALSE) %>%
  mean_diff()