dabest: Prepare Data for Analysis with dabestr

Description

dabest prepares a tidy dataset for analysis using estimation statistics.

Usage

dabest(.data, x, y, idx, paired = FALSE, id.column = NULL)

Value

A dabest object with 8 elements.

data: The dataset passed to dabest, stored here as a tibble.
x and y: The columns in data used to plot the x and y axes, respectively, as supplied to dabest. These are quoted variables for tidy evaluation during the computation of effect sizes.
idx: The vector of control-test groupings. For each pair in idx, an effect size will be computed by downstream dabestr functions such as mean_diff().
is.paired: Whether or not the experiment consists of paired (aka repeated) observations.
id.column: If is.paired is TRUE, the column in data that indicates the pairing of observations.
.data.name: The variable name of the dataset passed to dabest.
.all.groups: All groups as indicated in the idx argument.

Arguments

.data: A data.frame or tibble.
x, y: Columns in .data.
idx: A vector of factors or strings found in the x column. These must be quoted (i.e. surrounded by quotation marks). The first element is the control group; a difference is computed between each remaining group and this control.
paired: Boolean, default FALSE. If TRUE, the two groups are treated as paired samples. The first group is treated as pre-intervention and the second group is considered post-intervention.
id.column: Default NULL. A column name indicating the identity of the datapoint if the data is paired. This must be supplied if paired is TRUE.
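
To make the role of idx concrete, the sketch below works out by hand which comparisons a three-group idx implies. It uses only base R and the built-in iris data; it illustrates the control-versus-test convention described above and is not dabestr's internal code. The variable names (group_means, control, tests, diffs) are chosen here for illustration.

```r
# Illustration only: plain base R, not dabestr internals.
idx <- c("setosa", "versicolor", "virginica")

# Mean of y (Petal.Width) within each level of x (Species),
# returned as a named numeric vector.
group_means <- sapply(split(iris$Petal.Width, iris$Species), mean)

# The first element of idx is the control; every other element is a test group.
control <- idx[1]
tests   <- idx[-1]

# One test-minus-control mean difference per test group.
diffs <- group_means[tests] - group_means[[control]]
diffs
```

Reordering idx changes which group serves as the control, and therefore the sign and reference point of every difference.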

Details

Estimation statistics is a statistical framework that focuses on effect sizes and confidence intervals around them, rather than P values and associated dichotomous hypothesis testing.

dabest() collates the data in preparation for the computation of effect sizes. Bootstrap resampling is used to compute non-parametric assumption-free confidence intervals. Visualization of the effect sizes and their confidence intervals using estimation plots is then performed with a specialized plotting function.
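
As a rough illustration of the bootstrap idea mentioned above, the following base R sketch resamples two groups with replacement and takes a simple percentile interval of the resampled mean differences. The interval type, the number of resamples (n_boot), and the seed are assumptions made for this sketch; the text above does not specify which bootstrap interval dabestr computes, so the numbers need not match the package's output.

```r
# Illustration only: a simple percentile bootstrap in base R.
set.seed(1)  # arbitrary seed, for reproducibility of this sketch

control <- iris$Petal.Width[iris$Species == "setosa"]
test    <- iris$Petal.Width[iris$Species == "versicolor"]

# Observed effect size: test mean minus control mean.
observed <- mean(test) - mean(control)

# Resample each group independently (unpaired design) and recompute
# the mean difference each time.
n_boot <- 5000  # assumed number of resamples
boot_diffs <- replicate(n_boot, {
  mean(sample(test, replace = TRUE)) - mean(sample(control, replace = TRUE))
})

# 95% percentile interval of the bootstrap distribution.
ci <- quantile(boot_diffs, probs = c(0.025, 0.975))
observed
ci
```

Because the interval comes from the resampled data itself, it does not rely on a normality assumption for the groups.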

Examples

# Performing unpaired (two independent groups) analysis.
unpaired_mean_diff <- dabest(iris, Species, Petal.Width,
                             idx = c("setosa", "versicolor"),
                             paired = FALSE)

# Display the results in a user-friendly format.
unpaired_mean_diff

# Compute the mean difference.
mean_diff(unpaired_mean_diff)

# Plotting the mean differences.
mean_diff(unpaired_mean_diff) %>% plot()

# Performing paired analysis.
# First, we munge the `iris` dataset so we can perform a within-subject
# comparison of sepal length vs. sepal width.

new.iris     <- iris
new.iris$ID  <- seq_len(nrow(new.iris))  # one ID per row (length() would count columns)
setosa.only  <-
  new.iris %>%
  tidyr::gather(key = Metric, value = Value, -ID, -Species) %>%
  dplyr::filter(Species %in% c("setosa"))

paired_mean_diff <- dabest(setosa.only, Metric, Value,
                           idx = c("Sepal.Length", "Sepal.Width"),
                           paired = TRUE, id.column = ID) %>%
                    mean_diff()



# Using pipes to munge your data and then passing to `dabest`.
# First, we generate some synthetic data.
set.seed(12345)
N         <- 70
ctrl      <- rnorm(N, mean = 50, sd = 20)   # named `ctrl` to avoid masking base::c()
t1        <- rnorm(N, mean = 200, sd = 20)
t2        <- rnorm(N, mean = 100, sd = 70)
long.data <- tibble::tibble(Control = ctrl, Test1 = t1, Test2 = t2)

# Munge the data using `gather`, then pass it directly to `dabest`

meandiff <- long.data %>%
              tidyr::gather(key = Group, value = Measurement) %>%
              dabest(x = Group, y = Measurement,
                     idx = c("Control", "Test1", "Test2"),
                     paired = FALSE) %>%
              mean_diff()
