eVCGsampler (version 0.9.2)

energy_test: Permutation Energy Test for Covariate Imbalance

Description

Performs a permutation-based energy distance test to assess whether two groups (defined by a binary treated variable) are balanced across a set of covariates. Optionally, it visualizes the distribution of permuted energy distances and highlights the observed test statistic and critical value.

Usage

energy_test(formula, data, alpha = 0.05, R = 2000, plot = TRUE)

Value

If `plot = TRUE`, returns a list with:

  • A list of class `"htest"` containing:

    • `p.value`: The permutation p-value.

    • `estimate`: The observed energy distance.

    • `critical.value`: The critical value at the specified alpha level.

    • `alternative`: The alternative hypothesis ("one.sided").

    • `method`: Description of the test.

    • `n.permutations`: Number of permutations performed.

    • `permutations`: Vector of permuted energy distances.

  • A ggplot2 object showing the histogram of permuted distances, with vertical lines for the observed statistic and critical value.

If `plot = FALSE`, returns only the `"htest"` result list.
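The components listed above can be read off the result with `$`. The snippet below is a minimal sketch, assuming the package is installed and that the `"htest"` list exposes the documented names directly; the simulated data and the final decision rule are illustrative assumptions, not part of the package.

```r
# Simulated data in the same shape as the Examples section (illustrative only).
set.seed(42)
dat <- data.frame(
  treated = rep(0:1, c(50, 30)),
  age     = c(rnorm(50, 5, 2),  rnorm(30, 5, 1)),
  weight  = c(rnorm(50, 11, 2), rnorm(30, 10, 1))
)

# Guarded so the snippet still runs when eVCGsampler is not installed.
if (requireNamespace("eVCGsampler", quietly = TRUE)) {
  res <- eVCGsampler::energy_test(treated ~ age + weight, data = dat,
                                  R = 500, plot = FALSE)

  # Components documented for the "htest" result.
  print(res$p.value)         # permutation p-value
  print(res$estimate)        # observed energy distance
  print(res$critical.value)  # critical value at the chosen alpha

  # Assumed decision rule: flag imbalance when the p-value is below alpha.
  imbalanced <- res$p.value < 0.05
}
```

With `plot = FALSE` only the test result is returned, which is convenient when the function is called repeatedly, for example inside a simulation loop.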

Arguments

formula

A formula with the binary treatment indicator on the left-hand side and the covariates on the right-hand side, e.g., `treated ~ cov1 + cov2 | stratum`.

data

A data frame containing the variables specified in the formula.

alpha

Significance level for the test (default is 0.05).

R

Number of permutations to perform (default is 2000).

plot

Logical. If `TRUE` (the default), also returns a ggplot2 histogram of the permutation distribution alongside the test result.

Details

The energy distance is a non-parametric measure of the difference between two distributions. This test evaluates whether the covariate distributions of the two groups are statistically distinguishable. A small p-value indicates imbalance between the groups. A one-sided test is used because the energy distance is non-negative, so only the upper tail of the permutation distribution (permuted distances exceeding the observed statistic) is relevant.
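As a rough illustration of the procedure described above, the following base R sketch computes a two-sample energy distance and its permutation distribution. `energy_dist` and `perm_energy_test` are hypothetical helper names, not functions of eVCGsampler; the package may scale covariates, handle the `| stratum` term, or compute the p-value and critical value differently.

```r
# Hypothetical helper: two-sample energy distance (V-statistic form),
# 2 * E|X - Y| - E|X - X'| - E|Y - Y'|, using Euclidean distances.
energy_dist <- function(x, g) {
  d <- as.matrix(dist(x))  # pairwise distances between all rows
  a <- g == 1              # treated rows
  b <- g == 0              # control rows
  2 * mean(d[a, b]) - mean(d[a, a]) - mean(d[b, b])
}

# Hypothetical helper: one-sided permutation test on the energy distance.
perm_energy_test <- function(x, g, R = 500, alpha = 0.05) {
  observed <- energy_dist(x, g)
  # Shuffle the group labels R times and recompute the statistic.
  perms <- replicate(R, energy_dist(x, sample(g)))
  list(
    estimate       = observed,
    # Upper-tail p-value; the +1 terms count the observed labelling itself.
    p.value        = (1 + sum(perms >= observed)) / (R + 1),
    critical.value = unname(quantile(perms, 1 - alpha)),
    permutations   = perms
  )
}

# Illustrative data: two covariates, 50 controls and 30 treated units.
set.seed(1)
x <- cbind(age    = c(rnorm(50, 5, 2),  rnorm(30, 5, 1)),
           weight = c(rnorm(50, 11, 2), rnorm(30, 10, 1)))
g <- rep(0:1, c(50, 30))

res <- perm_energy_test(x, g, R = 200)
res$p.value
```

Because shuffling the labels breaks any association between group membership and covariates, the permuted distances approximate the distribution of the statistic under balance; an observed distance far in the upper tail suggests imbalance.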

See Also

element

Examples

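# Simulate 50 control and 30 treated units with three covariates
set.seed(1)  # seed added so the example is reproducible
dat <- data.frame(
  treated = rep(0:1, c(50, 30)),
  age     = c(rnorm(50, 5, 2),    rnorm(30, 5, 1)),
  weight  = c(rnorm(50, 11, 2),   rnorm(30, 10, 1)),
  class   = c(rbinom(50, 3, 0.6), rbinom(30, 3, 0.4))
)

# Test covariate balance between the two groups using 500 permutations
energy_test(treated ~ age + weight + class, data = dat, R = 500)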