eVCGsampler R package

eVCGsampler provides a principled framework for sampling a virtual control group (VCG) using energy distance-based covariate balancing. The package includes visualization tools for assessing covariate balance, as well as a permutation test for evaluating the statistical significance of deviations from balance.
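
To make the idea of an energy distance concrete, here is a minimal base R sketch of the standard two-sample energy distance (twice the mean between-group distance minus the two mean within-group distances). The function name energy_dist_sketch and the simulated data are assumptions made for illustration. This is not the package's energy_distance function, and it applies no scaling or weighting of covariates.

```r
# Illustrative sketch only: a plain two-sample energy distance in base R.
# Rows are units, columns are covariates. Not the package's implementation.
energy_dist_sketch <- function(x, y) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x)
  m <- nrow(y)
  # All pairwise Euclidean distances among the stacked rows
  d <- as.matrix(dist(rbind(x, y)))
  ix <- seq_len(n)
  iy <- n + seq_len(m)
  between  <- mean(d[ix, iy])  # mean distance across the two groups
  within_x <- mean(d[ix, ix])  # mean distance inside x (zero diagonal included)
  within_y <- mean(d[iy, iy])  # mean distance inside y (zero diagonal included)
  2 * between - within_x - within_y
}

# Hypothetical data: 20 treated units and 100 pool units with 3 covariates
set.seed(1)
tg_same    <- matrix(rnorm(60), ncol = 3)
pool_same  <- matrix(rnorm(300), ncol = 3)  # same distribution as the treated units
pool_shift <- pool_same + 1                 # every covariate shifted by 1

e_same  <- energy_dist_sketch(tg_same, pool_same)
e_shift <- energy_dist_sketch(tg_same, pool_shift)
```

In this sketch a larger value means the two groups differ more in their joint covariate distribution, so e_shift should exceed e_same, and comparing a group with itself gives zero.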

Example.

Before balancing, a test on 3 covariates comparing the treated groups (TG) with the data pool (POOL) shows high imbalance.

[Figure: distance permutation test, TG vs POOL]
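
A permutation test of this kind can be sketched in base R as follows. The sketch assumes the usual recipe: compute the energy distance for the observed grouping, then recompute it after randomly reassigning group labels, and report the share of permuted statistics at least as large as the observed one. The function name perm_energy_test_sketch, the default of 199 permutations, and the simulated data are illustrative assumptions. The package's energy_test may differ in scaling, statistic, and output.

```r
# Illustrative sketch only: permutation test based on the energy distance.
# Not the package's energy_test(); names and defaults are assumptions.
perm_energy_test_sketch <- function(x, y, n_perm = 199) {
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x)
  total <- n + nrow(y)
  # Distances are computed once and reused for every relabelling
  d <- as.matrix(dist(rbind(x, y)))
  stat <- function(ix) {
    iy <- setdiff(seq_len(total), ix)
    2 * mean(d[ix, iy]) - mean(d[ix, ix]) - mean(d[iy, iy])
  }
  observed <- stat(seq_len(n))
  # Randomly pick which rows play the role of the first group
  permuted <- replicate(n_perm, stat(sample(total, n)))
  # Add-one correction keeps the p-value strictly positive
  p_value <- (1 + sum(permuted >= observed)) / (n_perm + 1)
  list(statistic = observed, p_value = p_value)
}

# Hypothetical data: treated units shifted by 1.5 relative to the pool
set.seed(2)
tg   <- matrix(rnorm(45, mean = 1.5), ncol = 3)
pool <- matrix(rnorm(300), ncol = 3)
res  <- perm_energy_test_sketch(tg, pool)
```

A small res$p_value indicates that the observed imbalance is larger than what random relabelling typically produces.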

A VCG is then sampled from the pool by running: VCG_sampler(treated ~ cov1 + cov2 + cov3, data=dat, n=10)

[Figure: distance permutation test, TG vs VCG]

Plot specifically for the variable cov3: plot_var(dat_out, what='cov3')

Best VCG size (exploratory)

BestVCGsize(treat ~ cov1 + cov2 + cov3, data=dat) lets you explore which VCG size gives the best covariate balance. That size is not necessarily the best in terms of the power or validity of the study.

Multiple VCG samples

If multiple VCG samples are required, use: multiSampler(treat~cov1+cov2+cov3, n=10, Nsamples=10, data=dat)

[Figure: overview of the overlap between samples]
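
One simple way to quantify overlap between several samples is the share of units that each pair of samples has in common. The base R sketch below illustrates that idea under stated assumptions: the samples are plain random draws standing in for VCG samples, not output of multiSampler, and the helper overlap_matrix is hypothetical. How the package itself summarizes overlap is not shown here.

```r
# Illustrative sketch only: pairwise overlap between several samples of unit IDs.
# The samples are random stand-ins, NOT multiSampler() output.
set.seed(3)
pool_ids <- 1:50
samples <- replicate(4, sort(sample(pool_ids, 10)), simplify = FALSE)

# Entry [i, j] is the fraction of sample i's units that also appear in sample j
overlap_matrix <- function(samples) {
  k <- length(samples)
  out <- matrix(0, k, k)
  for (i in seq_len(k)) {
    for (j in seq_len(k)) {
      shared <- length(intersect(samples[[i]], samples[[j]]))
      out[i, j] <- shared / length(samples[[i]])
    }
  }
  out
}

ov <- overlap_matrix(samples)
```

Values near 1 off the diagonal mean two samples reuse mostly the same pool units, while values near 0 mean they are almost disjoint.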

Install

install.packages('eVCGsampler')

Version

0.9.5

License

MIT + file LICENSE

Maintainer

Andreas Schulz

Last Published

February 13th, 2026

Functions in eVCGsampler (0.9.5)

find_outliers: Find Outlier Groups Based on Energy Distance
combine_data: Combine Data from Pool and Treated Groups
VCG_sampler: VCG Sampler for Energy Distance Balancing
robust_scale: Robust Scaling of Numeric and Categorical Variables
energy_test: Permutation Energy Test for Covariate Imbalance
multiSampler: Multi-Sample VCG Generator and Overlap Visualization
BestVCGsize: Attempts to Find the Optimal Size for the VCG
plot_var: Visualize Covariate Distribution Across TG, VCG, and POOL
energy_distance: Compute Energy Distance Between Two Groups
combine_variables: Compute Weighted Combined Score from Multiple Variables