
Note: a newer version (0.3.1) of this package is available.

esvis

R Package for effect size visualizations.

This package is designed to visually compare two or more distributions across the entirety of the scale, rather than only by measures of central tendency (e.g., means). There are also some functions for estimating effect size, including Cohen's d, Hedges' g, percentage above a cut, transformed (normalized) percentage above a cut, the area under the curve (conceptually equivalent to the probability that a randomly selected individual from Distribution A has a higher value than a randomly selected individual from Distribution B), and the V statistic, which essentially transforms the area under the curve to standard deviation units (see Ho, 2009).

Installation

Install directly from CRAN with

install.packages("esvis")

Or install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("DJAnderson07/esvis")

Plotting methods

There are three primary data visualizations: (a) binned effect size plots, (b) probability-probability plots, and (c) empirical cumulative distribution functions. All plots should be fully manipulable with calls to the base plotting functions.

At present, the binned effect size plot can only be produced with Cohen's d, although future development will allow the user to select the type of effect size. The binned effect size plot splits the distribution into quantiles specified by the user (defaults to lower, middle, and upper thirds), calculates the mean difference between groups within each quantile bin, and produces an effect size for each bin by dividing by the overall pooled standard deviation (i.e., not the standard deviation within each quantile bin). For example:

library(esvis)
binned_plot(math ~ ell, benchmarks)

Note that in this plot one can clearly see that the magnitude of the differences between the three groups depends upon scale location (i.e., low achieving students versus average or high achieving students). Both the reference group and the quantiles used can be changed. For example, binned_plot(math ~ ell, benchmarks, ref_group = "Non-ELL", qtiles = seq(0, 1, .2)) would produce the same plot but binned by quintiles, with students who did not receive English language services (Non-ELL) as the reference group.
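The binned calculation described above can be sketched in base R. This is only an illustration on simulated data, not the package's implementation: the helper names (bin_means, pooled_sd_sketch) are hypothetical, and the sketch assumes the quantile bins are formed within each group separately.

```r
# Simulated scores for a reference and a focal group (illustrative only)
set.seed(42)
ref <- rnorm(300, mean = 200, sd = 10)
foc <- rnorm(300, mean = 195, sd = 12)

# Quantile boundaries: lower, middle, and upper thirds
qtiles <- c(0, 1/3, 2/3, 1)

# Mean of x within each of its own quantile bins (hypothetical helper)
bin_means <- function(x, probs) {
  bins <- cut(x, breaks = quantile(x, probs), include.lowest = TRUE)
  as.vector(tapply(x, bins, mean))
}

# Overall pooled standard deviation across both groups (hypothetical helper)
pooled_sd_sketch <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
}

# Mean difference per bin, scaled by the overall (not per-bin) pooled SD
binned_d <- (bin_means(foc, qtiles) - bin_means(ref, qtiles)) /
  pooled_sd_sketch(ref, foc)
binned_d
```

Because the denominator is the same for every bin, differences between the bin-level effect sizes reflect only differences in the bin means.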

A probability-probability plot can be produced with a call to pp_plot and an equivalent argument structure. In this case, we're visualizing the difference in reading achievement by race/ethnicity. By default, the distribution with the highest mean serves as the reference group, in this case students identifying as White.

pp_plot(reading ~ ethnicity, benchmarks)

If the grouping factor has only two levels, the area under the PP curve will be shaded, with the AUC and V statistics annotated onto the plot.

pp_plot(reading ~ frl, benchmarks)

The shading and annotations are optional and can be removed. The colors and all other plot features are also fully customizable.
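To make the PP curve and its area concrete, here is a minimal base R sketch on simulated data. It is not how pp_plot is implemented; the axis assignment (reference group on the x-axis, focal group on the y-axis) and all variable names are assumptions made for illustration.

```r
# Simulated scores; the reference group has the higher mean (illustrative only)
set.seed(1)
ref <- rnorm(200, mean = 0.5)
foc <- rnorm(150, mean = 0)

# Evaluate both empirical CDFs at every observed score, starting from (0, 0)
grid <- sort(c(ref, foc))
x <- c(0, ecdf(ref)(grid))  # cumulative proportion in the reference group
y <- c(0, ecdf(foc)(grid))  # cumulative proportion in the focal group

# Area under the PP curve by the trapezoid rule
auc_trap <- sum(diff(x) * (y[-length(y)] + y[-1]) / 2)

# Direct estimate: proportion of (ref, foc) pairs where ref scores higher
auc_pairs <- mean(outer(ref, foc, ">"))

c(trapezoid = auc_trap, pairwise = auc_pairs)
```

With continuous data and no ties the two numbers agree, which is the sense in which the area under the PP curve is the probability that a randomly selected member of one group scores higher than a randomly selected member of the other.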

Finally, the ecdf_plot function essentially dresses up the base plot.ecdf function, but also adds some nice referencing features through additional, optional arguments. Below, I have included the optional hor_ref = TRUE argument such that horizontal reference lines appear, relative to the cuts provided.

ecdf_plot(math ~ season, benchmarks, 
    ref_cut = c(190, 200, 215), 
    hor_ref = TRUE)
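The horizontal reference lines mark the cumulative proportion at each cut score. A rough base R sketch of that quantity, using simulated scores and the same cuts as above (the variable names are illustrative and not part of esvis):

```r
# Simulated scores for a single group (illustrative only)
set.seed(7)
scores <- rnorm(500, mean = 200, sd = 12)
cuts <- c(190, 200, 215)

# Empirical CDF evaluated at each cut: proportion scoring at or below it
Fn <- ecdf(scores)
at_or_below <- Fn(cuts)

# The complement is the proportion scoring strictly above each cut
above_cut <- 1 - at_or_below

data.frame(cut = cuts, at_or_below = at_or_below, above_cut = above_cut)
```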

Estimation Methods

Compute effect sizes for all possible pairwise comparisons.

coh_d(mean ~ subject, seda)
#>   ref_group foc_group   estimate
#> 1      math       ela  0.8312519
#> 2       ela      math -0.8312519

Or specify a reference group

coh_d(mean ~ grade, seda, ref_group = 8)
#>   ref_group foc_group estimate
#> 1         8         7 0.593485
#> 2         8         6 1.165106
#> 3         8         5 1.819459
#> 4         8         4 2.416754
#> 5         8         3 3.004039
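For readers who want to see the arithmetic, the textbook definitions of Cohen's d and Hedges' g can be sketched as below. These helpers are hypothetical and are not the package's functions. The sign convention (reference mean minus focal mean) is inferred from the printed output above, and the small-sample correction shown is the commonly used approximation; whether esvis uses exactly this form is an assumption.

```r
# Pooled standard deviation of two samples (hypothetical helper)
pooled_sd_sketch <- function(x, y) {
  n1 <- length(x)
  n2 <- length(y)
  sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
}

# Cohen's d: standardized mean difference, reference minus focal (assumed sign)
cohens_d_sketch <- function(ref, foc) {
  (mean(ref) - mean(foc)) / pooled_sd_sketch(ref, foc)
}

# Hedges' g: d shrunk by an approximate small-sample correction factor
hedges_g_sketch <- function(ref, foc) {
  n <- length(ref) + length(foc)
  cohens_d_sketch(ref, foc) * (1 - 3 / (4 * n - 9))
}

# Tiny worked example
ref <- c(2, 4, 6, 8)
foc <- c(1, 3, 5, 7)
d <- cohens_d_sketch(ref, foc)
g <- hedges_g_sketch(ref, foc)
c(d = d, g = g)
```

With only eight observations the correction is noticeable; it shrinks toward 1 as the combined sample size grows.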

Other effect sizes are estimated equivalently. For example, compute V (Ho, 2009) with

v(mean ~ grade, seda, ref_group = 8)
#>   ref_group foc_group estimate
#> 1         8         7 0.605855
#> 2         8         6 1.202515
#> 3         8         5 1.912094
#> 4         8         4 2.577780
#> 5         8         3 3.225021

or AUC with

auc(mean ~ grade, seda, ref_group = 8)
#>   ref_group foc_group  estimate
#> 1         8         7 0.6658216
#> 2         8         6 0.8024226
#> 3         8         5 0.9118211
#> 4         8         4 0.9658305
#> 5         8         3 0.9887090
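The V and AUC estimates above are linked by the transformation in Ho (2009), V = sqrt(2) * qnorm(AUC). As a quick check, the sketch below applies that transformation to the AUC values printed above; the variable names are illustrative, and the comparison holds only up to the rounding of the printed output.

```r
# AUC and V estimates copied from the printed output above
auc_est   <- c(0.6658216, 0.8024226, 0.9118211, 0.9658305, 0.9887090)
v_printed <- c(0.605855, 1.202515, 1.912094, 2.577780, 3.225021)

# Transform AUC to standard deviation units
v_from_auc <- sqrt(2) * qnorm(auc_est)

# Side-by-side comparison, rounded to the precision of the printed values
round(cbind(v_from_auc, v_printed), 4)
```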

Version: 0.2.0

License: MIT + file LICENSE

Maintainer: Daniel Anderson

Last Published: April 9th, 2018

Monthly Downloads: 238

Functions in esvis (0.2.0)

create_cut_refs: Create a set of reference lines according to a cut score
qtile_es: Compute effect sizes by quantile bins
coh_d: Compute Cohen's d
col_hue: Color hues
create_vec: Create a named vector of all possible combinations
col_scheme: Determine the color scheme to be used for the plotting
pp_calcs
create_base_legend: Create a base legend for a plot
pp_plot: Produces the paired probability plot for two groups
qtile_mean_diffs: Compute mean differences by various quantiles
qtile_n: Compute sample size for each quantile bin for each group
ecdf_plot: Empirical Cumulative Distribution Plot
binned_plot: Quantile-binned effect size plot
cdfs: Compute the empirical distribution functions for each of several groups.
pac: Compute the proportion above a specific cut location
seda: Portion of the Stanford Educational Data Archive (SEDA).
parse_form: Parse formula
seg_match: Match segments on a plot
tpac: Transformed proportion above the cut
v: Calculate the V effect size statistic
auc: Calculate the area under the curve
pooled_sd: Compute pooled standard deviation
benchmarks: Synthetic benchmark screening data
pp_annotate: Annotation function to add AUC/V to a given plot
star: Data from the Tennessee class size experiment
themes: Theme settings
create_legend: Create a legend for a plot
empty_plot: Create an empty plot
hedg_g: Compute Hedges' g. This function calculates effect sizes in terms of Hedges' g, also called the corrected (for sample size) effect size. See coh_d for the uncorrected version. Also see Lakens (2013) for a discussion on different types of effect sizes and their interpretation. Note that missing data are removed from the calculations of the means and standard deviations.
probs: Compute probabilities from the empirical CDFs of a grouping variable for each group.