lsr (version 0.5)

# cohensD: Cohen's d

## Description

Calculates the Cohen's d measure of effect size.

## Usage

`cohensD( x = NULL, y = NULL, data = NULL, method = "pooled", mu = 0, formula = NULL )`

## Value

Numeric variable containing the effect size, d. Note that it does not show the direction of the effect, only the magnitude. That is, the value of d returned by the function is always positive or zero.

## Argument checking

`cohensD` checks whether the arguments specified by the user make sense. For instance, specifying numeric variables `x` and `y` together with a mean `mu` does not make sense, since the `x` and `y` values imply a two-sample calculation, but `mu` implies a one-sample calculation. The cases that the function is "intended" to support are listed below.

The following produce a one-sample Cohen's d:

1. numeric `x`

2. numeric `x`, numeric `mu` of length 1

The following produce a paired-sample Cohen's d:

1. numeric `x`, numeric `y`, `method="paired"`

The following produce a two-sample Cohen's d:

1. numeric `x`, numeric `y`

2. numeric `x`, numeric `y`, valid value for `method` (except "paired")

3. formula `formula`, data frame `data`

4. formula `formula`, data frame `data`, valid value for `method` (except "paired")

5. formula `formula`, valid value for `method` (except "paired")

6. formula `formula`

In a perfect world, these would be the only input combinations allowed. However, because it is commonplace for people to drop argument names, and because I don't want to break backwards compatibility for `cohensD` too much, there are a number of cases where the function attempts to guess the user's intention.

Among the guessed cases, the following produce a paired-sample Cohen's d:

1. formula `x`, `method="paired"` [issues warning]

2. formula `x`, data frame `y`, `method="paired"` [issues warning]

3. formula `x`, data frame `data`, `method="paired"` [issues warning]

4. formula `formula`, data frame `x`, `method="paired"` [issues warning]

5. formula `formula`, data frame `data`, `method="paired"` [issues warning]

6. formula `formula`, `method="paired"` [issues warning]

Among the guessed cases, the following produce a two-sample Cohen's d:

1. formula `x`

2. formula `x`, data frame `y`

3. formula `x`, data frame `data`

4. formula `x`, valid value for `method` (except "paired")

5. formula `x`, data frame `y`, valid value for `method` (except "paired")

6. formula `x`, data frame `data`, valid value for `method` (except "paired")

7. formula `formula`, data frame `x`

8. formula `formula`, data frame `x`, valid value for `method` (except "paired")
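As a rough illustration of the input combinations listed in this section, the sketch below calls `cohensD` in several of the intended forms and in one guessed form. The data reuse the values from the Examples section; the `scores` data frame and its column names are hypothetical, and the calls are guarded so that nothing runs unless `lsr` is installed.

```r
# Illustrative data: paired measurements and a long-format data frame.
pre  <- c(100, 122, 97, 25, 274)
post <- c(104, 125, 99, 29, 277)
scores <- data.frame(   # hypothetical data frame for the formula interface
  outcome = c(55, 65, 65, 68, 70, 56, 60, 62, 66),
  group   = factor(rep(c("A", "B"), times = c(5, 4)))
)

if (requireNamespace("lsr", quietly = TRUE)) {
  library(lsr)

  # Intended forms
  print(cohensD(x = pre))                                   # one-sample, mu = 0
  print(cohensD(x = pre, mu = 100))                         # one-sample against mu
  print(cohensD(x = pre, y = post, method = "paired"))      # paired-sample
  print(cohensD(x = pre, y = post))                         # two-sample, default method
  print(cohensD(formula = outcome ~ group, data = scores))  # two-sample, formula

  # Guessed form: argument names dropped, so the formula lands in `x`
  # and the data frame in `y`
  print(cohensD(outcome ~ group, scores))
}
```

Naming the arguments, as in the intended forms, avoids relying on the function's guesses.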

## Warning

This package is under development, and has been released only due to teaching constraints. Until this notice disappears from the help files, you should assume that everything in the package is subject to change. Backwards compatibility is NOT guaranteed. Functions may be deleted in future versions and new syntax may be inconsistent with earlier versions. For the moment at least, this package should be treated with extreme caution.

## Details

The `cohensD` function calculates the Cohen's d measure of effect size in one of several different formats. The function is intended to be called in one of two different ways, mirroring the `t.test` function. That is, if the first input argument `x` is a formula, then a command of the form `cohensD(x = outcome~group, data = data.frame)` is expected, whereas if `x` is a numeric variable, then a command of the form `cohensD(x = group1, y = group2)` is expected. Note that `cohensD` is not a generic function.

The `method` argument allows the user to select one of several different variants of Cohen's d. Assuming that the original t-test for which an effect size is desired was an independent samples t-test (i.e., not one sample or paired samples t-test), then there are several possibilities for how the normalising term (i.e., the standard deviation estimate) in Cohen's d should be calculated. The most commonly used method is to use the same pooled standard deviation estimate that is used in a Student t-test (`method = "pooled"`, the default). If `method = "raw"` is used, then the same pooled standard deviation estimate is used, except that the sample standard deviation is used (divide by N) rather than the unbiased estimate of the population standard deviation (divide by N-2). Alternatively, there may be reasons to use only one of the two groups to estimate the standard deviation. To do so, use `method = "x.sd"` to select the `x` variable, or the first group listed in the grouping factor; and `method = "y.sd"` to normalise by `y`, or the second group listed in the grouping factor. The last of the "Student t-test" based measures is the unbiased estimator of d (`method = "corrected"`), which multiplies the "pooled" version by (N-3)/(N-2.25).
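To make the variants above concrete, here is a minimal base R sketch of the "Student t-test" based measures as described in the paragraph. The helper `cohens_d_sketch` is hypothetical and is not the package's source code; it only follows the description given here (pooled sum of squares divided by N-2 or by N, a single group's standard deviation, and the (N-3)/(N-2.25) correction).

```r
# Hypothetical helper: a reading of the description above, not lsr's code.
cohens_d_sketch <- function(x, y,
                            method = c("pooled", "raw", "x.sd", "y.sd", "corrected")) {
  method <- match.arg(method)
  n <- length(x) + length(y)   # total sample size N

  # Within-group sums of squared deviations, added across the two groups
  ss <- sum((x - mean(x))^2) + sum((y - mean(y))^2)

  # Normalising term (standard deviation estimate) for each variant
  sd_est <- switch(method,
    pooled    = sqrt(ss / (n - 2)),  # same estimate as the Student t-test
    corrected = sqrt(ss / (n - 2)),  # pooled, rescaled below
    raw       = sqrt(ss / n),        # divide by N instead of N-2
    x.sd      = sd(x),               # first group only
    y.sd      = sd(y)                # second group only
  )

  d <- abs(mean(x) - mean(y)) / sd_est   # magnitude only, never negative
  if (method == "corrected") {
    d <- d * (n - 3) / (n - 2.25)
  }
  d
}

gradesA <- c(55, 65, 65, 68, 70)
gradesB <- c(56, 60, 62, 66)
sapply(c("pooled", "raw", "x.sd", "y.sd", "corrected"),
       function(m) cohens_d_sketch(gradesA, gradesB, method = m))
```

Because the "raw" variant divides by N rather than N-2, its standard deviation estimate is smaller and the resulting d is larger than the "pooled" value for the same data.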

For other versions of the t-test, there are two possibilities implemented. If the original t-test did not make a homogeneity of variance assumption, as per the Welch test, the normalising term should mirror the Welch test (`method = "unequal"`). Or, if the original t-test was a paired samples t-test, and the effect size desired is intended to be based on the standard deviation of the differences, then `method = "paired"` should be used.
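The paired-samples case can be sketched in base R as follows, assuming (as the description suggests) that the effect size is the absolute mean difference divided by the standard deviation of the differences. This is an illustration rather than the package's implementation, and the "unequal" variant is not sketched because its normalising term is not spelled out here.

```r
# Paired measurements for the same five cases
pre  <- c(100, 122, 97, 25, 274)
post <- c(104, 125, 99, 29, 277)

# Work with the within-case differences
diffs <- post - pre

# Magnitude of the mean difference, scaled by the SD of the differences
d_paired <- abs(mean(diffs)) / sd(diffs)
d_paired
```

Under this reading, the paired effect size equals the absolute paired t statistic divided by the square root of the number of pairs.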

The last argument to `cohensD` is `mu`, which represents the mean against which a one-sample Cohen's d calculation should be assessed. Note that this is a slightly narrower usage of `mu` than the `t.test` function allows. `cohensD` does not currently support the use of a non-zero `mu` value for a paired-samples calculation.
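For the one-sample case, a minimal base R sketch is shown below. It assumes the normalising term is the sample standard deviation of `x`; the helper name `one_sample_d_sketch` and the comparison value of 60 are hypothetical choices for illustration.

```r
# Hypothetical helper: distance of the sample mean from mu, in SD units.
one_sample_d_sketch <- function(x, mu = 0) {
  abs(mean(x) - mu) / sd(x)
}

gradesA <- c(55, 65, 65, 68, 70)

# Effect size relative to a reference mean of 60
d_one <- one_sample_d_sketch(gradesA, mu = 60)
d_one
```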

## References

Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

## See Also

`t.test`, `oneSampleTTest`, `pairedSamplesTTest`, `independentSamplesTTest`

## Examples

```r
library(lsr)

# calculate Cohen's d for two independent samples:
gradesA <- c(55, 65, 65, 68, 70) # 5 students with teacher A
gradesB <- c(56, 60, 62, 66)     # 4 students with teacher B
cohensD(gradesA, gradesB)

# calculate Cohen's d for the same data, described differently:
grade <- c(55, 65, 65, 68, 70, 56, 60, 62, 66) # grades for all students
teacher <- c("A", "A", "A", "A", "A", "B", "B", "B", "B") # teacher for each student
cohensD(grade ~ teacher)

# calculate Cohen's d for two paired samples:
pre  <- c(100, 122, 97, 25, 274) # a pre-treatment measure for 5 cases
post <- c(104, 125, 99, 29, 277) # the post-treatment measure for the same 5 cases
cohensD(pre, post, method = "paired") # ... explicitly indicate that it's paired, or else
cohensD(post - pre)  # ... do a "single-sample" calculation on the difference

# support for data frames:
exams <- data.frame(grade, teacher)
cohensD(exams$grade ~ exams$teacher)    # using $
cohensD(grade ~ teacher, data = exams)  # using the 'data' argument
```
