biserial.cor: Point-Biserial Correlation

Description

Computes the point-biserial correlation between a dichotomous and a continuous variable.

Usage

biserial.cor(x, y, use = c("all.obs", "complete.obs"), level = 1)

Arguments

x

a numeric vector representing the continuous variable.

y

a factor or a numeric vector (that will be converted to a factor) representing the dichotomous variable.

use

If use is "all.obs" (the default), the presence of missing observations produces an error. If use is "complete.obs", missing values are handled by casewise deletion.

level

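which level of y to treat as the first level in the formula under Details, given as an index into the levels of y (1, the default, or 2).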
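The casewise deletion and factor conversion described in the arguments above can be sketched in base R. This is only an illustration on made-up data, not the function's own code; the variable names `keep`, `x_cc`, and `y_cc` are hypothetical.

```r
# Toy data (made up for illustration); NA marks a missing observation.
x <- c(10, 12, NA, 15, 18, 20)
y <- c(0, 1, 0, NA, 1, 1)

# Casewise deletion: keep only the cases where both x and y are observed.
keep <- complete.cases(x, y)
x_cc <- x[keep]
y_cc <- as.factor(y[keep])   # a numeric y is converted to a factor

levels(y_cc)   # the two levels, in the order that `level` would index
x_cc           # the continuous values that remain after deletion
```

Under this reading, `level = 1` would refer to the first element of `levels(y_cc)` and `level = 2` to the second.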

Value

the (numeric) value of the point-biserial correlation.

Details

The point-biserial correlation computed by biserial.cor() is defined as
$$r = \frac{(\overline{X}_1 - \overline{X}_0)\sqrt{\pi (1 - \pi)}}{S_x},$$
where $\overline{X}_1$ and $\overline{X}_0$ denote the sample means of the $X$-values corresponding to the first and second level of $Y$, respectively, $S_x$ is the sample standard deviation of $X$, and $\pi$ is the sample proportion of observations in the first level of $Y$ (written $Y = 1$). The first level of $Y$ is chosen by the level argument; see Examples. Because $\sqrt{\pi (1 - \pi)}$ is unchanged when $\pi$ is replaced by $1 - \pi$, switching the level only swaps the two means and therefore flips the sign of $r$.
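As a rough check of the formula above, it can be evaluated by hand in base R. The helper `pb_manual()` below is hypothetical (it is not part of any package), the data are made up, and the sketch assumes that $S_x$ is what `sd()` returns, that is, the standard deviation with an $n - 1$ denominator.

```r
# Hypothetical helper: evaluates the Details formula term by term.
pb_manual <- function(x, y, level = 1) {
  y <- as.factor(y)
  stopifnot(is.numeric(x), nlevels(y) == 2, length(x) == length(y))
  first <- y == levels(y)[level]                  # TRUE for the chosen first level
  mean_diff <- mean(x[first]) - mean(x[!first])   # X-bar_1 minus X-bar_0
  p <- mean(first)                                # sample proportion in the first level
  mean_diff * sqrt(p * (1 - p)) / sd(x)           # assumption: S_x is sd(x)
}

# Toy data (made up for illustration).
x <- c(1, 2, 3, 4, 5, 6)
y <- c(0, 0, 0, 1, 1, 1)

r1 <- pb_manual(x, y, level = 1)  # "0" is treated as the first level
r2 <- pb_manual(x, y, level = 2)  # "1" is treated as the first level
all.equal(r1, -r2)                # the two results differ only in sign
```

With these data the larger $X$-values belong to level "1", so the result is negative for `level = 1` and positive for `level = 2`.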

Examples

library(ltm)  # provides biserial.cor() and the LSAT data set

# the point-biserial correlation between
# the total score and the first item, using
# '0' as the reference level
biserial.cor(rowSums(LSAT), LSAT[[1]])

# and using '1' as the reference level
biserial.cor(rowSums(LSAT), LSAT[[1]], level = 2)