psych (version 2.1.9)

phi: Find the phi coefficient of correlation between two dichotomous variables


Given a 1 x 4 vector or a 2 x 2 matrix of frequencies, find the phi coefficient of correlation. Typical use is in the case of predicting a dichotomous criterion from a dichotomous predictor.


phi(t, digits = 2)



t: a 1 x 4 vector or a 2 x 2 matrix of frequencies


digits: number of decimal places to which the result is rounded


phi coefficient of correlation


In many prediction situations, a dichotomous predictor (accept/reject) is validated against a dichotomous criterion (success/failure). The tetrachoric correlation estimates the underlying Pearson correlation as if both the predictor and the criterion were dichotomized continuous, bivariate normal variables, and the polychoric correlation does the same for variables with more than two ordered categories. The phi coefficient, by contrast, is simply the Pearson correlation applied to a matrix of 0s and 1s.

The phi coefficient was first reported by Yule (1912), but should not be confused with the Yule Q coefficient.

For a very useful discussion of various measures of association given a 2 x 2 table, and why one should probably prefer the Yule Q coefficient, see Warrens (2008).

Given a 2 x 2 table of counts

a          b          a+b (R1)
c          d          c+d (R2)
a+c (C1)   b+d (C2)   a+b+c+d (N)

convert all counts to fractions of the total, so that a + b + c + d = 1. Then

Phi = [a - (a+b)(a+c)] / sqrt((a+b)(c+d)(a+c)(b+d)) = (a - R1*C1) / sqrt(R1*R2*C1*C2)
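The formula above can be sketched in base R as follows. The helper name phi_by_hand and the counts are illustrative assumptions, not part of psych; the sketch converts the counts to fractions and then combines the row and column totals.

```r
# Hypothetical helper (not part of psych): phi from a 2 x 2 table of counts.
phi_by_hand <- function(tab) {
  p  <- tab / sum(tab)   # convert counts to fractions of the total
  R1 <- sum(p[1, ])      # a + b
  R2 <- sum(p[2, ])      # c + d
  C1 <- sum(p[, 1])      # a + c
  C2 <- sum(p[, 2])      # b + d
  (p[1, 1] - R1 * C1) / sqrt(R1 * R2 * C1 * C2)
}

# Illustrative counts: a = 40, b = 20, c = 5, d = 20 (matrix() fills by column)
tab <- matrix(c(40, 5, 20, 20), ncol = 2)
phi_value <- phi_by_hand(tab)
round(phi_value, 2)      # 0.43
```

Because the row and column totals are computed from the fractions, the same function works whether it is given raw counts or proportions.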

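This is in contrast to the Yule coefficient, Q, where Q = (ad - bc)/(ad + bc). With the cells expressed as fractions of the total, ad - bc = a - (a+b)(a+c), so Q is the same as [a - (a+b)(a+c)]/(ad + bc): it shares its numerator with Phi and differs only in the denominator.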
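As a small numerical check of the relationship between the two numerators, the following base R sketch computes Q for an illustrative table (the counts and variable names are assumptions chosen for the example) and confirms that a - (a+b)(a+c) equals ad - bc once the cells are fractions.

```r
# Illustrative counts (not from any real study), as fractions of the total
p <- matrix(c(40, 5, 20, 20), ncol = 2)
p <- p / sum(p)
pa <- p[1, 1]; pb <- p[1, 2]; pc <- p[2, 1]; pd <- p[2, 2]

Q <- (pa * pd - pb * pc) / (pa * pd + pb * pc)   # Yule's Q

# The numerator a - (a+b)(a+c) equals ad - bc when a + b + c + d = 1
num_phi <- pa - (pa + pb) * (pa + pc)
same_numerator <- isTRUE(all.equal(num_phi, pa * pd - pb * pc))

round(Q, 2)       # 0.78
same_numerator    # TRUE
```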

Since the phi coefficient is just a Pearson correlation applied to dichotomous data, a matrix of phis for a data set can be found simply by computing the correlations with cor, lowerCor, or corr.test.
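A minimal sketch of this, using only base R's cor: the data frame below is an assumption built for illustration by expanding a 2 x 2 table of counts into one row per observation, so the off-diagonal correlation can be compared with the phi of that table.

```r
# Illustrative counts expanded into one row per observation:
# 40 cases with x = 1, y = 1; 20 with x = 1, y = 0;
#  5 cases with x = 0, y = 1; 20 with x = 0, y = 0
counts <- c(40, 20, 5, 20)
dat <- data.frame(
  x = rep(c(1, 1, 0, 0), times = counts),
  y = rep(c(1, 0, 1, 0), times = counts)
)

phi_matrix <- cor(dat)   # Pearson correlations of 0/1 columns are phi coefficients
round(phi_matrix, 2)     # off-diagonal entries are 0.43
```

With more than two dichotomous columns, the same call returns the full matrix of pairwise phi coefficients.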


Warrens, Matthijs (2008). On association coefficients for 2x2 tables and properties that do not depend on the marginal distributions. Psychometrika, 73, 777-789.

Yule, G.U. (1912). On the methods of measuring the association between two attributes. Journal of the Royal Statistical Society, 75, 579-652.

See Also

phi2tetra, AUC, Yule, Yule.inv, Yule2phi, comorbidity, tetrachoric and polychoric


Examples
library(psych)
x <- matrix(c(40,5,20,20),ncol=2)
phi(x)  # phi coefficient for the 2 x 2 table of counts
