scrime (version 1.3.5)

pcc: Pearson's Contingency Coefficient

Description

Computes the values of (the corrected) Pearson's contingency coefficient for all pairs of rows of a matrix.

Usage

pcc(x, dist = FALSE, corrected = TRUE, version = 1)

Arguments

x

a numeric matrix consisting of integers between 1 and \(n_{cat}\), where \(n_{cat}\) is the maximum number of levels a variable in x can take.

dist

should the distance based on Pearson's contingency coefficient be computed? For how this distance is computed, see version.

corrected

should Pearson's contingency coefficient be corrected such that it can take values between 0 and 1? If not corrected, it takes values between 0 and \(\sqrt{(a - 1) / a}\), where \(a\) is the minimum of the numbers of levels that the respective two variables can take. Must be set to TRUE if dist = TRUE.
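To make the effect of corrected concrete, the following is a minimal base-R sketch for a single pair of variables. It assumes the textbook definition \(Cont = \sqrt{\chi^2 / (\chi^2 + n)}\) and a correction that divides by the upper bound \(\sqrt{(a - 1) / a}\). The helper pcc_pair is hypothetical and not part of scrime, and it takes \(a\) from the levels actually observed, which may differ from how pcc determines it.

```r
# Hypothetical helper (not part of scrime): Pearson's contingency
# coefficient for two categorical vectors u and v of equal length.
pcc_pair <- function(u, v, corrected = TRUE) {
  tab <- table(u, v)                                 # observed counts
  n <- sum(tab)                                      # number of observations
  expected <- outer(rowSums(tab), colSums(tab)) / n  # counts under independence
  chisq <- sum((tab - expected)^2 / expected)        # Pearson's chi-square statistic
  cont <- sqrt(chisq / (chisq + n))                  # uncorrected coefficient
  if (corrected) {
    # Assumption: a is the smaller number of OBSERVED levels.
    # A variable with a single level (a = 1) is not handled here.
    a <- min(dim(tab))
    cont <- cont / sqrt((a - 1) / a)                 # rescale to [0, 1]
  }
  cont
}

u <- c(1, 2, 3, 1, 2, 3)
pcc_pair(u, u)                     # perfect association: 1
pcc_pair(u, u, corrected = FALSE)  # upper bound for a = 3: sqrt(2 / 3)
```

For two perfectly associated three-level variables the uncorrected value stops at \(\sqrt{2/3}\), while the corrected value reaches 1.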

version

a numeric value -- either 1, 2, or 3 -- specifying how the distance is computed. Ignored if dist = FALSE. If 1, \(\sqrt{1 - Cont^2}\) is computed, where \(Cont\) denotes Pearson's contingency coefficient. If 2, \(1 - Cont\) is determined, and if 3, \(1 - Cont^2\) is returned.
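The three distance variants can be compared side by side with a small sketch. The helper cont_to_dist is hypothetical and not part of scrime; it only applies the three formulas listed for version to a given coefficient value.

```r
# Hypothetical helper (not part of scrime): turn a (corrected)
# contingency coefficient into a distance, following the three
# formulas listed for the version argument.
cont_to_dist <- function(cont, version = 1) {
  switch(as.character(version),
    "1" = sqrt(1 - cont^2),
    "2" = 1 - cont,
    "3" = 1 - cont^2,
    stop("version must be 1, 2, or 3")
  )
}

cont <- 0.6
cont_to_dist(cont, 1)  # sqrt(1 - 0.36) = 0.8
cont_to_dist(cont, 2)  # 1 - 0.6 = 0.4
cont_to_dist(cont, 3)  # 1 - 0.36 = 0.64
```

All three variants map a coefficient of 1 to a distance of 0 and a coefficient of 0 to a distance of 1; they differ only in how intermediate values are spread.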

Value

A square matrix with nrow(x) rows and columns containing the values of (or, if dist = TRUE, the distances based on) the (corrected) Pearson's contingency coefficient for all pairs of rows of x.

See Also

smc

Examples

# NOT RUN {
# Generate a data set consisting of 10 rows and 200 columns,
# where the values are randomly drawn from the integers 1, 2, and 3.

mat <- matrix(sample(3, 2000, TRUE), 10)

# For each pair of rows of mat, the value of the corrected Pearson's 
# contingency coefficient is then obtained by

out1 <- pcc(mat)
out1

# and the distances based on this coefficient by

out2 <- pcc(mat, dist = TRUE)
out2

# Note that with version = 1 (the default), the distances equal sqrt(1 - out1^2):

all.equal(sqrt(1 - out1^2), out2)

# }
