scrime (version 1.2.9)

rowChisqStats: Rowwise Pearson's ChiSquare Statistic

Description

Computes for each row of a matrix the value of Pearson's ChiSquare statistic for testing if the corresponding categorical variable is associated with a (categorical) response, or determines for each pair of rows of a matrix the value of Pearson's ChiSquare statistic for testing if the two corresponding variables are independent.
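To make the first use case concrete, the following base R sketch computes Pearson's statistic for a single categorical variable against a response by hand and compares it with chisq.test. The helper pearson_stat is hypothetical and only illustrates the quantity being computed; it is not how scrime computes it internally.

```r
# Base R sketch (not scrime's implementation): Pearson's chi-square
# statistic for one categorical variable x against class labels cl.
pearson_stat <- function(x, cl) {  # hypothetical helper
  observed <- table(x, cl)
  # Expected counts under independence: row total * column total / n
  expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
  sum((observed - expected)^2 / expected)
}

set.seed(1)
x  <- sample(3, 200, replace = TRUE)  # one variable with levels 1, 2, 3
cl <- rep(1:2, each = 100)            # two classes of 100 observations

stat_manual <- pearson_stat(x, cl)
stat_ref    <- unname(chisq.test(x, cl, correct = FALSE)$statistic)

all.equal(stat_manual, stat_ref)
```

Applying such a helper to every row of a matrix with apply() would give the rowwise statistics, although presumably much more slowly than a dedicated routine.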

Usage

rowChisqStats(data, cl, compPval = TRUE, asMatrix = TRUE)

Arguments

data
a numeric matrix consisting of the integers between 1 and $n_{cat}$, where $n_{cat}$ is the maximum number of levels the categorical variables can take. Each row of data must correspond to a variable, each column to an observation.
cl
a numeric vector of length ncol(data) containing the class labels for the observations represented by the columns of data. The class labels must be coded by the integers between 1 and $n_{cl}$, where $n_{cl}$ is the n
compPval
should the p-value (based on the approximation to a $\chi^2$-distribution) also be computed?
asMatrix
should the pairwise test scores be returned as matrix? Ignored if cl is specified. If TRUE, a matrix with $m$ rows and columns is returned that contains the values of Pearson's $\chi^2$-statistic in its lower trian
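As an illustration of the pairwise case, the sketch below fills the lower triangle of an $m \times m$ matrix with Pearson's statistic for each pair of rows, using chisq.test from base R. The helper pairwise_chisq is hypothetical. Leaving the remaining cells as NA and extracting the vector form with lower.tri are assumptions made for this sketch, not documented behaviour of rowChisqStats.

```r
# Hypothetical helper: Pearson's chi-square statistic for every pair of rows,
# stored in the lower triangle of an m x m matrix (other cells left as NA).
pairwise_chisq <- function(data) {
  m <- nrow(data)
  out <- matrix(NA_real_, m, m,
                dimnames = list(rownames(data), rownames(data)))
  for (i in seq_len(m)[-1]) {
    for (j in seq_len(i - 1)) {
      test <- chisq.test(data[i, ], data[j, ], correct = FALSE)
      out[i, j] <- unname(test$statistic)
    }
  }
  out
}

set.seed(1)
mat <- matrix(sample(3, 1000, replace = TRUE), nrow = 5)
rownames(mat) <- paste0("SNP", 1:5)

stats_mat <- pairwise_chisq(mat)               # matrix form
stats_vec <- stats_mat[lower.tri(stats_mat)]   # one value per pair of rows
```

With 5 rows there are choose(5, 2) = 10 pairs, so stats_vec has 10 entries.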

Value

If compPval = FALSE, a vector (or matrix if cl is not specified and asMatrix = TRUE) composed of the values of Pearson's $\chi^2$-statistic. Otherwise, a list consisting of

  • stats: a vector (or matrix) containing the values of Pearson's $\chi^2$-statistic.
  • df: a vector (or matrix) comprising the degrees of freedom of the asymptotic $\chi^2$-distribution.
  • rawp: a vector (or matrix) containing the (unadjusted) p-values.
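The relationship between the three list components can be sketched in base R for a single variable. The degrees of freedom are taken here as (number of observed levels - 1) times (number of classes - 1), and the p-value as the upper tail of the asymptotic $\chi^2$-distribution. Whether scrime derives the degrees of freedom from the observed levels or from $n_{cat}$ is not stated above, so treat that part as an assumption.

```r
set.seed(1)
x  <- sample(3, 200, replace = TRUE)  # one variable with levels 1, 2, 3
cl <- rep(1:2, each = 100)            # two classes

tab  <- table(x, cl)
ref  <- chisq.test(tab, correct = FALSE)

stat <- unname(ref$statistic)                       # analogue of stats
df   <- (nrow(tab) - 1) * (ncol(tab) - 1)           # analogue of df
rawp <- pchisq(stat, df = df, lower.tail = FALSE)   # analogue of rawp

all.equal(rawp, ref$p.value)
df == unname(ref$parameter)
```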

References

Schwender, H. (2007). A Note on the Simultaneous Computation of Thousands of Pearson's $\chi^2$-Statistics. Technical Report, SFB 475, Department of Statistics, University of Dortmund.

See Also

computeContCells, computeContClass

Examples

# Generate an example data set consisting of 5 rows (variables)
# and 200 columns (observations) by randomly drawing integers 
# between 1 and 3.

mat <- matrix(sample(3, 1000, TRUE), 5)
rownames(mat) <- paste("SNP", 1:5, sep = "")

# For each pair of rows of mat, test if they are independent.

r1 <- rowChisqStats(mat)

# The values of Pearson's ChiSquare statistic as matrix.

r1$stats

# And the corresponding (unadjusted) p-values.

r1$rawp

# Obtain only the values of the test statistic as vector

rowChisqStats(mat, compPval = FALSE, asMatrix = FALSE)


# Generate an example data set consisting of 10 rows (variables)
# and 200 columns (observations) by randomly drawing integers 
# between 1 and 3, and a vector of class labels of length 200
# indicating that the first 100 observations belong to class 1
# and the other 100 to class 2. 

mat2 <- matrix(sample(3, 2000, TRUE), 10)
cl <- rep(1:2, each = 100)

# For each row of mat2, test if it is associated with cl.

r2 <- rowChisqStats(mat2, cl)
r2$stats

# The results are identical to those of chisq.test
pv <- stat <- numeric(10)
for(i in 1:10){
    tmp <- chisq.test(mat2[i,], cl)
    pv[i] <- tmp$p.value
    stat[i] <- tmp$statistic
}

all.equal(r2$stats, stat)
all.equal(r2$rawp, pv)