rowMsquares: Rowwise Linear Trend Test Based on Tables

Description

Given a set of matrices, each representing one group of subjects (e.g., cases and controls in a case-control study), that summarize the numbers of subjects showing the different levels of the categorical variables represented by the rows of the matrices, the value of the linear trend statistic based on Pearson's correlation coefficient and described on page 87 of Agresti (2002) is computed for each variable.

Using this function instead of rowTrendStats is in particular recommended when the total number of observations is very large.

Usage

rowMsquares(..., listTables = NULL, clScores = NULL, levScores = NULL,
   add.pval = TRUE)

Arguments

…

numeric matrices in each of which each row corresponds to a ordinal variable and each column to one of the ordered levels of these variables. Each of these matrices represents one of the groups of interest and comprises the numbers of observations showing the respective levels at the different variables. These matrices can, e.g., generated by employing rowTables. The dimensions of all matrices must be the same, and the rows and columns must represent the same variables and levels, respectively, in the same order in all matrices. The rowwise sums in a matrix are allowed to differ (which might happen if some of the observations are missing for some of the variables.)

listTables

instead of inputting the matrices directly, a list consisting of these matrices can be generated and then be used in rowMsquares by specifying listTables.

clScores

a numeric vector with one entry for each matrix specifying the score that should be assigned to the corresponding group. If NULL, clScores is set to 1:m, where \(m\) is the number of groups/matrices, such that the first input matrix (or the first entry in listTables) gets a score of 1, the second a score of 2, and so on.

levScores

a numeric vector with one score for each level of the variables.If not specified, i.e.\ NULL, the column names of the matrices are interpreted as scores.

add.pval

should p-values be added to the output? If FALSE, only the rowwise values of the linear trend test statistic will be returned. If TRUE, additionally the (raw) p-values based on an approximation to the ChiSquare-distribution with 1 degree of freedom are returned.

Value

Either a vector containing the rowwise values of the linear trend test statistic (if add.pval = FALSE), or a list containing these values (stats), and the (raw) p-values (rawp) not adjusted for multiple comparisons (if add.pval = TRUE).

Details

This is an extension of the Cochran-Armitage trend test from two to several classes. The statistic of the Cochran-Armitage trend test can be obtained by multiplying the statistic of this general linear trend test with \(n / (n - 1)\), where \(n\) is the number of observations.

References

Agresti, A.\ (2002). Categorical Data Analysis. Wiley, Hoboken, NJ. 2nd Edition.

Mantel, N.\ (1963). Chi-Square Test with one Degree of Freedom: Extensions of the Mantel-Haenszel Procedure. Journal of the American Statistical Association, 58, 690-700.

Examples

Run this code

# NOT RUN {
# Generate a matrix containing data for 10 categorical 
# variables with levels 1, 2, 3.

mat <- matrix(sample(3, 500, TRUE), 10)

# Now assume that we consider a case-control study,
# i.e. two groups, and that the first 25 columns 
# of mat correspond to cases and the remaining 25 
# columns to cases. Then a vector containing the 
# class labels is given by

cl <- rep(1:2, e=25)

# and the matrices summarizing the numbers of subjects
# showing the respective levels at the different variables
# are computed by

cases <- rowTables(mat[, cl==1])
controls <- rowTables(mat[,cl==2])

# The values of the rowwise liner trend test are
# computed by

rowMsquares(cases, controls)

# which leads to the same results as

listTab <- list(cases, controls)
rowMsquares(listTables = listTab)

# or as

rowTrendStats(mat, cl, use.n = FALSE)

# or as

out <- rowCATTs(cases, controls)
n <- ncol(mat)
out$stats * (n - 1) / n

 
# }

Run the code above in your browser using DataLab