MRCV-package: Statistical Methods for Analyzing the Association Among Two or Three MRCVs
Description
The MRCV package provides functions for analyzing the association between two or three multiple response categorical variables (MRCVs). A modified Pearson chi-square statistic can be used to test for marginal independence between two MRCVs, or a more general loglinear modeling approach can be used to compare various other structures of association among two or three MRCVs. Bootstrap- and asymptotic-based standardized residuals and model-predicted odds ratios are available, in addition to other descriptive information.
Details
Package: MRCV
Version: 0.1-0
Date: 2013-06-22
Depends: tables, R (>= 3.0.0)
LazyData: TRUE
License: GPL (>= 3)
Notation:
Define the row variable, W, the column variable, Y, and the strata variable, Z, as MRCVs with binary items (i.e., categories) Wi for i = 1, ..., I, Yj for j = 1, ..., J, and Zk for k = 1, ..., K, respectively. Also, define a marginal count as the number of subjects who responded (Wi=a, Yj=b, Zk=c), where a, b, and c each take a value in {0, 1}.
Format of Data Frame:
Many of the functions require a data frame containing the raw data structured such that n rows correspond to the individual item response vectors, and the columns correspond to the binary items, W1, ..., WI, Y1, ..., YJ, and Z1, ..., ZK (in this order). The third set of items is only necessary when analyzing the relationship among three MRCVs. Some of the functions use a summary version of the raw data frame (converted automatically without need for user action) formatted to have 2Ix2J rows (or 2Ix2Jx2K rows in the three MRCV case) and 5 columns generically named W, Y, wi, yj, and count (or 7 columns generically named W, Y, Z, wi, yj, zk, and count in the three MRCV case). The column named count contains the marginal counts defined above.
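As a rough illustration of the two layouts described above, the following base-R sketch builds a small raw data frame and derives the summary version by hand for the two MRCV case. The toy data, the object names (toy, summary_df), and the conversion code are assumptions made for illustration; this is not the package's internal conversion routine.

```r
# Hypothetical toy data: n = 8 subjects, I = 2 row items, J = 2 column items.
# Columns are ordered W1, ..., WI, Y1, ..., YJ as the raw format requires.
toy <- data.frame(
  W1 = c(1, 1, 0, 0, 1, 0, 1, 0),
  W2 = c(0, 1, 1, 0, 0, 1, 1, 0),
  Y1 = c(1, 0, 0, 1, 1, 0, 1, 0),
  Y2 = c(0, 1, 1, 0, 1, 1, 0, 0)
)
I <- 2
J <- 2

# One row per combination of (row item, column item, wi, yj).
grid <- expand.grid(
  yj = 0:1,
  wi = 0:1,
  Y  = names(toy)[(I + 1):(I + J)],
  W  = names(toy)[1:I],
  stringsAsFactors = FALSE
)

# Marginal count: number of subjects with Wi = wi and Yj = yj.
grid$count <- mapply(
  function(W, Y, wi, yj) sum(toy[[W]] == wi & toy[[Y]] == yj),
  grid$W, grid$Y, grid$wi, grid$yj,
  USE.NAMES = FALSE
)

# Reorder columns to W, Y, wi, yj, count.
summary_df <- grid[, c("W", "Y", "wi", "yj", "count")]
print(summary_df)
```

With I = 2 and J = 2, summary_df has 2I x 2J = 16 rows, and the four counts within each (W, Y) item pair sum to n = 8.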
Descriptive Functions:
Users can call the item.response.table function to obtain a cross-tabulation of the positive and negative responses for each combination of items, or the marginal.table function to obtain a cross-tabulation of only the positive responses.
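The two kinds of cross-tabulation can be sketched in base R as follows. The toy data and object names are hypothetical, and the code only mimics the quantities being tabulated; the package functions may compute and format their output differently.

```r
# Hypothetical toy data: n = 8 subjects, I = 2 row items, J = 2 column items.
toy <- data.frame(
  W1 = c(1, 1, 0, 0, 1, 0, 1, 0),
  W2 = c(0, 1, 1, 0, 0, 1, 1, 0),
  Y1 = c(1, 0, 0, 1, 1, 0, 1, 0),
  Y2 = c(0, 1, 1, 0, 1, 1, 0, 0)
)
I <- 2
J <- 2

W <- as.matrix(toy[, 1:I])
Y <- as.matrix(toy[, (I + 1):(I + J)])

# Positive responses only: entry (i, j) counts subjects with Wi = 1 and Yj = 1.
positive_counts <- crossprod(W, Y)
print(positive_counts)

# Positive and negative responses for a single item pair, here (W1, Y1).
tab_W1_Y1 <- table(W1 = toy$W1, Y1 = toy$Y1)
print(tab_W1_Y1)
```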
Functions to Test for Marginal Independence:
Methods proposed by Agresti and Liu (1999), Bilder and Loughin (2004), and Thomas and Decady (2004) are implemented using the SPMI.test function. This function calculates a modified Pearson chi-square statistic that can be used to test for simultaneous pairwise marginal independence (SPMI) between two MRCVs. The SPMI hypothesis states that each Wi is pairwise independent of each Yj. The modified statistic is the sum of the IxJ Pearson statistics used to test for independence of each (Wi, Yj) pair. The asymptotic distribution of the modified statistic is a linear combination of independent chi-square(1) random variables, so traditional methods for analyzing the association between categorical variables W and Y are inappropriate. The SPMI.test function offers three methods that can be used in conjunction with the modified statistic to construct an appropriate test for independence: a nonparametric bootstrap approach, a Rao-Scott second-order adjustment, and a Bonferroni adjustment.
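The construction of the modified statistic can be sketched in base R as below. This is a minimal illustration on hypothetical toy data, not the SPMI.test implementation: the helper pearson_2x2 and all object names are invented here, and the Bonferroni step assumes the common form in which the smallest of the IxJ chi-square(1) p-values is multiplied by IxJ and capped at 1. The bootstrap and Rao-Scott adjustments are not shown.

```r
# Hypothetical toy data: n = 8 subjects, I = 2 row items, J = 2 column items.
toy <- data.frame(
  W1 = c(1, 1, 0, 0, 1, 0, 1, 0),
  W2 = c(0, 1, 1, 0, 0, 1, 1, 0),
  Y1 = c(1, 0, 0, 1, 1, 0, 1, 0),
  Y2 = c(0, 1, 1, 0, 1, 1, 0, 0)
)
I <- 2
J <- 2

# Pearson statistic (no continuity correction) for one 2x2 table.
pearson_2x2 <- function(w, y) {
  tab <- unclass(table(factor(w, levels = 0:1), factor(y, levels = 0:1)))
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

# I x J matrix of Pearson statistics, one per (Wi, Yj) pair.
X2_ij <- sapply(1:J, function(j) {
  sapply(1:I, function(i) pearson_2x2(toy[[i]], toy[[I + j]]))
})

# Modified statistic: the sum over all item pairs.
X2_S <- sum(X2_ij)

# Bonferroni adjustment (assumed form): scale the smallest pairwise p-value.
p_ij <- pchisq(X2_ij, df = 1, lower.tail = FALSE)
p_bonf <- min(1, I * J * min(p_ij))

print(X2_ij)
print(X2_S)
print(p_bonf)
```

For this toy data the pairwise statistics are 2, 0, 2, and 2, so the modified statistic equals 6. Comparing that value to a single chi-square reference distribution would not be valid, which is why one of the three adjustments is needed.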
Functions for Performing Regression Modeling:
Regression modeling methods described by Bilder and Loughin (2007) are implemented using genloglin and methods summary.genloglin, residuals.genloglin, anova.genloglin, and predict.genloglin. The genloglin function provides parameter estimates and Rao-Scott adjusted standard errors for models involving two or three MRCVs. The anova.genloglin function offers second-order Rao-Scott and bootstrap adjusted model comparison and goodness-of-fit (Pearson and LRT) statistics. The residuals.genloglin and predict.genloglin functions provide bootstrap- and asymptotic-based standardized Pearson residuals and model-based odds ratios, respectively.
References
Agresti, A. and Liu, I.-M. (1999) Modeling a categorical variable allowing arbitrarily many category choices. Biometrics, 55, 936--943.
Bilder, C. and Loughin, T. (2004) Testing for marginal independence between two categorical variables with multiple responses. Biometrics, 60, 241--248.
Bilder, C. and Loughin, T. (2007) Modeling association between two or more categorical variables that allow for multiple category choices. Communications in Statistics--Theory and Methods, 36, 433--451.
Thomas, D. and Decady, Y. (2004) Testing for association using multiple response survey data: Approximate procedures based on the Rao-Scott approach. International Journal of Testing, 4, 43--59.