genloglin: Model the Association Among Two or Three MRCVs

Description

The genloglin function uses a generalized loglinear modeling approach to estimate the association among two or three MRCVs. Standard errors are adjusted using a second-order Rao-Scott approach.

Usage

genloglin(data, I, J, K = NULL, model, add.constant = 0.5, boot = TRUE, 
    B = 1999, B.max = B, print.status = TRUE)
  
genloglin.fit(data, model, nvars, limit.output = FALSE, model.vars = NULL)

Arguments

data

For genloglin: A data frame containing the raw data where rows correspond to the individual item response vectors, and columns correspond to the binary items, W1, ..., WI, Y1, ..., YJ, and Z1, ..., ZK (in this order). For genloglin.f

The number of items corresponding to row variable W.

The number of items corresponding to column variable Y.

The number of items corresponding to strata variable Z.

model

For the two MRCV case, a character string specifying one of the following models: "spmi" (the complete independence model), "homogeneous" (the homogeneous association model), "w.main" (the w-main effects model),

add.constant

A positive constant to be added to all zero marginal cell counts.

boot

A logical value indicating whether bootstrap resamples should be taken.

The desired number of bootstrap resamples.

B.max

The maximum number of bootstrap resamples. Resamples for which at least one item has all positive or negative responses are thrown out; genloglin uses the first B valid resamples or all valid resamples if that number is less tha

print.status

A logical value indicating whether progress updates should be provided. When print.status = TRUE, the status of the IPF algorithm is printed after every 5 iterations. Upon completion of the IPF algorithm, a progress bar appears that documen

nvars

The number of MRCVs.

limit.output

A logical value indicating whether only partial model fit information should be returned. Used solely for programming convenience when performing the bootstrap.

model.vars

For observed data: model.vars = NULL. For bootstrap resamples: For the two MRCV case, a data frame containing 2Ix2J rows and 4 columns generically named W, Y, wi, and yj. For the three M

Value

--- genloglin returns an object of class 'genloglin'. The object is a list containing at least the following objects: original.arg, mod.fit, sum.fit, and rs.results. original.arg is a list containing the following objects:
- data:
{The original data frame supplied to the data argument.}
I:
{The original value supplied to the I argument.}
J:
{The original value supplied to the J argument.}
K:
{The original value supplied to the K argument.}
nvars:
{The number of MRCVs.}
model:
{The original value supplied to the model argument.}
add.constant:
{The original value supplied to the add.constant argument.}
boot:
{The original value supplied to the boot argument.}

code

limit.output = TRUE

itemize

B.use:

item

E:
gamma:
B.discard:
model.data.star:
mod.fit.star:
chisq.star:
lrt.star:
residual.star:

Details

The genloglin function first converts the raw data into a form that can be used for estimation. For the two MRCV case, the reformatted data frame contains 2Ix2J rows and 5 columns generically named W, Y, wi, yj, and count. For the three MRCV case, the reformatted data frame contains 2Ix2Jx2K rows and 7 columns generically named W, Y, Z, wi, yj, zk, and count. Then, the model of interest is estimated by calling the genloglin.fit function which in turn calls the glm function where the family argument is specified as poisson. For all predictor variables, the first level is the reference group (i.e., 1 is the reference for variables W, Y, and Z, and 0 is the reference for variables wi, yj, and zj). Because the model is fit to the marginal counts and the marginal counts do not actually follow a multinomial distribution, standard errors and model fit information need to be adjusted. For this reason, the genloglin.fit function is not normally called directly. The boot argument must equal TRUE in order to obtain bootstrap results with the genloglin method functions.

References

Bilder, C. and Loughin, T. (2007) Modeling association between two or more categorical variables that allow for multiple category choices. Communications in Statistics--Theory and Methods, 36, 433--451.

Examples

Run this code

# Estimate the y-main effects model for 2 MRCVs
mod.fit <- genloglin(data = farmer2, I = 3, J = 4, model = "y.main", boot = FALSE)
# Summarize model fit information
summary(mod.fit)
# Examine standardized Pearson residuals
residuals(mod.fit)
# Compare the y-main effects model to the saturated model
anova(mod.fit, model.HA = "saturated", type = "rs2")
# Obtain observed and model-predicted odds ratios
predict(mod.fit)

# Estimate a model that is not one of the named models
# Note that this was the final model chosen by Bilder and Loughin (2007)
mod.fit.other <- genloglin(data = farmer2, I = 3, J = 4, model = count ~ -1 + W:Y + 
    wi%in%W:Y + yj%in%W:Y + wi:yj + wi:yj%in%Y + wi:yj%in%W3:Y1, boot = 
    FALSE)
# Compare this model to the y-main effects model
anova(mod.fit, model.HA = count ~ -1 + W:Y + wi%in%W:Y + yj%in%W:Y + wi:yj + 
    wi:yj%in%Y + wi:yj%in%W3:Y1, type = "rs2", gof = TRUE)

# To obtain bootstrap results from the method functions the genloglin() boot 
# argument must be specified as TRUE (the default)
# A small B is used for demonstration purposes; normally, a larger B should be used
mod.fit <- genloglin(data = farmer2, I = 3, J = 4, model = "y.main", boot = TRUE, 
    B = 99)
residuals(mod.fit)
anova(mod.fit, model.HA = "saturated", type = "all")
predict(mod.fit)

# Estimate a model for 3 MRCVs
mod.fit.three <- genloglin(data = farmer3, I = 3, J = 4, K = 5, model = count ~ 
    -1 + W:Y:Z + wi%in%W:Y:Z + yj%in%W:Y:Z + zk%in%W:Y:Z + wi:yj + 
    wi:yj%in%Y + wi:yj%in%W + wi:yj%in%Y:W + yj:zk + yj:zk%in%Z + 
    yj:zk%in%Y + yj:zk%in%Z:Y, boot = TRUE, B = 99)
residuals(mod.fit.three)
anova(mod.fit.three, model.HA = "saturated", type = "all")
predict(mod.fit.three, pair = "WY")

Run the code above in your browser using DataLab