unidiff: Fitting Log-Multiplicative Uniform Difference/Layer Effect Model

Description

Fit the log-multiplicative uniform difference model (UNIDIFF, see Erikson & Goldthorpe, 1992), also called the log-multiplicative layer effect model (Xie, 1992). For square tables, diagonal cells can be handled separately.

Usage

unidiff(tab, diagonal = c("included", "excluded", "only"),
        constrain = "auto",
        weighting = c("marginal", "uniform", "none"), norm = 2,
        family = poisson,
        tolerance = 1e-8, iterMax = 5000, eliminate = NULL,
        trace = FALSE, verbose = TRUE,
        checkEstimability = TRUE, ...)

Arguments

tab

a three-way table, or an object (such as a matrix) that can be coerced into a table; if present, dimensions above three will be collapsed as appropriate.

diagonal

"included" fits the standard model with full two-way interaction; "excluded" adds to this model diagonal-specific parameters for each layer, effectively removing the influence of diagonal cells on the layer coefficients; "only" fits a model without the full two-way interaction, where only diagonal parameters are affected by the layer effect (see “Details” below).

constrain

(non-eliminated) coefficients to constrain, specified by a regular expression, a numeric vector of indices, a logical vector, a character vector of names, or "[?]" to select from a Tk dialog. The default constrains to 0 the first layer parameter and interaction coefficients for the first row and column of the table.

weighting

what weights should be used when normalizing coefficients. This does not affect layer coefficients, which are set to 1 for the first layer, but only two-way interaction coefficients and layer association levels, which are layer coefficients times the intrinsic association coefficient (see maor) for the first layer.

norm

the norm to use to compute the mean absolute odds ratio (see maor).

family

a specification of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. See family for details of family functions.

tolerance

a positive numeric value specifying the tolerance level for convergence; higher values will speed up the fitting process, but beware of numerical instability of estimated scores!

iterMax

a positive integer specifying the maximum number of main iterations to perform; consider raising this value if your model does not converge.

eliminate

either NULL (the default) to estimate all parameters, NA to skip the estimation of some parameters for increased efficiency, or the name of a factor to be passed as gnm's corresponding argument.

trace

a logical value indicating whether the deviance should be printed after each iteration.

verbose

a logical value indicating whether progress indicators should be printed, including a diagnostic error message if the algorithm restarts.

checkEstimability

a logical value indicating whether the estimability of the contrasts should be checked via checkEstimable. Disabling this check can improve performance for large models.

…

more arguments to be passed to gnm
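
As an illustration of the requirement on tab described above, the following base R sketch builds a small three-way table and moves the layer variable to the third dimension with aperm. The counts, the dimension names (origin, destination, period) and the object names are made up for the example; they are not part of the package.

```r
# Hypothetical toy data: 3 origin classes x 3 destination classes x 2 periods.
set.seed(1)
counts <- array(rpois(18, lambda = 50), dim = c(3, 3, 2),
                dimnames = list(origin = c("A", "B", "C"),
                                destination = c("A", "B", "C"),
                                period = c("t1", "t2")))
tab <- as.table(counts)

# Suppose the data arrived with the layer variable first instead.
layer_first <- aperm(tab, c(3, 1, 2))

# Move the layer (period) back to the third dimension, as expected for tab.
tab_again <- aperm(layer_first, c(2, 3, 1))

dim(tab_again)
names(dimnames(tab_again))
```

A table arranged this way has rows and columns in the first two dimensions and layers in the third, which is the layout the tab argument expects.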

Value

A unidiff object, with all the components of a gnm object, plus a unidiff component holding the most relevant information:

layer

a qvcalc object holding the (log) layer coefficients, their standard errors and quasi-standard errors.

phi

the value of the intrinsic association coefficient (see maor) for each layer.

maor

the value of the mean absolute odds ratio (see maor) for each layer.

interaction

a data frame object holding the two-way interaction coefficients, and their standard errors.

diagonal

the value of the diagonal argument above.

weighting

the value of the weighting argument above.
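
The components listed above can be inspected directly once a model has been fitted. The sketch below assumes that the logmult package (which provides unidiff and the yaish data) is installed, and does nothing otherwise; it only uses the component names documented in this section.

```r
# Runs only if the logmult package is available (assumption of this sketch).
if (requireNamespace("logmult", quietly = TRUE)) {
  data(yaish, package = "logmult")

  # Same preparation as in the Examples section: drop the last layer,
  # then put the layer variable in the third dimension.
  tab <- aperm(yaish[, , -7], 3:1)

  model <- logmult::unidiff(tab, verbose = FALSE)

  # Components of the unidiff element described above.
  print(model$unidiff$layer)        # (log) layer coefficients with (quasi-)SEs
  print(model$unidiff$phi)          # intrinsic association coefficient per layer
  print(model$unidiff$maor)         # mean absolute odds ratio per layer
  print(head(model$unidiff$interaction))  # two-way interaction coefficients
}
```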

Details

The equation of the fitted model is: $$ \log F_{ijk} = \lambda + \lambda^I_i + \lambda^J_j + \lambda^K_k + \lambda^{IK}_{ik} + \lambda^{JK}_{jk} + \phi_k \psi^{IJ}_{ij} $$ where $F_{ijk}$ is the expected frequency for the cell at the intersection of row i, column j and layer k of tab. When diagonal = "excluded", $\lambda^{IJK}_{ijk}$ parameters are added but set to 0 when $i \neq j$ (off-diagonal). When diagonal = "only", $\psi^{IJ}_{ij}$ is set to 0 when $i \neq j$.
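
To make the role of $\phi_k$ concrete, here is a small base R sketch that evaluates the equation above for made-up parameter values (3 rows, 3 columns, 2 layers) and checks that a log odds ratio in layer k equals $\phi_k$ times the corresponding log odds ratio computed from $\psi^{IJ}$. All numbers and object names are hypothetical, and $\phi_k$ is taken as it appears in the equation.

```r
I <- 3; J <- 3; K <- 2

# Hypothetical parameter values, chosen only for illustration.
lambda    <- 2
lambda_I  <- c(0, 0.3, -0.2)
lambda_J  <- c(0, -0.1, 0.4)
lambda_K  <- c(0, 0.5)
lambda_IK <- matrix(c(0, 0, 0,  0, 0.2, -0.1), nrow = I, ncol = K)
lambda_JK <- matrix(c(0, 0, 0,  0, -0.3, 0.1), nrow = J, ncol = K)
psi <- matrix(c( 0.8, -0.2, -0.6,
                -0.2,  0.5, -0.3,
                -0.6, -0.3,  0.9), nrow = I, byrow = TRUE)
phi <- c(1, 0.6)

# Evaluate log F_ijk term by term, following the model equation.
logF <- array(NA_real_, dim = c(I, J, K))
for (i in 1:I) for (j in 1:J) for (k in 1:K) {
  logF[i, j, k] <- lambda + lambda_I[i] + lambda_J[j] + lambda_K[k] +
    lambda_IK[i, k] + lambda_JK[j, k] + phi[k] * psi[i, j]
}

# Log odds ratio for rows 1-2 and columns 1-2 of a matrix of log values.
log_or <- function(m) m[1, 1] + m[2, 2] - m[1, 2] - m[2, 1]

lor_layers <- apply(logF, 3, log_or)  # one log odds ratio per layer
lor_psi    <- log_or(psi)             # same contrast on the psi matrix

# Margins and two-way margin-by-layer terms cancel out of the odds ratio.
all.equal(lor_layers, phi * lor_psi)
```

Because all one-way and margin-by-layer terms cancel in an odds ratio, the whole pattern of association in each layer is that of $\psi^{IJ}$, uniformly stretched or shrunk by $\phi_k$.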

Layer coefficients $\phi_k$ are internally exponentiated in the gnm formula, which means the reported values are on the log scale, with reference 0 for the first layer. Interaction coefficients use the “sum” contrast, also known as “effect” coding, except when diagonal is different from "included", in which case the “treatment” contrast (a.k.a. “reference” or “dummy” coding) is used.
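
The following base R sketch illustrates the two points of this paragraph with made-up numbers: converting log-scale layer coefficients to the multiplicative scale, and the difference between the “sum” and “treatment” codings as produced by the standard contr.sum and contr.treatment functions. The coefficient values and names are hypothetical.

```r
# Hypothetical layer coefficients on the log scale (first layer is the reference).
log_phi <- c(L1 = 0, L2 = -0.25, L3 = -0.40)

# Exponentiating gives multiplicative coefficients, equal to 1 for the first layer.
phi_scale <- exp(log_phi)
phi_scale

# "Sum" (effect) coding: each column of the contrast matrix sums to zero,
# so coefficients are deviations from an overall mean.
contr.sum(3)

# "Treatment" (reference, dummy) coding: the first level is the baseline,
# so its row is all zeros.
contr.treatment(3)
```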

Actual model fitting is performed using gnm, which implements the Newton-Raphson algorithm. This function simply allows for direct identification of the log-multiplicative parameters by setting the appropriate constraints, and improves performance by eliminating less interesting coefficients.

References

Erikson, R., and Goldthorpe, J.H. (1992). The Constant Flux: A Study of Class Mobility in Industrial Societies. Oxford: Clarendon Press. Ch. 3.

Xie, Yu (1992). The Log-Multiplicative Layer Effect Model for Comparing Mobility Tables. Am. Sociol. Rev. 57(3):380-395.

Yaish, M. (1998). Opportunities, Little Change. Class Mobility in Israeli Society, 1974-1991. Ph.D. thesis, Nuffield College, University of Oxford.

Yaish, M. (2004). Class Mobility Trends in Israeli Society, 1974-1991. Lewiston: Edwin Mellen Press.

Examples

# NOT RUN {
  ## Yaish (1998, 2004)
  data(yaish)

  # Last layer omitted because of low frequencies
  yaish <- yaish[,,-7]

  # Layer (education) must be the third dimension
  yaish <- aperm(yaish, 3:1)

  model <- unidiff(yaish)

  model
  summary(model)
  plot(model)

  
# }
