loglinear: Loglinear Smoothing

Description

This function smooths univariate and bivariate score distributions via polynomial loglinear modeling (Holland & Thayer, 2000; Moses & von Davier, 2006).

Usage

loglinear(x, scorefun, degree, raw = TRUE, convergecrit = .0001, verbose = TRUE, ...)

Arguments

x

score distribution of class freqtab. Under the random groups design (i.e., no anchor test) x will contain the score scale in column 1, and the number of examinees obtaining eac

scorefun

matrix of score functions, where each column represents a transformation of the score scale (or the crossed score scales, in the bivariate case)

degree

integer indicating a maximum polynomial transformation to be computed (passed to poly; ignored if scorefun is provided)

raw

logical. If TRUE (default), raw polynomials are used; if FALSE, orthogonal polynomials are used (passed to poly)

convergecrit

convergence criterion used in maximum likelihood estimation (default is .0001)

verbose

logical, with default TRUE, indicating whether or not output in addition to the fitted values should be returned

...

further arguments passed to or from other methods

Value

Returns a list including the following components:

modelfit

table of model fit statistics: likelihood ratio chi-square, Pearson chi-square, Freeman-Tukey chi-square, AIC, and CAIC

rawbetas

two-column matrix of raw maximum likelihood estimates for the beta coefficients and corresponding standard errors

alpha

normalizing constant

iterations

number of iterations reached before convergence

fitted.values

vector of estimated frequencies

residuals

vector of residuals

cmatrix

the C matrix, a factorization of the covariance matrix of the fitted values

scorefun

matrix of score functions
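
As a rough illustration of the chi-square statistics reported in modelfit, the sketch below computes them from observed and fitted frequencies using base R only. It assumes the textbook definitions of the likelihood ratio, Pearson, and Freeman-Tukey statistics, and it uses glm() with a Poisson family as a stand-in for loglinear(); the values returned by loglinear() itself may be computed or rounded differently. AIC and CAIC are left out because their definitions vary across sources.

```r
# Observed frequencies on a 0 to 20 score scale (the univariate example
# data from the Examples section)
x <- 0:20
n <- c(0, 4, 11, 16, 18, 34, 63, 89, 87, 129, 124,
  154, 125, 131, 109, 98, 89, 66, 54, 37, 17)

# Stand-in fit (assumption): a Poisson loglinear model with raw
# polynomial terms up to degree 3
fit <- glm(n ~ poly(x, 3, raw = TRUE), family = poisson)
m <- fitted(fit)

# Likelihood ratio chi-square; cells with n = 0 contribute 0
G2 <- 2 * sum(ifelse(n > 0, n * log(n / m), 0))

# Pearson chi-square
X2 <- sum((n - m)^2 / m)

# Freeman-Tukey chi-square
FT <- sum((sqrt(n) + sqrt(n + 1) - sqrt(4 * m + 1))^2)

round(c(G2 = G2, X2 = X2, FT = FT), 3)
```

All three statistics compare the observed frequencies n with the fitted frequencies m, so smaller values indicate a closer fit.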

Details

Loglinear smoothing is a flexible procedure for reducing irregularities in a raw score distribution. The loglinear function fits a polynomial loglinear model to a distribution of scores, where the degree of each polynomial term determines the specific moment of the raw distribution that is preserved in the fitted distribution (see the Examples section).

scorefun must contain at least one score function of the scale score values. While there is no explicit limit on the number of columns in scorefun, models with more than ten columns may not converge on a solution.

Specifying degree is an alternative to scorefun. It takes a maximum polynomial degree and constructs the score functions accordingly. For example, degree=3 will result in a model with three terms: the score scale raised to the first, second, and third powers. In the bivariate case, powers 1 through 3 would be included for each variable.

Maximum likelihood estimates are obtained using a Newton-Raphson algorithm, with slightly smoothed frequencies (all nonzero) as the basis for starting values. Calculating standard errors for these estimates requires matrix inversion, which may not be possible for complex models. In this case the standard errors will be omitted. The tolerance level for detecting singularity may be modified with the argument tol, which is passed to solve().

For a detailed description of the estimation procedures, including examples, see Holland and Thayer (1987, 2000). For a more recent discussion, including the macro after which the loglinear function is modeled, see Moses and von Davier (2006).
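
The estimation described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used by loglinear(): the smoothing weights 0.8 and 0.2, the stopping rule tol, and the helper names probs and wmat are all hypothetical choices made for the example. It builds the score functions implied by degree = 3, runs Newton-Raphson iterations from starting values based on slightly smoothed frequencies, and then compares the first three moments of the observed and fitted distributions.

```r
# Observed frequencies on a 0 to 20 score scale (the univariate example
# data from the Examples section)
x <- 0:20
n <- c(0, 4, 11, 16, 18, 34, 63, 89, 87, 129, 124,
  154, 125, 131, 109, 98, 89, 66, 54, 37, 17)
N <- sum(n)

# Score functions implied by degree = 3: raw powers 1 to 3 of the scale,
# the same values as poly(x, 3, raw = TRUE)
B <- outer(x, 1:3, "^")

# Fitted probabilities for a given beta; dividing by the sum plays the
# role of the normalizing constant alpha
probs <- function(beta) {
  eta <- drop(B %*% beta)
  w <- exp(eta - max(eta))
  w / sum(w)
}

# Weight matrix D_p - p p', used for the starting values and the Hessian
wmat <- function(p) diag(p) - tcrossprod(p)

# Starting values (assumed weights 0.8 and 0.2): weighted least squares
# fit to the log of slightly smoothed, all nonzero proportions
a <- 0.8 * n / N + 0.2 / length(n)
beta <- drop(solve(t(B) %*% wmat(a) %*% B, t(B) %*% wmat(a) %*% log(a)))

# Newton-Raphson: stop when fitted and observed moments agree to within
# a relative tolerance (assumed stopping rule)
tol <- 1e-10
target <- colSums(B * n)
for (iter in 1:50) {
  p <- probs(beta)
  grad <- drop(t(B) %*% (n - N * p))   # gradient of the log-likelihood
  if (max(abs(grad) / target) < tol) break
  info <- N * t(B) %*% wmat(p) %*% B   # negative Hessian
  beta <- beta + drop(solve(info, grad))
}
mhat <- N * probs(beta)

# First three raw moments, observed versus fitted
rbind(observed = colSums(B * n) / N, fitted = colSums(B * mhat) / N)
```

The likelihood equations set the fitted mean of each score function equal to its observed mean, which is why a model with powers 1 through 3 preserves the first three moments. The two rows printed by rbind() should therefore agree up to the stopping tolerance.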

References

Holland, P. W., and Thayer, D. T. (1987). Notes on the use of log-linear models for fitting discrete probability distributions (PSR Technical Rep. No. 87-79; ETS RR-87-31). Princeton, NJ: ETS.

Holland, P. W., and Thayer, D. T. (2000). Univariate and bivariate loglinear models for discrete test score distributions. Journal of Educational and Behavioral Statistics, 25, 133-183.

Moses, T. P., and von Davier, A. A. (2006). A SAS macro for loglinear smoothing: Applications and implications (ETS Research Rep. No. RR-06-05). Princeton, NJ: ETS.

Wang, T. (2009). Standard errors of equating for the percentile rank-based equipercentile equating with log-linear presmoothing. Journal of Educational and Behavioral Statistics, 34, 7-23.

Examples

set.seed(2010)
x <- round(rnorm(1000, 100, 15))
xscale <- 50:150

# smooth x preserving first 3 moments:
xtab <- freqtab(xscale, x)
xlog1 <- loglinear(xtab, degree = 3)
plot(xtab, cbind(xscale, xlog1$fit), col1 = 2, col2 = 4)

# add "teeth" and "gaps" to x:
teeth <- c(.5, rep(c(1, 1, 1, 1, .5), 20))
xt <- xtab[, 2] * teeth
cbind(xtab, xt)
xttab <- as.freqtab(xscale, xt)
xlog2 <- loglinear(xttab, degree = 3)
cbind(xscale, xt, xlog2$fit)

# smooth xt using score functions that preserve 
# the teeth structure (also 3 moments):
teeth2 <- c(1, rep(c(0, 0, 0, 0, 1), 20))
xt.fun <- cbind(xscale, xscale^2, xscale^3)
xt.fun <- cbind(xt.fun, xt.fun * teeth2)
xlog3 <- loglinear(xttab, xt.fun)
cbind(xscale, xt, xlog3$fit)

par(mfrow = c(2, 1))
plot(xscale, xt, type = "h", ylab = "count",
  main = "X teeth raw")
plot(xscale, xlog3$fit, type = "h", ylab = "count",
  main = "X teeth smooth")

# bivariate example, preserving the first 3 moments of the total
# and anchor (v) scores for each of x and y, and the covariance
# between anchor and total;
# see equated scores in Wang (2009), Table 4
xscale <- 0:36
vscale <- 0:12
xvtab <- freqtab(xscale, KBneat$x[, 1],
  vscale, KBneat$x[, 2])
yvtab <- freqtab(xscale, KBneat$y[, 1],
  vscale, KBneat$y[, 2])
Y <- yvtab[, 1]
V <- yvtab[, 2]
scorefun <- cbind(Y, Y^2, Y^3, V, V^2, V^3, V * Y)
wang09 <- equate(xvtab, yvtab, type = "equip",
  method = "chained", smooth = "loglin", xscorefun = scorefun, 
  yscorefun = scorefun)
wang09$concordance

# replicate Moses and von Davier, 2006, univariate example:
uv <- c(0, 4, 11, 16, 18, 34, 63, 89, 87, 129, 124,
  154, 125, 131, 109, 98, 89, 66, 54, 37, 17)
loglinear(as.freqtab(0:20, uv), degree = 3)
