loglinear: Polynomial Loglinear Smoothing

Description

This function calls glm to fit one or more loglinear smoothing models to a frequency distribution. Fit can also be compared across nested models.

Usage

loglinear(x, scorefun, degree, raw = TRUE, verbose = FALSE,
	compare = FALSE, stepup = compare, showWarnings = FALSE, ...)

Arguments

x

score distribution of class freqtab. With an equivalent groups design (i.e., no anchor test), x will contain the score scale in column 1 and the number of examinees obtaining e

scorefun

matrix of score functions, where each column includes a transformation of the score scale (or the crossed score scales, in the bivariate case). If missing, degree will be used to construct polynomial score functions.

degree

integer indicating the maximum degree of the polynomial score transformations to be computed (passed to poly; ignored if scorefun is provided).

raw

logical, passed to poly, indicating whether raw polynomials (TRUE, default) or orthogonal polynomials (FALSE) will be used.

verbose

logical, with default FALSE, indicating whether or not glm output should be returned in addition to the fitted values.

compare

logical, with default FALSE, indicating whether or not fit for nested models should be compared. If TRUE, stepup is also set to TRUE and only results from the model fit comparison are returned.

stepup

logical, defaulting to the value of compare (FALSE unless compare = TRUE), indicating whether or not all nested models should also be run.

showWarnings

logical, with default FALSE, indicating whether or not warnings from glm should be shown.

...

further arguments passed to glm.

Value

Returns either an anova fit table (when compare = TRUE), a vector or matrix of fitted values (when verbose = FALSE), or complete output from glm (when verbose = TRUE). When stepup = TRUE, results for the nested models are included as well.

Details

Loglinear smoothing is a flexible procedure for reducing irregularities in a frequency distribution prior to equating. The degree of each polynomial term determines which moment of the raw distribution is preserved in the fitted distribution (see below for examples). The loglinear function is a simple wrapper for glm that simplifies creating polynomial score functions and fitting and comparing multiple loglinear models. When supplied, scorefun must contain at least one score function of the score scale.
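
As a rough sketch of what such a wrapper does, the block below fits the same kind of model with glm directly. It assumes a Poisson family with log link, which is the conventional choice for loglinear smoothing but is not confirmed here as the exact internal call. The names score, count, fit_manual, and fit_orth are illustrative, and the counts are those of the Moses and von Davier (2006) univariate example in the Examples section.

```r
# Score scale and counts (Moses and von Davier, 2006, univariate example)
score <- 0:20
count <- c(0, 4, 11, 16, 18, 34, 63, 89, 87, 129, 124,
	154, 125, 131, 109, 98, 89, 66, 54, 37, 17)

# Hand-built score functions: one column per power of the score scale
scorefun <- cbind(score, score^2, score^3)

# Assumption: Poisson loglinear model, fit here with glm directly
fit_manual <- glm(count ~ scorefun, family = poisson)

# Orthogonal polynomials of the same degree span the same column space,
# so the fitted (smoothed) frequencies should agree up to numerical error
fit_orth <- glm(count ~ poly(score, 3), family = poisson)

max(abs(fitted(fit_manual) - fitted(fit_orth)))
```

Raw and orthogonal polynomials differ in their coefficients, not in the smoothed distribution, so the choice mainly affects interpretability and numerical conditioning.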

Specifying degree is an alternative to supplying scorefun. degree takes a maximum polynomial degree and constructs the score functions accordingly. For example, degree = 3 will result in a model with three terms: the score scale raised to the first, second, and third powers, preserving the mean, variance, and skewness of the original distribution. In the bivariate case, powers 1 through 3 would be included for each variable.
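
The moment-preservation property can be checked numerically. The sketch below is not the package's own code: it assumes a Poisson glm with raw polynomial terms up to degree 3, and the helper moments() and the variable names are hypothetical. The counts are those of the Moses and von Davier (2006) univariate example in the Examples section.

```r
# Score scale and counts (Moses and von Davier, 2006, univariate example)
score <- 0:20
count <- c(0, 4, 11, 16, 18, 34, 63, 89, 87, 129, 124,
	154, 125, 131, 109, 98, 89, 66, 54, 37, 17)

# Assumption: Poisson loglinear model with raw powers 1 through 3
fit <- glm(count ~ poly(score, degree = 3, raw = TRUE), family = poisson)
smoothed <- fitted(fit)

# Hypothetical helper: first three raw moments of a frequency distribution
moments <- function(f) sapply(1:3, function(k) sum(score^k * f) / sum(f))

# The two rows should match up to the convergence tolerance of glm
rbind(observed = moments(count), smoothed = moments(smoothed))
```

Matching the first three raw moments implies matching the mean, variance, and skewness, since each is a function of those moments.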

stepup is used to run nested models based on subsets of the columns in scorefun. Output will correspond to models based on columns 1 and 2, 1 through 3, 1 through 4, etc.

compare returns output from anova, comparing model fit for all the models run with stepup = TRUE.
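
The nested comparison described above can be approximated by hand with glm and anova. This is only an illustration under the assumption of Poisson models with raw polynomial terms. The object names are hypothetical, the chi-square test is one possible choice rather than a documented default, and the counts are those of the Moses and von Davier (2006) univariate example in the Examples section.

```r
# Score scale and counts (Moses and von Davier, 2006, univariate example)
score <- 0:20
count <- c(0, 4, 11, 16, 18, 34, 63, 89, 87, 129, 124,
	154, 125, 131, 109, 98, 89, 66, 54, 37, 17)

# Nested models using score functions 1 and 2, 1 through 3, 1 through 4
fit2 <- glm(count ~ poly(score, 2, raw = TRUE), family = poisson)
fit3 <- glm(count ~ poly(score, 3, raw = TRUE), family = poisson)
fit4 <- glm(count ~ poly(score, 4, raw = TRUE), family = poisson)

# Analysis of deviance table comparing each model with the previous one
comparison <- anova(fit2, fit3, fit4, test = "Chisq")
comparison
```

Each row after the first reports the drop in residual deviance from adding one more score function, which is the quantity typically used to decide how many moments to preserve.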

For additional examples, see Holland and Thayer (1987, 2000).

References

Holland, P. W., and Thayer, D. T. (1987). Notes on the use of log-linear models for fitting discrete probability distributions (PSR Technical Rep. No. 87-79; ETS RR-87-31). Princeton, NJ: ETS.

Holland, P. W., and Thayer, D. T. (2000). Univariate and bivariate loglinear models for discrete test score distributions. Journal of Educational and Behavioral Statistics, 25, 133-183.

Moses, T. P., and von Davier, A. A. (2006). A SAS macro for loglinear smoothing: Applications and implications (ETS Research Rep. No. RR-06-05). Princeton, NJ: ETS.

Wang, T. (2009). Standard errors of equating for the percentile rank-based equipercentile equating with log-linear presmoothing. Journal of Educational and Behavioral Statistics, 34, 7-23.

Examples

set.seed(2010)
x <- round(rnorm(1000, 100, 15))
xscale <- 50:150

# smooth x preserving first 3 moments:
xtab <- freqtab(x, xscale = xscale)
xlog1 <- loglinear(xtab, degree = 3)
plot(xtab, y = xlog1)
lines(xtab[, 1], xlog1)

# add "teeth" and "gaps" to x:
teeth <- c(.5, rep(c(1, 1, 1, 1, .5), 20))
xt <- xtab[, 2] * teeth
cbind(xtab, xt)
xttab <- as.freqtab(cbind(xscale, xt))
xlog2 <- loglinear(xttab, degree = 3)
cbind(xscale, xt, xlog2)

# smooth xt using score functions that preserve 
# the teeth structure (also 3 moments):
teeth2 <- c(1, rep(c(0, 0, 0, 0, 1), 20))
xt.fun <- cbind(xscale, xscale^2, xscale^3)
xt.fun <- cbind(xt.fun, xt.fun * teeth2)
xlog3 <- loglinear(xttab, xt.fun)
cbind(xscale, xt, xlog3)

par(mfrow = c(2, 1))
plot(xscale, xt, type = "h", ylab = "count",
	main = "X teeth raw")
plot(xscale, xlog3, type = "h", ylab = "count",
	main = "X teeth smooth")

# bivariate example, preserving the first 3 moments of the
# total and the anchor (v) for each of x and y, and the
# covariance between anchor and total
# see equated scores in Wang (2009), Table 4
xvtab <- freqtab(KBneat$x[, 1], KBneat$x[, 2],
	xscale = 0:36, vscale = 0:12)
yvtab <- freqtab(KBneat$y[, 1], KBneat$y[, 2],
	xscale = 0:36, vscale = 0:12)
Y <- yvtab[, 1]
V <- yvtab[, 2]
scorefun <- cbind(Y, Y^2, Y^3, V, V^2, V^3, V * Y)
wang09 <- equate(xvtab, yvtab, type = "equip",
	method = "chained", smooth = "loglin", xscorefun = scorefun, 
	yscorefun = scorefun)
wang09$concordance

# replicate Moses and von Davier, 2006, univariate example:
uv <- c(0, 4, 11, 16, 18, 34, 63, 89, 87, 129, 124,
	154, 125, 131, 109, 98, 89, 66, 54, 37, 17)
loglinear(as.freqtab(cbind(0:20, uv)), degree = 3)
