SNSequate (version 1.0)

bandwidth: Automatic selection of the bandwidth parameter h

Description

This function implements the minimization of the combined penalty function described by Holland and Thayer (1989) and Von Davier et al. (2004). It returns the optimal value of h for kernel continuization according to that criterion. Kernels other than the Gaussian are also accepted.

Usage

bandwidth(scores, kert, degree, design, Kp = 1, scores2, degreeXA, degreeYA, 
J, K, L, wx, wy, w, ...)

Arguments

Note that, depending on the specified equating design, not all arguments are necessary, as detailed below.
scores
If the "EG" design is specified, a vector containing the raw sample frequencies coming from one group taking the test. If the "SG" design is specified, a matrix containing the (joint) bivariate sample frequencies for $X$ (rows) and $Y$
kert
A character string giving the type of kernel to be used for continuization. Current options include "gauss", "logis", and "uniform" for the Gaussian, logistic, and uniform kernels, respectively.
degree
Either a number or a vector indicating the number of power moments to be fitted to the marginal distributions, or the number of cross moments to be fitted to the joint distributions, respectively. For the "EG" design it will be a number (see Detai
design
A character string indicating the equating design (one of "EG", "SG", "CB", "NEAT_CE", "NEAT_PSE")
Kp
A number which acts as a weight for the second term in the combined penalization function used to obtain h (see details).
scores2
Only used for the "CB", "NEAT_CE" and "NEAT_PSE" designs. See the description of scores.
degreeXA
A vector indicating the number of power moments to be fitted to the marginal distributions $X$ and $A$, and the number of cross moments to be fitted to the joint distribution $(X,A)$ (see details). Only used for the "NEAT_CE" and "NEAT_PSE" desi
degreeYA
Only used for the "NEAT_CE" and "NEAT_PSE" designs (see the description for degreeXA)
J
The number of possible $X$ scores. Only needed for the "CB", "NEAT_CE" and "NEAT_PSE" designs.
K
The number of possible $Y$ scores. Only needed for the "CB", "NEAT_CE" and "NEAT_PSE" designs.
L
The number of possible $A$ scores. Only needed for the "NEAT_CE" and "NEAT_PSE" designs.
wx
A number that satisfies $0\leq w_X\leq 1$ indicating the weight put on the data that is not subject to order effects. Only used for the "CB" design.
wy
A number that satisfies $0\leq w_Y\leq 1$ indicating the weight put on the data that is not subject to order effects. Only used for the "CB" design.
w
A number that satisfies $0\leq w\leq 1$ indicating the weight given to population $P$. Only used for the "NEAT" design.
...
Further arguments currently not used.

Value

  • A number which is the optimal value of h.

Details

To automatically select h, the function minimizes $$PEN_1(h)+K\times PEN_2(h)$$ where $PEN_1(h)=\sum_j(\hat{r}_j-\hat{f}_h(x_j))^2$, and $PEN_2(h)=\sum_jA_j(1-B_j)$. The terms $A$ and $B$ are such that $PEN_2$ acts as a smoothness penalty term that avoids rapid fluctuations in the approximated density (see Chapter 10 in Von Davier, 2011 for more details). The $K$ term corresponds to the Kp argument of the bandwidth function. The $\hat{r}$ values are assumed to be estimated by polynomial loglinear models of specific degree, which come from a call to loglin.smooth.
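The first term of the criterion can be illustrated with a short base-R sketch. This is not the SNSequate implementation: the helper names `kernel_density` and `pen1` are hypothetical, the score probabilities are toy values rather than output from loglin.smooth, and only the Gaussian kernel is covered. The $PEN_2$ term is left out because $A_j$ and $B_j$ are not defined here, so the sketch corresponds to setting $K=0$ in the expression above. The density uses the usual Gaussian-kernel continuization, which rescales the scores so that the mean and variance of the discrete distribution are preserved.

```r
# Illustrative sketch only; function names are hypothetical, not part of SNSequate.

# Gaussian-kernel continuized density f_h evaluated at the points in x.
# scores: possible score values x_j; r: score probabilities r_j (sum to 1).
kernel_density <- function(x, scores, r, h) {
  mu     <- sum(scores * r)                 # mean of the discrete distribution
  sigma2 <- sum((scores - mu)^2 * r)        # variance of the discrete distribution
  a      <- sqrt(sigma2 / (sigma2 + h^2))   # shrinkage factor that preserves the variance
  sapply(x, function(x0) {
    z <- (x0 - a * scores - (1 - a) * mu) / (a * h)
    sum(r * dnorm(z)) / (a * h)
  })
}

# PEN_1(h): squared distance between r_j and the density at each score x_j.
pen1 <- function(h, scores, r) {
  sum((r - kernel_density(scores, scores, r, h))^2)
}

# Toy score probabilities on 0..20 (assumed for illustration, not real data).
scores <- 0:20
r <- dbinom(scores, size = 20, prob = 0.55)

# One-dimensional search for the h that minimizes PEN_1 on an assumed interval.
h_opt <- optimize(pen1, interval = c(0.1, 3), scores = scores, r = r)$minimum
h_opt
```

The search interval `c(0.1, 3)` is an arbitrary choice for this toy example; the bandwidth function itself should be used for actual equating work.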

References

Von Davier, A., Holland, P., and Thayer, D. (2004). The Kernel Method of Test Equating. New York, NY: Springer-Verlag.

Von Davier, A. (Ed.) (2011). Statistical Models for Equating, Scaling, and Linking. New York, NY: Springer.

See Also

loglin.smooth

Examples

# Example: the "Standard" column and first two rows of Table 10.1 in
# Chapter 10 of Von Davier (2011)

data(Math20EG)

hx.logis<-bandwidth(scores=Math20EG[,1],kert="logis",degree=2,design="EG")$h
hx.unif<-bandwidth(scores=Math20EG[,1],kert="unif",degree=2,design="EG")$h 
hx.gauss<-bandwidth(scores=Math20EG[,1],kert="gauss",degree=2,design="EG")$h

hy.logis<-bandwidth(scores=Math20EG[,2],kert="logis",degree=3,design="EG")$h
hy.unif<-bandwidth(scores=Math20EG[,2],kert="unif",degree=3,design="EG")$h 
hy.gauss<-bandwidth(scores=Math20EG[,2],kert="gauss",degree=3,design="EG")$h

partialTable10.1<-rbind(c(hx.logis,hx.unif,hx.gauss),
				c(hy.logis,hy.unif,hy.gauss))

dimnames(partialTable10.1)<-list(c("h.x","h.y"),c("Logistic","Uniform","Gaussian"))
partialTable10.1
