lfe (version 1.7-1416)

demeanlist: Centre vectors on multiple groups

Description

Uses the method of alternating projections to centre a (model) matrix on multiple groups, as specified by a list of factors. This function is called by felm, but it has been made available as standalone in case it's needed.

Usage

demeanlist(mtx, fl, icpt=0, eps=getOption('lfe.eps'),
           threads=getOption('lfe.threads'),
           progress=getOption('lfe.pint'),
           accel=getOption('lfe.accel'),
           randfact=TRUE,
           means=FALSE)

Arguments

mtx
matrix whose columns form vectors to be group-centred. mtx may also be a list of vectors or matrices.
fl
list of factors defining the grouping structure
icpt
the position of the intercept column; this column is removed from the result matrix
eps
a tolerance for the centering
threads
an integer specifying the number of threads to use
progress
integer. If positive, print a progress report whenever a vector has been centred, but no more often than once every progress minutes.
accel
integer. Set to 1 if Gearhart-Koshy acceleration should be done.
randfact
logical. Should the order of the factors be randomized? This may improve convergence.
means
logical. Should the means instead of the demeaned matrix be returned? Setting means=TRUE will return mtx - demeanlist(mtx,...), but without the extra copy.

Value

  • If mtx is a matrix, a matrix of the same shape, possibly with column icpt deleted. If mtx is a list of vectors and matrices, a list of the same length is returned, with the same vector and matrix-pattern, but the matrices have the column icpt deleted.

Details

For each column y in mtx, the equivalent of the following centering is performed, with cy as the result.

cy <- y; oldy <- y-1
while(sqrt(sum((cy-oldy)**2)) >= eps) {
  oldy <- cy
  for(f in fl) cy <- cy - ave(cy,f)
}
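The loop above can be turned into a small runnable sketch in base R. The helper name centre_on_factors and its max_iter safeguard are illustrative assumptions and not part of lfe; the actual demeanlist may differ in its convergence test and in how acceleration is applied.

```r
# Illustrative base-R version of the alternating-projections loop.
# centre_on_factors() and max_iter are hypothetical, not part of lfe.
centre_on_factors <- function(y, fl, eps = 1e-8, max_iter = 10000L) {
  cy <- y
  for (iter in seq_len(max_iter)) {
    oldy <- cy
    # one sweep: subtract the group means of each factor in turn
    for (f in fl) cy <- cy - ave(cy, f)
    # stop once a full sweep changes the vector by less than eps
    if (sqrt(sum((cy - oldy)^2)) < eps) break
  }
  cy
}

set.seed(1)
y  <- rnorm(20)
fl <- list(g1 = factor(rep(1:2, each = 10)),
           g2 = factor(rep(1:4, times = 5)))
cy <- centre_on_factors(y, fl)

# largest absolute group mean over all factors; should be close to zero
max_group_mean <- max(sapply(fl, function(f) max(abs(tapply(cy, f, mean)))))
```

After the loop terminates, max_group_mean should be negligible relative to the scale of y, which is the property the centering is meant to deliver.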

Beginning with version 1.6, each factor in fl may contain an attribute 'x' which is a numeric vector of the same length as the factor. The centering is then not done on the means of each group, but on the projection onto the covariate in each group. That is, with a covariate x and a factor f, it is like projecting out the interaction x:f. The attribute x can also be a matrix of column vectors; in this case it can be beneficial to orthogonalize the columns, either with a stabilized Gram-Schmidt method, or with the simple method x %*% solve(chol(crossprod(x))).
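As a hedged illustration of the "simple method" for orthogonalizing a covariate matrix, the base-R sketch below builds a matrix x, orthogonalizes its columns, and attaches the result to a factor as the 'x' attribute. The data and variable names are made up for the example, and nothing here calls lfe itself.

```r
set.seed(2)
n <- 12
f <- factor(rep(1:3, each = 4))
# two covariate columns to be interacted with f (illustrative data)
x <- cbind(x1 = rnorm(n), x2 = rnorm(n))

# the "simple method" from the text: x %*% solve(chol(crossprod(x)))
q <- x %*% solve(chol(crossprod(x)))

# columns of q are orthonormal, so crossprod(q) is close to the identity
ortho_err <- max(abs(crossprod(q) - diag(ncol(x))))

# attach the orthogonalized covariates to the factor as attribute 'x'
attr(f, 'x') <- q
fl <- list(f = f)
```

This works because chol(crossprod(x)) returns an upper-triangular R with t(R) %*% R equal to crossprod(x), so x %*% solve(R) has orthonormal columns. The list fl could then be passed as the fl argument.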

In some applications it is known that a single centering iteration is sufficient. In particular, this holds if length(fl)==1 and there is no interaction attribute x; in this case the centering algorithm is terminated after the first iteration. There may be other cases, e.g. if there is a single factor with an x with orthogonal columns. If you have such prior knowledge, it is possible to force termination after the first iteration by adding an attribute attr(fl, 'oneiter') <- TRUE. Convergence would be detected in the second iteration anyway, but you save that iteration, i.e. you double the speed.
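The single-factor case can be checked by hand in base R. The sketch below uses made-up data to show that a second pass over one factor changes nothing, and then sets the 'oneiter' attribute on the factor list as described above; it does not call lfe.

```r
set.seed(3)
y <- rnorm(12)
f <- factor(rep(1:3, times = 4))

# first pass: subtract the group means
cy1 <- y - ave(y, f)
# second pass: group means are already zero, so nothing changes
cy2 <- cy1 - ave(cy1, f)
second_pass_change <- sqrt(sum((cy2 - cy1)^2))

# the flag described in the text, set on the list of factors
fl <- list(f = f)
attr(fl, 'oneiter') <- TRUE
```

Here second_pass_change is zero up to floating-point rounding, which is why the second iteration only serves to confirm convergence.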

Examples

oldopts <- options(lfe.threads=1)
## create a 15x3 matrix
mtx <- matrix(rnorm(45),15,3)

## a list of factors
fl <- list(g1=factor(sample(2,nrow(mtx),replace=TRUE)),
           g2=factor(sample(3,nrow(mtx),replace=TRUE)))

## centre on both means and print result
mtx0 <- demeanlist(mtx,fl)
cbind(mtx0,g1=fl[[1]],g2=fl[[2]],comp=compfactor(fl))

## the group means of each centred column should be close to zero
for(i in 1:ncol(mtx0))
  for(n in names(fl))
    cat('col',i,'group',n,'level means:',tapply(mtx0[,i],fl[[n]],mean),'\n')

options(oldopts)
