AIC.seqModel: Information criteria for a sequence of regression models

Description

Compute the Akaike or Bayes information criterion for a sequence of regression models, such as the submodels along a robust least angle regression sequence, or sparse least trimmed squares regression models fitted over a grid of values for the penalty parameter.

Usage

# S3 method for seqModel
AIC(object, ..., k = 2)
# S3 method for sparseLTS
AIC(object, ..., fit = c("reweighted", "raw",
  "both"), k = 2)
# S3 method for seqModel
BIC(object, ...)
# S3 method for sparseLTS
BIC(object, ...)

Arguments

object

the model fit for which to compute the information criterion.

…

for the BIC method, additional arguments to be passed down to the AIC method. For the AIC method, additional arguments are currently ignored.

k

a numeric value giving the penalty per parameter to be used. The default is to use \(2\) as in the classical definition of the AIC.

fit

a character string specifying for which fit to compute the information criterion. Possible values are "reweighted" (the default) for the information criterion of the reweighted fit, "raw" for the information criterion of the raw fit, or "both" for the information criteria of both fits.

Value

A numeric vector or matrix giving the information criteria for the requested model fits.

Details

The information criteria are computed as \(n (\log(2 \pi) + 1 + \log(\hat{\sigma}^2)) + df k\), where \(n\) denotes the number of observations, \(\hat{\sigma}\) is the robust residual scale estimate, \(df\) is the number of nonzero coefficient estimates, and \(k\) is the penalty per parameter. The usual definition of the AIC uses \(k = 2\), whereas the BIC uses \(k = \log(n)\). Consequently, the former is used as the default penalty of the AIC method, whereas the BIC method calls the AIC method with the latter penalty.
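As a minimal sketch of the formula above, the following base R code evaluates the criterion by hand. The helper infoCriterion() and the input values are hypothetical and chosen only for illustration; they are not part of the package, and the package's own methods may organize the computation differently.

```r
# Hypothetical helper (not part of the package): evaluates
# n * (log(2 * pi) + 1 + log(scale^2)) + df * k for given inputs.
infoCriterion <- function(n, scale, df, k = 2) {
  n * (log(2 * pi) + 1 + log(scale^2)) + df * k
}

# Illustrative values (assumptions, not taken from a fitted model)
n <- 100      # number of observations
scale <- 0.5  # robust residual scale estimate
df <- 5       # number of nonzero coefficient estimates

aic <- infoCriterion(n, scale, df)              # penalty k = 2
bic <- infoCriterion(n, scale, df, k = log(n))  # penalty k = log(n)

# The two criteria differ only in the penalty term df * k
bic - aic
```

Since only the penalty term changes, the difference between the two values is \(df (\log(n) - 2)\), which is positive whenever \(n > e^2\), that is, for eight or more observations.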

References

Akaike, H. (1970) Statistical predictor identification. Annals of the Institute of Statistical Mathematics, 22(2), 203–217.

Schwarz, G. (1978) Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.

Examples

# NOT RUN {
## generate data
# example is not high-dimensional to keep computation time low
library("robustHD")  # provides rlars() and sparseLTS()
library("mvtnorm")   # provides rmvnorm()
set.seed(1234)  # for reproducibility
n <- 100  # number of observations
p <- 25   # number of variables
beta <- rep.int(c(1, 0), c(5, p-5))  # coefficients
sigma <- 0.5      # controls signal-to-noise ratio
epsilon <- 0.1    # contamination level
Sigma <- 0.5^t(sapply(1:p, function(i, j) abs(i-j), 1:p))
x <- rmvnorm(n, sigma=Sigma)    # predictor matrix
e <- rnorm(n)                   # error terms
i <- 1:ceiling(epsilon*n)       # observations to be contaminated
e[i] <- e[i] + 5                # vertical outliers
y <- c(x %*% beta + sigma * e)  # response
x[i,] <- x[i,] + 5              # bad leverage points


## robust LARS
# fit model
fitRlars <- rlars(x, y, sMax = 10)
# compute AIC and BIC
AIC(fitRlars)
BIC(fitRlars)


## fit sparse LTS model over a grid of values for lambda
frac <- seq(0.2, 0.05, by = -0.05)
fitSparseLTS <- sparseLTS(x, y, lambda = frac, mode = "fraction")
# compute AIC and BIC
AIC(fitSparseLTS)
BIC(fitSparseLTS)
# }
