sprm (version 1.1)

prmsCV: Cross validation method for prm models.

Description

k-fold cross validation for the selection of the number of components for partial robust M regression.

Usage

prmsCV(formula, data, as, nfold = 10, fun = "Hampel", probp1 = 0.95, hampelp2 = 0.975,
hampelp3 = 0.999, center = "median", scale = "qn", usesvd = FALSE, plot = TRUE, 
numit = 100, prec = 0.01, alpha = 0.15)

Arguments

formula
an object of class formula.
data
a data frame or list which contains the variables given in formula.
as
a vector with positive integers, which are the number of PRMS components to be estimated in the models.
nfold
the number of folds used for cross validation; the default is nfold=10 for 10-fold CV.
fun
an internal weighting function for case weights. Choices are "Hampel" (preferred), "Huber" or "Fair".
probp1
the 1-alpha value at which to set the first outlier cutoff for the weighting function.
hampelp2
the 1-alpha value for the second cutoff. Only applies to fun="Hampel".
hampelp3
the 1-alpha value for the third cutoff. Only applies to fun="Hampel".
center
type of centering of the data in form of a string that matches an R function, e.g. "mean" or "median".
scale
type of scaling for the data in form of a string that matches an R function, e.g. "sd" or "qn" or alternatively "no" for no scaling.
usesvd
logical, default is FALSE. If TRUE singular value decomposition is performed.
plot
logical, default is TRUE. If TRUE a plot is generated with a measure of the prediction accuracy for each model (see Details).
numit
the maximum number of iterations allowed for the coefficient estimates to converge.
prec
a value for the precision of estimation of the coefficients.
alpha
value used for alpha trimmed mean squared error, which is the cross validation criterion (see Details).
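
To make the roles of probp1, hampelp2 and hampelp3 concrete, the following is a minimal sketch of a three-part Hampel weight function. The helper hampel_weight is hypothetical and is not taken from the sprm source; in particular, treating the cutoffs as standard normal quantiles of the three probabilities is an assumption made for illustration only.

```r
# Hypothetical helper, not the sprm internals: three-part Hampel case weights.
hampel_weight <- function(r, probp1 = 0.95, hampelp2 = 0.975, hampelp3 = 0.999) {
  # Assumption: cutoffs are standard normal quantiles of the given probabilities.
  q1 <- qnorm(probp1)
  q2 <- qnorm(hampelp2)
  q3 <- qnorm(hampelp3)
  a <- abs(r)
  w <- rep(1, length(a))                 # full weight up to the first cutoff
  mid <- a > q1 & a <= q2
  w[mid] <- q1 / a[mid]                  # Huber-type downweighting
  far <- a > q2 & a <= q3
  w[far] <- q1 * (q3 - a[far]) / ((q3 - q2) * a[far])  # descent towards zero
  w[a > q3] <- 0                         # beyond the third cutoff: weight zero
  w
}

# Standardized residuals of increasing size receive decreasing weights.
weights <- hampel_weight(c(0, 1.8, 2.5, 5))
```

Under these assumptions, an observation inside the first cutoff keeps full weight, while one beyond the third cutoff is excluded entirely.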

Value

  • opt.mod: object of class prm (see prms).
  • spe: matrix with squared prediction error for each observation and each number of components.

Details

The alpha-trimmed mean squared error of the predicted response over all observations is used as a robust decision criterion to choose the optimal model. For plot=TRUE, a graphic visualizes the alpha-trimmed mean squared error for each model.
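
As an illustration of this criterion, the sketch below computes an alpha-trimmed mean of squared prediction errors for each column of a matrix shaped like spe. The helper trimmed_mspe is hypothetical and not part of sprm. It assumes a one-sided trimming rule that discards the largest floor(alpha * n) squared errors, which may differ from the rule the package applies internally.

```r
# Hypothetical helper, not part of sprm: one-sided alpha-trimmed mean of
# squared prediction errors, dropping the largest floor(alpha * n) values.
trimmed_mspe <- function(spe, alpha = 0.15) {
  spe <- as.matrix(spe)
  n <- nrow(spe)
  keep <- n - floor(alpha * n)           # number of observations retained
  apply(spe, 2, function(col) mean(sort(col)[seq_len(keep)]))
}

# Toy matrix: 20 observations, 3 candidate models; the last two rows mimic outliers.
set.seed(1)
spe <- matrix(rexp(60), nrow = 20, ncol = 3,
              dimnames = list(NULL, c("a=2", "a=3", "a=4")))
spe[19:20, ] <- 400

crit <- trimmed_mspe(spe, alpha = 0.15)  # robust criterion per model
best <- names(which.min(crit))           # model with the smallest criterion
```

In this toy matrix the two outlying rows dominate the ordinary column means but are trimmed away before averaging, so they do not affect the comparison between models.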

References

Sven Serneels et al. (2014) Sparse partial robust M regression

Serneels, S., Croux, C., Filzmoser, P., Van Espen, P.J., Partial Robust M-Regression. Chemometrics and Intelligent Laboratory Systems, 79 (2005), 55-64.

See Also

prms, plot.prm, predict.prm, sprmsCV

Examples

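library(sprm)
set.seed(5023)
# Two groups of observations with different means across six predictors
U <- c(rep(2,20), rep(5,30))
X <- replicate(6, U+rnorm(50))
beta <- c(rep(1, 3), rep(-1,3))
# Errors: 45 regular observations and 5 vertical outliers
e <- c(rnorm(45,0,1.5),rnorm(5,-20,1))
y <- X%*%beta + e
d <- as.data.frame(X)
d$y <- y
# Cross-validate models with 2, 3 and 4 components
res <- prmsCV(y~., data=d, as=2:4, plot=TRUE, prec=0.05)
summary(res$opt.mod)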