prmsCV: Cross validation method for PRM regression models.

Description

k-fold cross validation for the selection of the number of components for partial robust M regression.

Usage

prmsCV(formula, data, as, nfold = 10, fun = "Hampel", probp1 = 0.95, hampelp2 = 0.975,
hampelp3 = 0.999, center = "median", scale = "qn", usesvd = FALSE, plot = TRUE, 
numit = 100, prec = 0.01, alpha = 0.15)

Arguments

formula

an object of class formula.

data

a data frame or list which contains the variables given in formula.

as

a vector of positive integers giving the numbers of PRM components to be estimated in the models.

nfold

the number of folds used for cross validation, default is nfold=10 for 10-fold CV.

fun

an internal weighting function for case weights. Choices are "Hampel" (preferred), "Huber" or "Fair".

probp1

the 1-alpha value at which to set the first outlier cutoff for the weighting function.

hampelp2

the 1-alpha value for the second cutoff. Only applies to fun="Hampel".

hampelp3

the 1-alpha value for the third cutoff. Only applies to fun="Hampel".

center

type of centering of the data, given as a string that matches an R function, e.g. "mean" or "median".

scale

type of scaling of the data, given as a string that matches an R function, e.g. "sd" or "qn", or alternatively "no" for no scaling.

usesvd

logical, default is FALSE. If TRUE singular value decomposition is performed.

plot

logical, default is TRUE. If TRUE a plot is generated with a measure of the prediction accuracy for each model (see Details).

numit

the maximum number of iterations for the convergence of the coefficient estimates.

prec

a value for the precision of estimation of the coefficients.

alpha

value of alpha for the alpha-trimmed mean squared error, which is the cross validation criterion (see Details).
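
As an illustration of how the three cutoffs for fun="Hampel" can interact, the following is a minimal sketch of a Hampel-type redescending weight function. The helper hampel_weight is hypothetical and is not part of the package. The use of standard normal quantiles for the cutoffs is an assumption made for this example and may differ from what prmsCV does internally.

```r
# Hypothetical helper, not the package internals:
# Hampel-type weights for standardized residuals r with cutoffs q1 < q2 < q3.
hampel_weight <- function(r, q1, q2, q3) {
  a <- abs(r)
  w <- numeric(length(a))                 # weight 0 beyond the third cutoff
  w[a <= q1] <- 1                         # full weight for regular observations
  mid <- a > q1 & a <= q2
  w[mid] <- q1 / a[mid]                   # gradual downweighting
  far <- a > q2 & a <= q3
  w[far] <- q1 * (q3 - a[far]) / ((q3 - q2) * a[far])  # descends to 0 at q3
  w
}

# Assumption: cutoffs are standard normal quantiles at the default
# probabilities probp1, hampelp2 and hampelp3.
q <- qnorm(c(0.95, 0.975, 0.999))
w <- hampel_weight(c(0, 1.8, 2.5, 4), q[1], q[2], q[3])
```

In this sketch, observations inside the first cutoff keep weight 1, observations beyond the third cutoff get weight 0, and the weights decrease continuously in between.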

Value

opt.mod: object of class prm (see prms).
spe: matrix with squared prediction error for each observation and each number of components.

Details

The alpha-trimmed mean squared error of the predicted response over all observations is used as a robust decision criterion to choose the optimal model. For plot=TRUE, a graphic visualizes the alpha-trimmed mean squared error for each model.
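
The criterion can be sketched in base R from a matrix of squared prediction errors shaped like the returned spe (observations in rows, numbers of components in columns). The helper trimmed_mse and the toy matrix are hypothetical. One-sided trimming of the largest alpha fraction is an assumption here, and the exact trimming rule used by prmsCV may differ.

```r
# Hypothetical helper, not part of the package:
# mean of the squared errors after dropping the largest alpha fraction.
trimmed_mse <- function(sq_err, alpha = 0.15) {
  n_trim <- floor(length(sq_err) * alpha)        # number of largest errors dropped
  n_keep <- length(sq_err) - n_trim
  mean(sort(sq_err)[seq_len(n_keep)])
}

# Toy squared prediction errors: rows are observations, columns are models.
# Model a2 fits most observations well but has two outlying errors.
spe_toy <- cbind(a2 = c(1, 1, 1, 1, 1, 1, 50, 60),
                 a3 = c(2, 2, 2, 2, 2, 2, 2, 2))

crit <- apply(spe_toy, 2, trimmed_mse, alpha = 0.25)  # trimmed criterion per model
plain <- colMeans(spe_toy)                            # untrimmed MSE for comparison
best <- names(which.min(crit))
```

In the toy data the untrimmed mean prefers a3, because two outlying errors dominate a2, while the trimmed criterion prefers a2. This is the kind of robustness to outlying observations that the trimming is meant to provide.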

References

Hoffmann, I., Serneels, S., Filzmoser, P., Croux, C. (2015). Sparse partial robust M regression. Chemometrics and Intelligent Laboratory Systems, 149, 50-59.

Serneels, S., Croux, C., Filzmoser, P., Van Espen, P.J. (2005). Partial Robust M-Regression. Chemometrics and Intelligent Laboratory Systems, 79, 55-64.

Examples

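library(sprm)  # provides prmsCV and prms
set.seed(5023)
# simulate six predictors that share a two-group mean structure
U <- c(rep(2,20), rep(5,30))
X <- replicate(6, U+rnorm(50))
beta <- c(rep(1, 3), rep(-1,3))
# the last five errors are shifted to act as outliers
e <- c(rnorm(45,0,1.5),rnorm(5,-20,1))
y <- X%*%beta + e
d <- as.data.frame(X)
d$y <- y
# compare models with 2, 3 and 4 components by 10-fold cross validation
res <- prmsCV(y~., data=d, as=2:4, plot=TRUE, prec=0.05)
summary(res$opt.mod)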