
sprm (version 1.2.2)

sprms: Sparse partial robust M regression

Description

Sparse partial robust M regression for models with a univariate response. This method for dimension reduction and regression analysis yields estimates that are interpretable in the same way as partial least squares estimates, and that are both sparse and robust to vertical outliers as well as leverage points. The sparsity is tuned with an L1 penalty.

Usage

sprms(formula, data, a, eta, fun = "Hampel", probp1 = 0.95, hampelp2 = 0.975,
hampelp3 = 0.999, center = "median", scale = "qn", print = FALSE, 
numit = 100, prec = 0.01)

Arguments

formula
an object of class formula.
data
a data frame which contains the variables given in formula or a list of two elements, where the first element is the response vector and the second element is a matrix of the explanatory variables.
a
the number of SPRMS components to be estimated in the model.
eta
a tuning parameter for the sparsity with 0 <= eta < 1.
fun
an internal weighting function for case weights. Choices are "Hampel" (preferred), "Huber" or "Fair".
probp1
the 1-alpha value at which to set the first outlier cutoff for the weighting function.
hampelp2
the 1-alpha value at which to set the second cutoff. Only applies to fun="Hampel".
hampelp3
the 1-alpha value at which to set the third cutoff. Only applies to fun="Hampel".
center
type of centering of the data, given as a string that matches an R function, e.g. "mean" or "median".
scale
type of scaling of the data, given as a string that matches an R function, e.g. "sd" or "qn", or alternatively "no" for no scaling.
print
logical, default is FALSE. If TRUE the variables included in each component are reported.
numit
the maximum number of iterations for the convergence of the coefficient estimates.
prec
a value for the precision of estimation of the coefficients.
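
The three cutoffs probp1, hampelp2 and hampelp3 are easier to picture with a small sketch of a Hampel-type weight function. The code below is an illustration in base R, not the package's internal code: the helper name hampel_weights is hypothetical, and it assumes that the cutoffs are standard normal quantiles applied to standardized residuals.

```r
# Illustrative Hampel-type case weights (hypothetical helper, not part of sprm).
# Assumption: cutoffs are standard normal quantiles of the given probabilities.
hampel_weights <- function(r, probp1 = 0.95, hampelp2 = 0.975, hampelp3 = 0.999) {
  c1 <- qnorm(probp1)    # first cutoff: full weight up to here
  c2 <- qnorm(hampelp2)  # second cutoff
  c3 <- qnorm(hampelp3)  # third cutoff: zero weight beyond here
  a <- abs(r)
  w <- numeric(length(a))            # starts at 0, which covers a > c3
  w[a <= c1] <- 1
  mid <- a > c1 & a <= c2
  w[mid] <- c1 / a[mid]              # slow decay between c1 and c2
  far <- a > c2 & a <= c3
  w[far] <- c1 * (c3 - a[far]) / ((c3 - c2) * a[far])  # decays to 0 at c3
  w
}

r <- c(0, 1, 1.8, 2.5, 4)            # standardized residuals (made-up values)
w <- hampel_weights(r)
print(round(w, 3))
```

Under these assumptions, observations inside the first cutoff keep full weight, observations beyond the third cutoff are removed from the fit, and the weights decrease continuously in between.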

Value

  • sprms returns an object of class sprm.

    Functions summary, predict and plot are available. Also the generic functions coefficients, fitted.values and residuals can be used to extract the corresponding elements from the sprm object.

  • coefficients: vector of coefficients of the weighted regression model.
  • intercept: intercept of the weighted regression model.
  • wy: the case weights in the y space.
  • wt: the case weights in the score space.
  • w: the overall case weights used for weighted regression (depending on the weight function). w=wy*wt.
  • scores: the matrix of scores.
  • R: direction vectors (or weighting vectors or rotation matrix) to obtain the scores. scores=Xs%*%R.
  • loadings: the matrix of loadings.
  • fitted.values: the vector of estimated response values.
  • residuals: vector of residuals, true response minus estimated response.
  • coefficients.scaled: vector of coefficients of the weighted regression model with scaled data.
  • intercept.scaled: intercept of the weighted regression model with scaled data.
  • YMeans: value used internally to center the response.
  • XMean: vector used internally to center the data.
  • Xscales: vector used internally to scale the data.
  • Yscales: value used internally to scale the response.
  • Yvar: percentage of contribution of each component to the explanation of the variance of the response.
  • Xvar: percentage of contribution of each component to the explanation of the variance of the variables.
  • inputs: list of inputs: parameters, data and scaled data.
  • used.vars: indices of the variables included in the model.
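
As a sketch of how the components listed above might be inspected, the following code fits a model to simulated data and prints the selected variables and the case weights. The data, the contamination of the first five cases and the choices a = 2 and eta = 0.3 are assumptions made for illustration only. The fit is skipped when the sprm package is not installed.

```r
# Simulated data (illustrative): 60 cases, 8 predictors, 2 of them informative.
set.seed(2)
n <- 60
X <- matrix(rnorm(n * 8), nrow = n)
y <- drop(X[, 1:2] %*% c(2, -1)) + rnorm(n, sd = 0.5)
y[1:5] <- y[1:5] + 25                 # five simulated vertical outliers
d <- data.frame(y = y, X)

# Requires the sprm package; nothing is fitted if it is unavailable.
if (requireNamespace("sprm", quietly = TRUE)) {
  mod <- sprm::sprms(y ~ ., data = d, a = 2, eta = 0.3)

  print(mod$used.vars)                # indices of the variables in the model
  print(round(mod$w[1:5], 2))         # overall weights of the contaminated cases
  print(summary(mod$w[-(1:5)]))       # overall weights of the remaining cases
  # Compare with the documented relation w = wy * wt
  print(isTRUE(all.equal(mod$w, mod$wy * mod$wt)))
  print(head(residuals(mod)))         # generic extractor
}
```

If the robust fit behaves as intended, the contaminated cases should receive overall weights close to zero, while most of the remaining cases keep weights near one.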

Details

The NIPALS algorithm with an L1 sparsity constraint, combined with weighted regression, is used for the model estimation.
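
To give an idea of how an L1 sparsity constraint acts on a direction vector, the sketch below applies the soft-thresholding rule commonly used in sparse PLS: entries whose absolute value lies below eta times the largest absolute entry are set to zero. This is an assumption about the general technique, not code taken from the package, and soft_threshold is a hypothetical helper name.

```r
# Soft thresholding of a direction vector (illustrative sketch).
# Assumption: threshold = eta * max(|w|), as commonly used in sparse PLS.
soft_threshold <- function(w, eta) {
  stopifnot(eta >= 0, eta < 1)
  thr <- eta * max(abs(w))
  w_sparse <- sign(w) * pmax(abs(w) - thr, 0)  # shrink, then cut at zero
  w_sparse / sqrt(sum(w_sparse^2))             # rescale to unit length
}

w <- c(0.9, -0.5, 0.2, 0.05, -0.01)            # made-up direction vector
dense  <- soft_threshold(w, eta = 0)           # eta = 0: no entry is removed
sparse <- soft_threshold(w, eta = 0.5)         # threshold 0.45: two entries survive
print(round(dense, 3))
print(round(sparse, 3))
```

With this rule the largest entry always survives for eta < 1, which is one way to see why eta is restricted to values below 1: larger values of eta give sparser direction vectors, and eta = 0 gives no sparsity.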

a is the number of components in the model. Note that it is not possible to simply reduce the number of weighting vectors to obtain a model with a smaller number of components. Each model has to be estimated separately due to its dependence on robust case weights.
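
Because each model depends on its own robust case weights, comparing different numbers of components means refitting the model for every value of a. A minimal sketch, assuming simulated data and an arbitrary eta = 0.5, could look as follows. The fits are skipped when the sprm package is not installed.

```r
# Simulated data (illustrative): 50 cases, 10 predictors, 3 of them informative.
set.seed(1)
X <- matrix(rnorm(50 * 10), nrow = 50)
y <- drop(X[, 1:3] %*% rep(1, 3)) + rnorm(50)
d <- data.frame(y = y, X)

a_values <- 1:3   # candidate numbers of components

# Requires the sprm package; nothing is fitted if it is unavailable.
if (requireNamespace("sprm", quietly = TRUE)) {
  # One separate fit per value of a; a model with fewer components is not
  # obtained by dropping columns from a larger fit.
  fits <- lapply(a_values, function(a) {
    sprm::sprms(y ~ ., data = d, a = a, eta = 0.5)
  })
  # Number of variables retained by each model
  print(sapply(fits, function(m) length(m$used.vars)))
}
```

For a data-driven choice of a and eta, the cross-validation function sprmsCV listed under See Also is the natural next step.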

References

Hoffmann, I., Serneels, S., Filzmoser, P., Croux, C. (2015). Sparse partial robust M regression. Chemometrics and Intelligent Laboratory Systems, 149, 50-59.

Serneels, S., Croux, C., Filzmoser, P., Van Espen, P.J. (2005). Partial Robust M-Regression. Chemometrics and Intelligent Laboratory Systems, 79, 55-64.

See Also

sprmsCV, plot.sprm, biplot.sprm, predict.sprm, prms

Examples

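library(sprm)

# Simulate 50 observations with 5 informative and 20 uninformative predictors
set.seed(50235)
U1 <- c(rep(3, 20), rep(4, 30))
U2 <- rep(3.5, 50)
X1 <- replicate(5, U1 + rnorm(50))
X2 <- replicate(20, U2 + rnorm(50))
X <- cbind(X1, X2)
beta <- c(rep(1, 5), rep(0, 20))
# The last 5 errors are shifted by -20, which creates vertical outliers
e <- c(rnorm(45, 0, 1.5), rnorm(5, -20, 1))
y <- drop(X %*% beta + e)  # drop() turns the 50 x 1 matrix into a vector
d <- as.data.frame(X)
d$y <- y

# Fit a one-component sparse robust model and compute the fitted responses
mod <- sprms(y ~ ., data = d, a = 1, eta = 0.5, fun = "Hampel")
sprmfit <- predict(mod)

# Observed versus fitted response, with the identity line for reference
plot(y, sprmfit, main = "SPRM")
abline(0, 1)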
