MASS (version 7.3-22)

rlm: Robust Fitting of Linear Models

Description

Fit a linear model by robust regression using an M estimator.

Usage

rlm(x, ...)

## S3 method for class 'formula': rlm(formula, data, weights, \dots, subset, na.action, method = c("M", "MM", "model.frame"), wt.method = c("inv.var", "case"), model = TRUE, x.ret = TRUE, y.ret = FALSE, contrasts = NULL)

## S3 method for class 'default': rlm(x, y, weights, \dots, w = rep(1, nrow(x)), init = "ls", psi = psi.huber, scale.est = c("MAD", "Huber", "proposal 2"), k2 = 1.345, method = c("M", "MM"), wt.method = c("inv.var", "case"), maxit = 20, acc = 1e-4, test.vec = "resid", lqs.control = NULL)

psi.huber(u, k = 1.345, deriv = 0) psi.hampel(u, a = 2, b = 4, c = 8, deriv = 0) psi.bisquare(u, c = 4.685, deriv = 0)

Arguments

formula
a formula of the form y ~ x1 + x2 + ....
data
data frame from which variables specified in formula are preferentially to be taken.
weights
a vector of prior weights for each case.
subset
An index vector specifying the cases to be used in fitting.
na.action
A function to specify the action to be taken if NAs are found. The factory-fresh default action in Ris na.omit, and can be changed by
x
a matrix or data frame containing the explanatory variables.
y
the response: a vector of length the number of rows of x.
method
currently either M-estimation or MM-estimation or (for the formula method only) find the model frame. MM-estimation is M-estimation with Tukey's biweight initialized by a specific S-estimator. See the Details se
wt.method
are the weights case weights (giving the relative importance of case, so a weight of 2 means there are two of these) or the inverse of the variances, so a weight of two means this error is half as variable?
model
should the model frame be returned in the object?
x.ret
should the model matrix be returned in the object?
y.ret
should the response be returned in the object?
contrasts
optional contrast specifications: se lm.
w
(optional) initial down-weighting for each case.
init
(optional) initial values for the coefficients OR a method to find initial values OR the result of a fit with a coef component. Known methods are "ls" (the default) for an initial least-squares fit using weights
psi
the psi function is specified by this argument. It must give (possibly by name) a function g(x, ..., deriv) that for deriv=0 returns psi(x)/x and for deriv=1 returns psi'(x). Tuning constants will be pa
scale.est
method of scale estimation: re-scaled MAD of the residuals (default) or Huber's proposal 2 (which can be selected by either "Huber" or "proposal 2").
k2
tuning constant used for Huber proposal 2 scale estimation.
maxit
the limit on the number of IWLS iterations.
acc
the accuracy for the stopping criterion.
test.vec
the stopping criterion is based on changes in this vector.
...
additional arguments to be passed to rlm.default or to the psi function.
lqs.control
An optional list of control values for lqs.
u
numeric vector of evaluation points.
k, a, b, c
tuning constants.
deriv
0 or 1: compute values of the psi function or of its first derivative.

Value

  • An object of class "rlm" inheriting from "lm". Note that the df.residual component is deliberately set to NA to avoid inappropriate estimation of the residual scale from the residual mean square by "lm" methods. The additional components not in an lm object are
  • sthe robust scale estimate used
  • wthe weights used in the IWLS process
  • psithe psi function with parameters substituted
  • convthe convergence criteria at each iteration
  • convergeddid the IWLS converge?
  • wresida working residual, weighted for "inv.var" weights only.

Details

Fitting is done by iterated re-weighted least squares (IWLS).

Psi functions are supplied for the Huber, Hampel and Tukey bisquare proposals as psi.huber, psi.hampel and psi.bisquare. Huber's corresponds to a convex optimization problem and gives a unique solution (up to collinearity). The other two will have multiple local minima, and a good starting point is desirable.

Selecting method = "MM" selects a specific set of options which ensures that the estimator has a high breakdown point. The initial set of coefficients and the final scale are selected by an S-estimator with k0 = 1.548; this gives (for $n \gg p$) breakdown point 0.5. The final estimator is an M-estimator with Tukey's biweight and fixed scale that will inherit this breakdown point provided c > k0; this is true for the default value of c that corresponds to 95% relative efficiency at the normal. Case weights are not supported for method = "MM".

References

P. J. Huber (1981) Robust Statistics. Wiley.

F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw and W. A. Stahel (1986) Robust Statistics: The Approach based on Influence Functions. Wiley.

A. Marazzi (1993) Algorithms, Routines and S Functions for Robust Statistics. Wadsworth & Brooks/Cole.

Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.

See Also

lm, lqs.

Examples

Run this code
summary(rlm(stack.loss ~ ., stackloss))
rlm(stack.loss ~ ., stackloss, psi = psi.hampel, init = "lts")
rlm(stack.loss ~ ., stackloss, psi = psi.bisquare)

Run the code above in your browser using DataCamp Workspace