rlm
Fit a linear model by robust regression using an M estimator.
Usage
rlm(x, …)# S3 method for formula
rlm(formula, data, weights, …, subset, na.action,
method = c("M", "MM", "model.frame"),
wt.method = c("inv.var", "case"),
model = TRUE, x.ret = TRUE, y.ret = FALSE, contrasts = NULL)
# S3 method for default
rlm(x, y, weights, …, w = rep(1, nrow(x)),
init = "ls", psi = psi.huber,
scale.est = c("MAD", "Huber", "proposal 2"), k2 = 1.345,
method = c("M", "MM"), wt.method = c("inv.var", "case"),
maxit = 20, acc = 1e4, test.vec = "resid", lqs.control = NULL)
psi.huber(u, k = 1.345, deriv = 0)
psi.hampel(u, a = 2, b = 4, c = 8, deriv = 0)
psi.bisquare(u, c = 4.685, deriv = 0)
Arguments
 formula

a formula of the form
y ~ x1 + x2 + …
.  data

data frame from which variables specified in
formula
are preferentially to be taken.  weights
 a vector of prior weights for each case.
 subset
 An index vector specifying the cases to be used in fitting.
 na.action

A function to specify the action to be taken if
NA
s are found. The ‘factoryfresh’ default action in R isna.omit
, and can be changed byoptions(na.action=)
.  x
 a matrix or data frame containing the explanatory variables.
 y

the response: a vector of length the number of rows of
x
.  method

currently either Mestimation or MMestimation or (for the
formula
method only) find the model frame. MMestimation is Mestimation with Tukey's biweight initialized by a specific Sestimator. See the ‘Details’ section.  wt.method
 are the weights case weights (giving the relative importance of case, so a weight of 2 means there are two of these) or the inverse of the variances, so a weight of two means this error is half as variable?
 model
 should the model frame be returned in the object?
 x.ret
 should the model matrix be returned in the object?
 y.ret
 should the response be returned in the object?
 contrasts

optional contrast specifications: see
lm
.  w
 (optional) initial downweighting for each case.
 init

(optional) initial values for the coefficients OR a method to find
initial values OR the result of a fit with a
coef
component. Known methods are"ls"
(the default) for an initial leastsquares fit using weightsw*weights
, and"lts"
for an unweighted leasttrimmed squares fit with 200 samples.  psi

the psi function is specified by this argument. It must give
(possibly by name) a function
g(x, …, deriv)
that forderiv=0
returns psi(x)/x and forderiv=1
returns psi'(x). Tuning constants will be passed in via…
.  scale.est

method of scale estimation: rescaled MAD of the residuals (default)
or Huber's proposal 2 (which can be selected by either
"Huber"
or"proposal 2"
).  k2
 tuning constant used for Huber proposal 2 scale estimation.
 maxit
 the limit on the number of IWLS iterations.
 acc
 the accuracy for the stopping criterion.
 test.vec
 the stopping criterion is based on changes in this vector.
 …

additional arguments to be passed to
rlm.default
or to thepsi
function.  lqs.control

An optional list of control values for
lqs
.  u
 numeric vector of evaluation points.
 k, a, b, c
 tuning constants.
 deriv

0
or1
: compute values of the psi function or of its first derivative.
Details
Fitting is done by iterated reweighted least squares (IWLS). Psi functions are supplied for the Huber, Hampel and Tukey bisquare
proposals as psi.huber
, psi.hampel
and
psi.bisquare
. Huber's corresponds to a convex optimization
problem and gives a unique solution (up to collinearity). The other
two will have multiple local minima, and a good starting point is
desirable. Selecting method = "MM"
selects a specific set of options which
ensures that the estimator has a high breakdown point. The initial set
of coefficients and the final scale are selected by an Sestimator
with k0 = 1.548
; this gives (for \(n \gg p\))
breakdown point 0.5.
The final estimator is an Mestimator with Tukey's biweight and fixed
scale that will inherit this breakdown point provided c > k0
;
this is true for the default value of c
that corresponds to
95% relative efficiency at the normal. Case weights are not
supported for method = "MM"
.
Value
An object of class "rlm"
inheriting from "lm"
.
Note that the df.residual
component is deliberately set to
NA
to avoid inappropriate estimation of the residual scale from
the residual mean square by "lm"
methods. The additional components not in an lm
object are
"inv.var"
weights only.
References
P. J. Huber (1981) Robust Statistics. Wiley. F. R. Hampel, E. M. Ronchetti, P. J. Rousseeuw and W. A. Stahel (1986) Robust Statistics: The Approach based on Influence Functions. Wiley. A. Marazzi (1993) Algorithms, Routines and S Functions for Robust Statistics. Wadsworth & Brooks/Cole. Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
See Also
Examples
library(MASS)
summary(rlm(stack.loss ~ ., stackloss))
rlm(stack.loss ~ ., stackloss, psi = psi.hampel, init = "lts")
rlm(stack.loss ~ ., stackloss, psi = psi.bisquare)