robustbase (version 0.9-8)

lmrob.control: Tuning Parameters for lmrob() and Auxiliaries

Description

Tuning parameters for lmrob, the MM-type regression estimator and the associated S-, M- and D-estimators. Using setting="KS2011" sets the defaults as suggested by Koller and Stahel (2011).

Usage

lmrob.control(setting, seed = NULL, nResample = 500,
              tuning.chi = NULL, bb = 0.5, tuning.psi = NULL,
              max.it = 50, groups = 5, n.group = 400,
              k.fast.s = 1, best.r.s = 2,
              k.max = 200, maxit.scale = 200, k.m_s = 20,
              refine.tol = 1e-7, rel.tol = 1e-7, solve.tol = 1e-7,
              trace.lev = 0,
              mts = 1000, subsampling = c("nonsingular", "simple"),
              compute.rd = FALSE, method = 'MM',
              psi = c('bisquare', 'lqq', 'welsh', 'optimal', 'hampel', 'ggw'),
              numpoints = 10, cov = NULL,
              split.type = c("f", "fi", "fii"), fast.s.large.n = 2000, ...)

Arguments

setting
a string specifying alternative default values. Leave empty for the defaults or use "KS2011" for the defaults suggested by Koller and Stahel (2011). See Details.
seed
NULL or an integer vector compatible with .Random.seed: the seed to be used for the random re-sampling that generates candidates for the initial S-estimator. The current value of .Random.seed is preserved when seed is specified (i.e., non-NULL); otherwise the random number generator state is used and advanced as usual.
nResample
number of re-sampling candidates to be used to find the initial S-estimator. Currently defaults to 500 which works well in most situations (see references).
tuning.chi
tuning constant vector for the S-estimator. If NULL, as by default, sensible defaults are set (depending on psi) to yield a 50% breakdown estimator. See Details.
bb
expected value, under the normal model, of the chi function (more precisely, the $\rho$ function) with tuning constant equal to tuning.chi. This is used to compute the S-estimator.
tuning.psi
tuning constant vector for the redescending M-estimator. If NULL, as by default, this is set (depending on psi) to yield an estimator with asymptotic efficiency of 95% for normal errors. See Details.
max.it
integer specifying the maximum number of IRWLS iterations.
groups
(for the fast-S algorithm): Number of random subsets to use when the data set is large.
n.group
(for the fast-S algorithm): Size of each of the groups above. Note that this must be at least $p$.
k.fast.s
(for the fast-S algorithm): Number of local improvement steps (I-steps) for each re-sampling candidate.
k.m_s
(for the M-S algorithm): specifies after how many unsuccessful refinement steps the algorithm stops.
best.r.s
(for the fast-S algorithm): Number of best candidates to be iterated further (i.e., refined); denoted $t$ in Salibian-Barrera & Yohai (2006).
k.max
(for the fast-S algorithm): maximal number of refinement steps for the fully iterated best candidates.
maxit.scale
integer specifying the maximum number of C level find_scale() iterations.
refine.tol
(for the fast-S algorithm): relative convergence tolerance for the fully iterated best candidates.
rel.tol
(for the RWLS iterations of the MM algorithm): relative convergence tolerance for the parameter vector.
solve.tol
(for the S algorithm): relative tolerance for inversion. Hence, this corresponds to solve.default()'s tol.
trace.lev
integer indicating if the progress of the MM-algorithm should be traced (increasingly); default trace.lev = 0 does no tracing.
mts
maximum number of samples to try in subsampling algorithm.
subsampling
type of subsampling to be used, simple for simple subsampling (default prior to version 0.9), nonsingular for nonsingular subsampling. See lmrob.S.
compute.rd
logical indicating if robust distances (based on the MCD robust covariance estimator covMcd) are to be computed for the robust diagnostic plots. This may take some time to finish, particularly for large data sets.
method
string specifying the estimator-chain. MM is interpreted as SM. See Details of lmrob for a description of the possible values.
psi
string specifying the type of $\psi$-function to be used. See Details of lmrob. Defaults to "bisquare" for S and MM-estimates, otherwise "lqq".
numpoints
number of points used in Gauss quadrature.
cov
function or string with function name to be used to calculate the covariance matrix estimate. The default is if(method %in% c('SM', 'MM')) ".vcov.avar1" else ".vcov.w". See Details of lmrob.
split.type
determines how categorical and continuous variables are split. See splitFrame.
fast.s.large.n
minimum number of observations required to switch from ordinary fast S algorithm to an efficient large n strategy.
...
further arguments to be added as list components to the result.

Value

A named list with over twenty components corresponding to the arguments; tuning.psi and tuning.chi are typically computed, see the Details above.
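
A minimal sketch of inspecting the result (the exact set of components may vary between robustbase versions):

ctrl <- lmrob.control(psi = "hampel", nResample = 1000)
names(ctrl)       ## component names mirror the argument names
ctrl$tuning.psi   ## computed from 'psi', since tuning.psi was left at NULL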

Details

The option setting="KS2011" alters the default arguments. They are changed to method = 'SMDM', psi = 'lqq', max.it = 500, k.max = 2000, cov = '.vcov.w'. The defaults of all the remaining arguments are not changed.
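
As a quick check (an illustrative sketch, not part of the original help text), the altered defaults can be compared directly:

c0 <- lmrob.control()                   ## historical defaults
cK <- lmrob.control(setting = "KS2011")
c0$method ; cK$method                   ## "MM" vs "SMDM"
c0$psi    ; cK$psi                      ## "bisquare" vs "lqq"
cK$max.it ; cK$k.max ; cK$cov           ## 500, 2000, ".vcov.w"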

By default, tuning.chi and tuning.psi are set to yield an MM-estimate with breakdown point $0.5$ and efficiency of 95% at the normal. They are:

  psi        tuning.chi                  tuning.psi
  bisquare   1.54764                     4.685061
  welsh      0.5773502                   2.11
  ggw        c(-0.5, 1.5, NA, 0.5)       c(-0.5, 1.5, 0.95, NA)
  lqq        c(-0.5, 1.5, NA, 0.5)       c(-0.5, 1.5, 0.95, NA)
  optimal    0.4047                      1.060158
  hampel     c(1.5, 3.5, 8)*0.2119163    c(1.5, 3.5, 8)*0.9014

The tuning-constant values for the ggw psi function are hard-coded. Its constants vector has four elements: minimal slope, b (controlling the bend at the maximum of the curve), efficiency, and breakdown point. Use NA for an unspecified value, as in the ggw and lqq rows of the tables.
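
These defaults are filled in whenever tuning.chi and tuning.psi are left at NULL, for instance (a small sketch; the printed representation may differ slightly between robustbase versions):

lmrob.control(psi = "bisquare")$tuning.psi   ## 4.685061
lmrob.control(psi = "hampel")$tuning.chi     ## c(1.5, 3.5, 8) * 0.2119163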

The constants for the hampel psi function are chosen to have a redescending slope of $-1/3$. Constants for a slope of $-1/2$ would be:

  psi      tuning.chi               tuning.psi
  hampel   c(2, 4, 8) * 0.1981319   c(2, 4, 8) * 0.690794
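
These alternative constants can be passed explicitly, as in this illustrative sketch (values taken from the table above):

ctrl.h <- lmrob.control(psi = "hampel",
                        tuning.chi = c(2, 4, 8) * 0.1981319,
                        tuning.psi = c(2, 4, 8) * 0.690794)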

Alternative coefficients for an efficiency of 85% at the normal are given in the table below.

  psi             tuning.psi
  bisquare        3.443689
  welsh           1.456
  ggw, lqq        c(-0.5, 1.5, 0.85, NA)
  optimal         0.8684
  hampel (-1/3)   c(1.5, 3.5, 8) * 0.5704545
  hampel (-1/2)   c(2, 4, 8) * 0.4769578
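
For instance, an estimator with 85% efficiency could be requested by overriding tuning.psi with a value from this table (a sketch; for ggw and lqq the efficiency is specified directly in the third slot of the constants vector):

ctrl85.bi  <- lmrob.control(psi = "bisquare", tuning.psi = 3.443689)
ctrl85.lqq <- lmrob.control(psi = "lqq", tuning.psi = c(-0.5, 1.5, 0.85, NA))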

References

Koller, M. and Stahel, W.A. (2011), Sharpening Wald-type inference in robust regression for small samples, Computational Statistics & Data Analysis 55(8), 2504--2515.

See Also

lmrob, also for references and examples.

Examples

## Show the default settings:
str(lmrob.control())

## Artificial data for a  simple  "robust t test":
set.seed(17)
y <- y0 <- rnorm(200)
y[sample(200,20)] <- 100*rnorm(20)
gr <- as.factor(rbinom(200, 1, prob = 1/8))
lmrob(y0 ~ 0+gr)

## Use  Koller & Stahel(2011)'s recommendation but a larger  'max.it':
str(ctrl <- lmrob.control("KS2011", max.it = 1000))
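
## Illustrative extension (not part of the original example): pass the control
## object to lmrob() via its 'control' argument, here fitting the contaminated
## response 'y':
lmrob(y ~ 0 + gr, control = ctrl)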
