lmrobdet.control: Tuning parameters for lmrobdetMM and lmrobdetDCML

Description

This function sets tuning parameters for the MM estimator implemented in lmrobdetMM and the Distance Constrained Maximum Likelihood regression estimators computed by lmrobdetDCML.

Usage

lmrobdet.control(
  bb = 0.5,
  efficiency = 0.95,
  family = "mopt",
  tuning.psi,
  tuning.chi,
  compute.rd = FALSE,
  corr.b = TRUE,
  split.type = "f",
  initial = "S",
  max.it = 100,
  refine.tol = 1e-07,
  rel.tol = 1e-07,
  refine.S.py = 1e-07,
  refine.PY = 10,
  solve.tol = 1e-07,
  trace.lev = 0,
  psc_keep = 0.5,
  resid_keep_method = "threshold",
  resid_keep_thresh = 2,
  resid_keep_prop = 0.2,
  py_maxit = 20,
  py_eps = 1e-05,
  mscale_maxit = 50,
  mscale_tol = 1e-06,
  mscale_rho_fun = "bisquare"
)

Value

A list with the necessary tuning parameters.

Arguments

bb: tuning constant (between 0 and 1/2) for the M-scale used to compute the initial S-estimator. It determines the robustness of the resulting MM-estimator, whose breakdown point is bb. Defaults to 0.5.
efficiency: desired asymptotic efficiency of the final regression M-estimator. Defaults to 0.95.
family: string specifying the name of the family of loss functions to be used (current valid options are "bisquare", "opt", "mopt", "optV0" and "moptV0"). Incomplete entries will be matched to the current valid options. Defaults to "mopt".
tuning.psi: tuning parameters for the regression M-estimator computed with a rho function as specified with argument family. If missing, it is computed inside lmrobdet.control to match the value of efficiency according to the family of rho functions specified in family. Appropriate values for tuning.psi for a given desired efficiency for Gaussian errors can be constructed using the functions bisquare, mopt and opt.
tuning.chi: tuning constant for the function used to compute the M-scale used for the initial S-estimator. If missing, it is computed inside lmrobdet.control to match the value of bb according to the family of rho functions specified in family.
compute.rd: logical value indicating whether robust leverage distances need to be computed.
corr.b: logical value indicating whether a finite-sample correction should be applied to the M-scale parameter bb.
split.type: determines how categorical and continuous variables are split. See splitFrame.
initial: string specifying the initial value for the M-step of the MM-estimator. Valid options are 'S', for an S-estimator and 'MS' for an M-S estimator which is appropriate when there are categorical explanatory variables in the model.
max.it: maximum number of IRWLS iterations for the MM-estimator
refine.tol: relative convergence tolerance for the S-estimator
rel.tol: relative convergence tolerance for the IRWLS iterations for the MM-estimator
refine.S.py: relative convergence tolerance for the local improvements of the Peña-Yohai candidates for the S-estimator
refine.PY: number of refinement steps for the Peña-Yohai candidates
solve.tol: relative tolerance for matrix inversion in the S algorithm. It corresponds to the tol argument of solve.default.
trace.lev: positive values (increasingly) provide details on the progress of the MM-algorithm
psc_keep: For pyinit, proportion of observations to remove based on PSCs. The effective proportion of removed observations is adjusted according to the sample size to be psc_keep*(1-p/n). See pyinit.
resid_keep_method: For pyinit, how to clean the data based on large residuals. If "threshold", all observations with scaled residuals larger than resid_keep_thresh will be removed; if "proportion", the proportion resid_keep_prop of observations with the largest residuals will be removed. See pyinit.
resid_keep_thresh: See parameter resid_keep_method above. See pyinit.
resid_keep_prop: See parameter resid_keep_method above. See pyinit.
py_maxit: Maximum number of iterations. See pyinit.
py_eps: Relative tolerance for convergence. See pyinit.
mscale_maxit: Maximum number of iterations for the M-scale algorithm. See pyinit and scaleM.
mscale_tol: Convergence tolerance for the M-scale algorithm. See scaleM.
mscale_rho_fun: String indicating the loss function used for the M-scale. See pyinit.
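
As a minimal sketch of how efficiency, family and tuning.psi interact, the following builds two control objects for the bisquare family: one lets lmrobdet.control derive tuning.psi from efficiency, and the other passes the constant explicitly via bisquare. It assumes the RobStatTM package is installed; the object names ctrl_auto and ctrl_manual are illustrative only.

```r
library(RobStatTM)

# Let lmrobdet.control() derive tuning.psi from the requested efficiency
ctrl_auto <- lmrobdet.control(family = "bisquare", efficiency = 0.85)

# Supply the tuning constant explicitly, computed with bisquare()
ctrl_manual <- lmrobdet.control(family = "bisquare",
                                tuning.psi = bisquare(0.85))

# Both objects are plain lists; inspect the constants they carry
ctrl_auto$tuning.psi
ctrl_manual$tuning.psi
ctrl_auto$tuning.chi
```

Either object can then be passed as the control argument of lmrobdetMM or lmrobdetDCML.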

Choice of Rho Loss Function

As of RobStatTM Version 1.0.7, the opt and mopt rho functions are calculated using polynomials, rather than using the standard normal error function (erf) as in earlier versions of RobStatTM. The numerical results obtained with the opt or mopt choices will therefore differ by small amounts from those of earlier versions. Users who wish to replicate results from releases prior to 1.0.7 may do so with family = "optV0" or family = "moptV0". Note that the derivative of the rho loss function, known as the "psi" function, is not the derivative of the rho polynomial; instead, it is still the analytic optimal psi function whose formula is given in the second of the Vignettes referenced just below.
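
To see how large the difference between the polynomial and erf-based versions is on a given data set, one can fit the same model with both families and compare the estimates. The sketch below assumes RobStatTM 1.0.7 or later and the robustbase package (for the coleman data); fit_new and fit_old are illustrative names.

```r
library(RobStatTM)
data(coleman, package = "robustbase")

# Current polynomial-based mopt rho (the default family)
fit_new <- lmrobdetMM(Y ~ ., data = coleman,
                      control = lmrobdet.control(family = "mopt"))

# Version of the same family used in releases prior to 1.0.7
fit_old <- lmrobdetMM(Y ~ ., data = coleman,
                      control = lmrobdet.control(family = "moptV0"))

# Side-by-side coefficient estimates and their differences
cbind(mopt   = fit_new$coefficients,
      moptV0 = fit_old$coefficients,
      diff   = fit_new$coefficients - fit_old$coefficients)
```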

Related Vignettes

For further details, see the Vignettes "Polynomial Opt and mOpt Rho Functions", and "Optimal Bias Robust Regression Psi and Rho".

Author

Matias Salibian-Barrera, matias@stat.ubc.ca

Details

The argument family specifies the name of the family of loss function to be used. Current valid options are "bisquare", "opt", "mopt", "optV0" and "moptV0". "mopt" is a modified version of the optimal psi function to make it strictly increasing close to 0, and to make the corresponding weight function non-increasing.

Examples

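library(RobStatTM)
# Load the Coleman data set shipped with the robustbase package
data(coleman, package='robustbase')
# Fit an MM-estimator using 50 refinement steps for the Peña-Yohai candidates
m2 <- lmrobdetMM(Y ~ ., data=coleman, control=lmrobdet.control(refine.PY=50))
m2
summary(m2)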
