
Computes the orthogonalized pairwise covariance matrix estimate described in Maronna and Zamar (2002). The pairwise proposal goes back to Gnanadesikan and Kettenring (1972).
covOGK(X, n.iter = 2, sigmamu, rcov = covGK, weight.fn = hard.rejection,
       keep.data = FALSE, ...)
covGK(x, y, scalefn = scaleTau2, ...)
s_mad(x, mu.too = FALSE, na.rm = FALSE)
s_IQR(x, mu.too = FALSE, na.rm = FALSE)
X: data in something that can be coerced into a numeric matrix.
n.iter: number of orthogonalization iterations. Usually 1 or 2; values greater than 2 are unlikely to have any significant effect on the estimate (other than increasing the computing time).
sigmamu: a function that computes univariate robust location and scale estimates. By default it should return a single numeric value containing the robust scale (standard deviation) estimate. When mu.too is true, sigmamu() should return a numeric vector of length 2 containing robust location and scale estimates. See scaleTau2, s_Qn, s_Sn, s_mad or s_IQR for examples to be used as sigmamu argument.
rcov: function that computes a robust covariance estimate between two vectors. The default, Gnanadesikan-Kettenring's covGK, is simply $(\hat\sigma(x+y)^2 - \hat\sigma(x-y)^2)/4$, where $\hat\sigma()$ is the scale estimate sigmamu().
weight.fn: a function of the robust distances and the number of variables, used to compute the robustness weights for the re-weighted estimates.
keep.data: logical indicating if the (untransformed) data matrix X should be kept as part of the result.
...: additional arguments; for covOGK to be passed to sigmamu() and weight.fn(); for covGK passed to scalefn.
x, y: numeric vectors of the same length, the covariance of which is sought in covGK (or the scale, in s_mad or s_IQR).
mu.too: logical indicating if both location and scale should be returned or just the scale (when mu.too = FALSE, as by default).
na.rm: if TRUE then NA values are stripped from x before computation takes place.
covOGK() currently returns a list with components
center: robust location: numeric vector of length p, the number of columns of X.
cov: robust covariance matrix estimate: p x p matrix.
wcenter, wcov: re-weighted versions of center and cov.
weights: the robustness weights used.
distances: the Mahalanobis distances computed using center and cov.
covGK() is a trivial 1-line function returning the covariance estimate
$$\hat c(x,y) = \left(\hat\sigma(x+y)^2 - \hat\sigma(x-y)^2\right)/4,$$
where $\hat\sigma(u)$ is the scale estimate of u specified by scalefn.
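The identity behind this estimate can be checked in a few lines of base R. The following is only a sketch: the helper name gk_cov is hypothetical and is not the package's covGK, and stats::mad is swapped in as the scale estimate so that no extra package is needed.

```r
# Hypothetical helper for illustration (not the package's covGK):
# pairwise Gnanadesikan-Kettenring covariance for a given scale function.
gk_cov <- function(x, y, scalefn = stats::mad) {
  (scalefn(x + y)^2 - scalefn(x - y)^2) / 4
}

set.seed(1)
x <- rnorm(200)
y <- 0.5 * x + rnorm(200)

# With the classical standard deviation the identity is exact, because
# var(x + y) - var(x - y) = 4 * cov(x, y).
gk_sd <- gk_cov(x, y, scalefn = stats::sd)
stopifnot(isTRUE(all.equal(gk_sd, stats::cov(x, y))))

# Contaminate one observation and compare a robust scale (MAD)
# with the classical covariance.
x_bad <- x; x_bad[1] <- 1e3
y_bad <- y; y_bad[1] <- -1e3
gk_mad_clean  <- gk_cov(x, y)
gk_mad_bad    <- gk_cov(x_bad, y_bad)
classical_bad <- stats::cov(x_bad, y_bad)
```

Plugging in a robust scale is what makes the pairwise estimate resistant: in this sketch the single gross outlier dominates classical_bad, while gk_mad_bad stays close to gk_mad_clean.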
s_mad() and s_IQR() return the scale estimates mad and IQR, respectively. When mu.too = TRUE, the s_* functions return a length-2 vector (mu, sig) instead of the scale alone; see also scaleTau2.
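The return contract of the s_* functions can be illustrated with a small base R sketch. The function name scale_mad_sketch and its body are assumptions made for illustration; they are not the package source of s_mad.

```r
# Hypothetical re-implementation of the documented contract:
# scale only by default, c(location, scale) when mu.too = TRUE.
scale_mad_sketch <- function(x, mu.too = FALSE, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]        # strip NA values before computing
  mu  <- stats::median(x)             # robust location
  sig <- stats::mad(x, center = mu)   # robust scale around that location
  if (mu.too) c(mu, sig) else sig
}

x <- c(2, 4, 4, 5, 7, 9, NA)
scale_only <- scale_mad_sketch(x, na.rm = TRUE)
loc_scale  <- scale_mad_sketch(x, mu.too = TRUE, na.rm = TRUE)
```

A function following this contract is the kind of object the sigmamu argument expects.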
Typical default values for the function arguments sigmamu, rcov, and weight.fn are available as well (see the Examples below), but their names and calling sequences are still subject to discussion and may be changed in the future.
The current default, weight.fn = hard.rejection, corresponds to the proposal in the literature, but Martin Maechler strongly believes that the hard threshold currently in use is too arbitrary, and further that soft thresholding should be used instead.
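To make the distinction between hard and soft thresholding concrete, here is a sketch of two weight functions with the (distances, p) calling convention described above. Both function names, the beta parameter and the cutoff rule are assumptions chosen for illustration; this is not claimed to reproduce what hard.rejection computes.

```r
# Hypothetical cutoff: a chi-squared quantile rescaled by the median
# distance. The rule and the beta parameter are illustrative assumptions.
cutoff_sketch <- function(distances, p, beta = 0.9) {
  stats::median(distances) * stats::qchisq(beta, p) / stats::qchisq(0.5, p)
}

# Hard thresholding: weight 1 up to the cutoff, weight 0 beyond it.
hard_weights <- function(distances, p, beta = 0.9) {
  as.numeric(distances <= cutoff_sketch(distances, p, beta))
}

# Soft thresholding: weight 1 up to the cutoff, then a smooth decay
# proportional to cutoff / distance.
soft_weights <- function(distances, p, beta = 0.9) {
  pmin(1, cutoff_sketch(distances, p, beta) / distances)
}

d <- c(1, 2, 3, 100)   # squared robust distances, one clear outlier
w_hard <- hard_weights(d, p = 2)
w_soft <- soft_weights(d, p = 2)
```

With hard weights an observation just beyond the cutoff is discarded entirely; with soft weights it is merely downweighted, so the estimate changes continuously as a distance crosses the threshold.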
Maronna, R.A. and Zamar, R.H. (2002) Robust estimates of location and dispersion of high-dimensional datasets. Technometrics 44(4), 307-317.
Gnanadesikan, R. and Kettenring, J.R. (1972) Robust estimates, residuals, and outlier detection with multiresponse data. Biometrics 28, 81-124.
# NOT RUN {
library(robustbase)  # provides covOGK(), the scale estimators and the example data
data(hbk)
hbk.x <- data.matrix(hbk[, 1:3])
cO1 <- covOGK(hbk.x, sigmamu = scaleTau2)
cO2 <- covOGK(hbk.x, sigmamu = s_Qn)
cO3 <- covOGK(hbk.x, sigmamu = s_Sn)
cO4 <- covOGK(hbk.x, sigmamu = s_mad)
cO5 <- covOGK(hbk.x, sigmamu = s_IQR)
# }
# NOT RUN {
data(toxicity)
cO1tox <- covOGK(toxicity, sigmamu = scaleTau2)
cO2tox <- covOGK(toxicity, sigmamu = s_Qn)
## nice formatting of correlation matrices:
as.dist(round(cov2cor(cO1tox$cov), 2))
as.dist(round(cov2cor(cO2tox$cov), 2))
## "graphical"
symnum(cov2cor(cO1tox$cov))
symnum(cov2cor(cO2tox$cov), legend=FALSE)
# }