An artificial data set generated by Hawkins, Bradu, and Kass (1984). It consists of 75 observations in four dimensions (one response and three explanatory variables) and provides a good example of the masking effect. The first 14 observations are outliers, created in two groups: 1 to 10 and 11 to 14. With classical methods, only observations 12, 13 and 14 appear as outliers; the remaining outliers are masked, but can easily be unmasked using robust distances computed by, e.g., the MCD estimator via covMcd().
data(hbk, package="robustbase")
A data frame with 75 observations on 4 variables, where the last variable is the dependent one.
X1: x[,1]
X2: x[,2]
X3: x[,3]
Y: y
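library(robustbase)  # provides the hbk data and covMcd()
data(hbk)
plot(hbk)  # scatterplot matrix of all four variables
# Classical least-squares fit of the response on the three explanatory variables
summary(lm.hbk <- lm(Y ~ ., data = hbk))
# Robust location and scatter (MCD) of the explanatory variables only
hbk.x <- data.matrix(hbk[, 1:3])
(cHBK <- covMcd(hbk.x))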
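The masking effect described above can be checked by comparing classical and MCD-based distances on the explanatory variables. The following is a minimal sketch, not part of the original examples: the 0.975 chi-squared quantile used as a cutoff is a common convention assumed here, and the names d2.classical, d2.robust and cutoff are illustrative.

```r
library(robustbase)
data(hbk, package = "robustbase")
hbk.x <- data.matrix(hbk[, 1:3])

# Classical squared Mahalanobis distances (sample mean and covariance)
d2.classical <- mahalanobis(hbk.x, colMeans(hbk.x), cov(hbk.x))

# Robust squared distances based on the MCD location and scatter estimates
cHBK <- covMcd(hbk.x)
d2.robust <- mahalanobis(hbk.x, cHBK$center, cHBK$cov)

# Assumed cutoff: 97.5% quantile of a chi-squared with p = 3 degrees of freedom
cutoff <- qchisq(0.975, df = ncol(hbk.x))

which(d2.classical > cutoff)  # observations flagged by the classical distances
which(d2.robust > cutoff)     # observations flagged by the robust distances
```

Comparing the two index sets shows which of the first 14 observations are missed by the classical distances but flagged by the robust ones.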