A typical medium-sized environmental data set with hourly measurements
of NOx pollution content in the ambient air.
data(NOxEmissions, package="robustbase")
A data frame with 8088 observations on the following 4 variables.
julday
day number, a factor with levels 373 ... 730, typically with 24 hourly measurements.
LNOx
LNOxEm
sqrtWS
Square root of wind speed [m/s].
The original data set had more observations, but with missing values.
Here, all cases with missing values were omitted (na.omit(.)), and then
only those were retained that belonged to days with at least 20 (fully)
observed hourly measurements.
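The two filtering steps described above can be sketched on simulated data. This is only an illustration of the idea, not the code used to build NOxEmissions: the data frame raw, its values, and the helper names n.per.day and keep.days are hypothetical.

```r
## Hypothetical raw data: 3 days x 24 hours, with some values knocked out.
set.seed(1)
raw <- data.frame(
  julday = factor(rep(373:375, each = 24)),
  LNOx   = rnorm(72),
  LNOxEm = rnorm(72),
  sqrtWS = sqrt(runif(72, 0, 10))
)
raw$LNOx[1:6]  <- NA   # day 373 loses 6 hours -> 18 complete hours left
raw$sqrtWS[30] <- NA   # day 374 loses 1 hour  -> 23 complete hours left

## Step 1: drop every row that contains a missing value.
complete <- na.omit(raw)

## Step 2: keep only days with at least 20 complete hourly rows.
n.per.day <- table(complete$julday)
keep.days <- names(n.per.day)[n.per.day >= 20]
cleaned   <- droplevels(complete[complete$julday %in% keep.days, ])

table(cleaned$julday)  # day 373 is gone; days 374 and 375 remain
```

Because the day filter runs after na.omit(), a day is judged by its number of complete rows, which matches the "(fully) observed" wording above.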
See also another NOx dataset, ambientNOxCH.
library(robustbase)  # provides NOxEmissions, ltsReg() and lmrob()
data(NOxEmissions)
plot(LNOx ~ LNOxEm, data = NOxEmissions, cex = 0.25, col = "gray30")
if (FALSE) { ## these take too much time --
  ## p = 340 ==> already Least Squares is not fast
  (lmNOx <- lm(LNOx ~ . , data = NOxEmissions))
  plot(lmNOx) #-> indication of 1 outlier
  M.NOx <- MASS::rlm(LNOx ~ . , data = NOxEmissions)
  ## M-estimation works
  ## whereas MM-estimation fails:
  try(MM.NOx <- MASS::rlm(LNOx ~ . , data = NOxEmissions, method = "MM"))
  ## namely because S-estimation fails:
  try(lts.NOx <- ltsReg(LNOx ~ . , data = NOxEmissions))
  try(lmR.NOx <- lmrob (LNOx ~ . , data = NOxEmissions))
}
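The large p mentioned in the comments presumably arises because the formula LNOx ~ . includes the factor julday, which expands into one dummy column per level beyond the first. A minimal sketch on simulated data (the data frame toy and its values are hypothetical, not taken from NOxEmissions) shows how the number of model-matrix columns grows with the number of days:

```r
## Toy stand-in for NOxEmissions: 5 "days" with 24 hourly rows each (simulated).
set.seed(42)
toy <- data.frame(
  julday = factor(rep(1:5, each = 24)),
  LNOx   = rnorm(120),
  LNOxEm = rnorm(120),
  sqrtWS = runif(120)
)

## 'LNOx ~ .' uses all other columns; the factor expands to (levels - 1) dummies.
X <- model.matrix(LNOx ~ ., data = toy)
ncol(X)                      # intercept + 4 day dummies + 2 numeric columns
nlevels(toy$julday) - 1 + 3  # the same count, computed from the factor
```

With several hundred day levels instead of five, the same expansion yields a design matrix with several hundred columns, which is why even least squares is slow here.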