fortify.seqModel: Convert a sequence of regression models into a data frame for plotting

Description

Supplement the fitted values and residuals of a sequence of regression models (such as robust least angle regression models or sparse least trimmed squares regression models) with other useful information for diagnostic plots.

Usage

# S3 method for seqModel
fortify(model, data, s = NA, covArgs = list(...),
  ...)
# S3 method for sparseLTS
fortify(model, data, s = NA, fit = c("reweighted",
  "raw", "both"), covArgs = list(...), ...)

Arguments

model

the model fit to be converted.

data

currently ignored.

s

for the "seqModel" method, an integer vector giving the steps of the submodels to be converted (the default is to use the optimal submodel). For the "sparseLTS" method, an integer vector giving the indices of the models to be converted (the default is to use the optimal model for each of the requested fits).

covArgs

a list of arguments to be passed to covMcd for computing robust Mahalanobis distances.

...

additional arguments to be passed to covMcd can be specified directly instead of via covArgs.

fit

a character string specifying which fit to convert. Possible values are "reweighted" (the default) to convert the reweighted fit, "raw" to convert the raw fit, or "both" to convert both fits.
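
The covArgs and ... arguments are both forwarded to covMcd. The following is a minimal sketch of how such an argument list could be passed on and turned into robust Mahalanobis distances. It assumes the robustbase package is installed; the simulated data, the value alpha = 0.75, and the use of do.call are illustrative choices, not necessarily what the fortify methods do internally.

```r
library("robustbase")  # provides covMcd()

set.seed(1234)
n <- 100  # number of observations
p <- 5    # number of variables
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
x[1:10, ] <- x[1:10, ] + 5  # shift some rows to create leverage points

# hypothetical argument list, standing in for what 'covArgs' might hold
covArgs <- list(alpha = 0.75)

# call covMcd() with the data plus the extra arguments
mcd <- do.call(covMcd, c(list(x), covArgs))

# robust Mahalanobis distances based on the MCD center and scatter
# (taking the square root of the squared distances is an assumption here)
rd <- sqrt(mahalanobis(x, center = mcd$center, cov = mcd$cov))
head(rd)
```

Keeping the extra arguments in a list makes it easy to reuse the same covMcd settings across several calls.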

Value

A data frame containing the columns listed below, as well as additional information stored in the attributes "qqLine" (intercepts and slopes of the respective reference lines to be displayed in residual Q-Q plots), "q" (quantiles of the Mahalanobis distribution used as cutoff points for detecting leverage points), and "facets" (default faceting formula for the diagnostic plots).

step: the steps (for the "seqModel" method) or indices (for the "sparseLTS" method) of the models (only returned if more than one model is requested).
fit: the model fits (only returned if both the reweighted and raw fit are requested in the "sparseLTS" method).
index: the indices of the observations.
fitted: the fitted values.
residual: the standardized residuals.
theoretical: the corresponding theoretical quantiles from the standard normal distribution.
qqd: the absolute distances from a reference line through the first and third sample and theoretical quartiles.
rd: the robust Mahalanobis distances computed via the MCD (see covMcd).
xyd: the pairwise maxima of the absolute values of the standardized residuals and the robust Mahalanobis distances, divided by the respective other outlier detection cutoff point.
weight: the weights indicating regression outliers.
leverage: logicals indicating leverage points (i.e., outliers in the predictor space).
classification: a factor with levels "outlier" (regression outliers) and "good" (data points following the model).
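
To make the Q-Q related columns more concrete, here is a base R sketch of how theoretical, the reference line stored in "qqLine", and qqd could be computed for a vector of standardized residuals. The residuals are simulated, and measuring the distance vertically from the line is an assumption; the fortify methods may compute these quantities differently.

```r
set.seed(1234)
res <- rnorm(50)          # stand-in for standardized residuals
res[1:5] <- res[1:5] + 5  # a few outlying residuals

# theoretical standard normal quantiles, matched to each observation
theoretical <- qqnorm(res, plot.it = FALSE)$x

# reference line through the first and third quartiles
qy <- quantile(res, c(0.25, 0.75), names = FALSE)  # sample quartiles
qx <- qnorm(c(0.25, 0.75))                         # theoretical quartiles
slope <- diff(qy) / diff(qx)
intercept <- qy[1] - slope * qx[1]

# absolute (vertical) distance of each point from the reference line
qqd <- abs(res - (intercept + slope * theoretical))
head(data.frame(residual = res, theoretical, qqd))
```

A line through the quartiles is less affected by outlying residuals than a least squares line, which is why it is a common reference in robust diagnostic plots.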

Examples

# NOT RUN {
## generate data
# example is not high-dimensional to keep computation time low
library("robustHD")  # provides rlars(), sparseLTS() and the fortify() methods
library("mvtnorm")   # provides rmvnorm()
set.seed(1234)  # for reproducibility
n <- 100  # number of observations
p <- 25   # number of variables
beta <- rep.int(c(1, 0), c(5, p-5))  # coefficients
sigma <- 0.5      # controls signal-to-noise ratio
epsilon <- 0.1    # contamination level
Sigma <- 0.5^t(sapply(1:p, function(i, j) abs(i-j), 1:p))
x <- rmvnorm(n, sigma=Sigma)    # predictor matrix
e <- rnorm(n)                   # error terms
i <- 1:ceiling(epsilon*n)       # observations to be contaminated
e[i] <- e[i] + 5                # vertical outliers
y <- c(x %*% beta + sigma * e)  # response
x[i,] <- x[i,] + 5              # bad leverage points


## robust LARS
# fit model
fitRlars <- rlars(x, y, sMax = 10)
# convert to data for plotting
head(fortify(fitRlars))


## sparse LTS
# fit model
fitSparseLTS <- sparseLTS(x, y, lambda = 0.05, mode = "fraction")
# convert to data for plotting
head(fortify(fitSparseLTS))
head(fortify(fitSparseLTS, fit = "both"))
# }
