fortify.seqModel: Convert a sequence of regression models into a data frame for plotting

Description

Supplement the fitted values and residuals of a sequence of regression models (such as robust least angle regression models or sparse least trimmed squares regression models) with other useful information for diagnostic plots.

Usage

## S3 method for class 'seqModel':
fortify(model, data, s = NA, ...)

## S3 method for class 'sparseLTS':
fortify(model, data, s = NA,
    fit = c("reweighted", "raw", "both"), ...)

Arguments

model

the model fit to be converted.

data

currently ignored.

s

for the "seqModel" method, an integer vector giving the steps of the submodels to be converted (the default is to use the optimal submodel). For the "sparseLTS" method, an integer vector giving the indices of the mod

fit

a character string specifying which fit to convert. Possible values are "reweighted" (the default) to convert the reweighted fit, "raw" to convert the raw fit, or "both" to convert both fits.

...

currently ignored.

Value

A data frame containing the columns listed below, as well as additional information stored in the attributes "qqLine" (intercepts and slopes of the respective reference lines to be displayed in residual Q-Q plots), "q" (quantiles of the Mahalanobis distribution used as cutoff points for detecting leverage points), and "facets" (default faceting formula for the diagnostic plots).

step

the steps (for the "seqModel" method) or indices (for the "sparseLTS" method) of the models (only returned if more than one model is requested).

fit

the model fits (only returned if both the reweighted and raw fit are requested in the "sparseLTS" method).

index

the indices of the observations.

fitted

the fitted values.

residual

the standardized residuals.

theoretical

the corresponding theoretical quantiles from the standard normal distribution.

qqd

the absolute distances from a reference line through the first and third sample and theoretical quartiles.

rd

the robust Mahalanobis distances computed via the MCD (see covMcd).

xyd

the pairwise maxima of the absolute values of the standardized residuals and the robust Mahalanobis distances, divided by the respective other outlier detection cutoff point.

weight

the weights indicating regression outliers.

leverage

logicals indicating leverage points (i.e., outliers in the predictor space).

classification

a factor with levels "outlier" (regression outliers) and "good" (data points following the model).
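
To make the "theoretical" and "qqd" columns more concrete, the following base R sketch shows one way such quantities could be computed. It is an illustration only, not the package's internal code: the simulated `residual` vector stands in for standardized residuals, the plotting positions come from `ppoints()`, and the distance is assumed to be measured vertically from the reference line.

```r
## Illustrative sketch only (not the package's internal implementation)
set.seed(1)
residual <- rnorm(50)  # stand-in for standardized residuals (assumption)
n <- length(residual)

# theoretical standard normal quantiles, matched to the rank of each residual
theoretical <- qnorm(ppoints(n))[rank(residual, ties.method = "first")]

# reference line through the first and third sample and theoretical quartiles
y <- quantile(residual, c(0.25, 0.75), names = FALSE)
x <- qnorm(c(0.25, 0.75))
slope <- diff(y) / diff(x)
intercept <- y[1] - slope * x[1]

# absolute (vertical) distances of the residuals from the reference line
qqd <- abs(residual - (intercept + slope * theoretical))

head(data.frame(residual, theoretical, qqd))
```

Observations with large values of `qqd` lie far from the reference line in a residual Q-Q plot, which is why such a column is useful for labeling points in diagnostic plots.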

Examples

## generate data
# example is not high-dimensional to keep computation time low
library("robustHD")  # provides rlars(), sparseLTS() and the fortify methods
library("mvtnorm")   # provides rmvnorm()
set.seed(1234)  # for reproducibility
n <- 100  # number of observations
p <- 25   # number of variables
beta <- rep.int(c(1, 0), c(5, p-5))  # coefficients
sigma <- 0.5      # controls signal-to-noise ratio
epsilon <- 0.1    # contamination level
Sigma <- 0.5^t(sapply(1:p, function(i, j) abs(i-j), 1:p))
x <- rmvnorm(n, sigma=Sigma)    # predictor matrix
e <- rnorm(n)                   # error terms
i <- 1:ceiling(epsilon*n)       # observations to be contaminated
e[i] <- e[i] + 5                # vertical outliers
y <- c(x %*% beta + sigma * e)  # response
x[i,] <- x[i,] + 5              # bad leverage points


## robust LARS
# fit model
fitRlars <- rlars(x, y, sMax = 10)
# convert to data for plotting
head(fortify(fitRlars))


## sparse LTS
# fit model
fitSparseLTS <- sparseLTS(x, y, lambda = 0.05, mode = "fraction")
# convert to data for plotting
head(fortify(fitSparseLTS))
head(fortify(fitSparseLTS, fit = "both"))
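
The attributes described in the Value section can be inspected with `attr()`. The sketch below is self-contained and assumes an installed version of robustHD that still provides these `fortify` methods. The simulated data and the object names `fit` and `df` are made up for illustration.

```r
## Sketch: inspect the attributes of the returned data frame
## (assumes a robustHD version that provides fortify() for "sparseLTS" objects)
library("robustHD")

set.seed(1234)  # for reproducibility
n <- 100        # number of observations
p <- 25         # number of variables
x <- matrix(rnorm(n * p), n, p)                   # illustrative predictors
y <- c(x[, 1:5] %*% rep(1, 5) + 0.5 * rnorm(n))  # illustrative response

# fit model and convert to data for plotting
fit <- sparseLTS(x, y, lambda = 0.05, mode = "fraction")
df <- fortify(fit)

attr(df, "qqLine")  # intercept and slope of the Q-Q reference line
attr(df, "q")       # cutoff point for detecting leverage points
attr(df, "facets")  # default faceting formula

# number of observations flagged as regression outliers
table(df$classification)
```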
