complexlm (version 1.1.2)

cooks.distance.zlm: Cook's Distance for Complex Linear Models

Description

Calculates the Cook's distances (technically a divergence, i.e. a squared distance) of a complex linear model. These measure how much influence each input data point had on the fitted model.

Usage

# S3 method for zlm
cooks.distance(model, lever = zhatvalues(model), ...)

Value

A numeric vector. The elements are the Cook's distances of each data point in model.

Arguments

model

An object of class "lm" or "rlm". Can be complex or numeric.

lever

A list of leverage scores with the same length as model$residuals. By default zhatvalues is called on model.

...

Other parameters. Only used if model is numeric, in which case they are passed to stats::cooks.distance.
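
As a rough illustration of the numeric case, the sketch below uses only base R (no complexlm functions) to show what stats::cooks.distance returns for a real-valued model and how it relates to the hat values. The data and the names d, fit_real, and manual are made up for this example; stats::lm is called explicitly in case lm is masked by another package.

```r
# Real-valued toy data (hypothetical, for illustration only).
set.seed(2)
d <- data.frame(x = rnorm(10))
d$y <- 2 * d$x + 1 + rnorm(10) / 4

fit_real <- stats::lm(y ~ x, data = d)

# Cook's distances as computed by the stats method.
cd <- stats::cooks.distance(fit_real)

# The same quantity from residuals and leverage scores.
h  <- stats::hatvalues(fit_real)               # leverage scores h_ii
r  <- stats::residuals(fit_real)               # residuals r_i
p  <- fit_real$rank                            # rank of the model
s2 <- sum(r^2) / stats::df.residual(fit_real)  # mean squared error
manual <- r^2 / (p * s2) * h / (1 - h)^2

all.equal(cd, manual)
```

The two vectors should agree up to floating-point tolerance.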

Details

Consider a linear model relating a response vector y to a predictor vector x, both of length n. Using the model and the predictor vector we can calculate a vector of predicted values yh. Both y and yh are points in an n-dimensional output space. If we drop the i-th element of x and y and fit another model to the "dropped i" vectors, we get another point in output space, yhi. The squared Euclidean distance between yh and yhi, divided by the rank of the model (p) times its mean squared error s^2, is the i-th Cook's distance.
D_i = (yh - yhi)^t (yh - yhi) / (p s^2)
A more elegant way to calculate it, which this function uses, is with the leverage scores, hii (the diagonal elements of the hat matrix).
D_i = |r_i|^2 / (p s^2) * hii / (1 - hii)^2
Where \(r_i\) is the \(i\)-th residual, and \(^t\) is the conjugate transpose.
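
The equivalence of the drop-one definition and the leverage form |r_i|^2 / (p s^2) * hii / (1 - hii)^2 can be checked numerically. The following is a minimal base-R sketch, not the package's implementation: it does not call any complexlm function, the helper ols is hypothetical, and the mean squared error is assumed to be the sum of squared residual moduli divided by n - p.

```r
# Complex toy data (hypothetical, for illustration only).
set.seed(1)
n <- 8
x <- complex(real = rnorm(n), imaginary = rnorm(n))
e <- complex(real = rnorm(n) / 6, imaginary = rnorm(n) / 6)
y <- complex(real = 4.23, imaginary = 2.323) * x +
  complex(real = 1.4, imaginary = 1.804) + e

X <- cbind(1 + 0i, x)   # design matrix: intercept column and predictor
p <- ncol(X)            # rank of the model

# Hypothetical helper: least squares with the conjugate transpose.
ols <- function(X, y) solve(Conj(t(X)) %*% X, Conj(t(X)) %*% y)

yh <- X %*% ols(X, y)           # predicted values from the full fit
r  <- as.vector(y - yh)         # residuals
s2 <- sum(Mod(r)^2) / (n - p)   # assumed definition of s^2

# Leverage scores: diagonal of the hat matrix X (X^t X)^-1 X^t.
H <- X %*% solve(Conj(t(X)) %*% X) %*% Conj(t(X))
h <- Re(diag(H))

# Leverage form of Cook's distance.
d_leverage <- Mod(r)^2 / (p * s2) * h / (1 - h)^2

# Drop-one form: refit without point i, predict at all n points.
d_drop <- sapply(seq_len(n), function(i) {
  yhi <- X %*% ols(X[-i, , drop = FALSE], y[-i])
  sum(Mod(yh - yhi)^2) / (p * s2)
})

all.equal(d_leverage, d_drop)
```

The drop-one form needs n extra fits, while the leverage form needs only the residuals and the hat matrix diagonal from a single fit, which is why the latter is the practical choice.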

References

R. D. Cook, Influential Observations in Linear Regression, Journal of the American Statistical Association 74, 169 (1979).

Examples

library(complexlm)  # provides the complex-capable lm and the zlm methods
set.seed(4242)
n <- 8
slop <- complex(real = 4.23, imaginary = 2.323)
interc <- complex(real = 1.4, imaginary = 1.804)
e <- complex(real=rnorm(n)/6, imaginary=rnorm(n)/6)
xx <- complex(real= rnorm(n), imaginary= rnorm(n))
tframe <- data.frame(x = xx, y= slop*xx + interc + e)
fit <- lm(y ~ x, data = tframe, weights = rep(1,n))
cooks.distance(fit)