quokar (version 0.1.0)

frame_distance_complex: Residual-robust distance plot of quantile regression model

Description

Builds the data for a plot of the standardized residuals from a quantile regression against the robust MCD distance. This display is used to diagnose both vertical outliers and horizontal leverage points. The function frame_distance only works for linear quantile regression models; for non-linear models, use frame_distance_complex.

Usage

frame_distance_complex(x, resid, tau)

Arguments

x

matrix, covariates of the quantile regression model

resid

matrix, residuals of quantile regression models

tau

a single value or a vector, quantile(s)

Value

data frame for the residual-robust distance plot

Details

The generalized MCD algorithm is based on the fast-MCD algorithm formulated by Rousseeuw and Van Driessen (1999), which is similar to the algorithm for least trimmed squares (LTS).

The canonical Mahalanobis distance is defined as $$MD(x_i)=[(x_i-\bar{x})^{T}\bar{C}(X)^{-1}(x_i-\bar{x})]^{1/2}$$ where \(\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i\) and \(\bar{C}(X)=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(x_i-\bar{x})^{T}\) are the empirical multivariate location and scatter, respectively. Here \(x_i=(x_{i1},...,x_{ip})^{T}\) excludes the intercept. The relation between the Mahalanobis distance \(MD(x_i)\) and the hat matrix \(H=(h_{ij})=X(X^{T}X)^{-1}X^{T}\), where the design matrix \(X\) includes the intercept column, is $$h_{ii}=\frac{1}{n-1}MD^{2}(x_i)+\frac{1}{n}$$

The canonical robust distance is defined as $$RD(x_{i})=[(x_{i}-T(X))^{T}C(X)^{-1}(x_{i}-T(X))]^{1/2}$$ where \(T(X)\) and \(C(X)\) are the robust multivariate location and scatter, respectively, obtained by MCD.

To achieve robustness, the MCD algorithm estimates the covariance of a multivariate data set mainly through an MCD \(h\)-point subset of the data set. This subset has the smallest sample-covariance determinant among all the possible \(h\)-subsets. Accordingly, the breakdown value for the MCD algorithm equals \(\frac{(n-h)}{n}\). This means the MCD estimates are reliable even if up to \(\frac{100(n-h)}{n}\%\) of the observations in the data set are contaminated.
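
The quantities above can be sketched in a few lines of R. This is only an illustration on simulated data, not the internal code of frame_distance_complex: the data, the planted leverage points, and the use of MASS::cov.mcd as the MCD estimator are all assumptions made for the example, and the subset size \(h\) is taken to be the MASS default \(\lfloor (n+p+1)/2 \rfloor\).

```r
# Simulated covariates (assumption: illustrative data only, no intercept column)
set.seed(1)
n <- 50
p <- 2
x <- matrix(rnorm(n * p), ncol = p)
x[1:3, ] <- x[1:3, ] + 8          # plant three leverage points

# Classical Mahalanobis distance; stats::mahalanobis() returns SQUARED distances
md <- sqrt(mahalanobis(x, colMeans(x), cov(x)))

# Hat-matrix diagonal from the design matrix that includes the intercept
X <- cbind(1, x)
h_ii <- diag(X %*% solve(crossprod(X)) %*% t(X))

# Numerical check of h_ii = MD^2 / (n - 1) + 1 / n
hat_relation_holds <- isTRUE(all.equal(h_ii, md^2 / (n - 1) + 1 / n))

# Robust distance: replace mean and covariance by MCD location and scatter
# (assumption: MASS::cov.mcd stands in for the MCD estimator)
mcd <- MASS::cov.mcd(x)
rd <- sqrt(mahalanobis(x, mcd$center, mcd$cov))

# Breakdown value (n - h) / n for the assumed subset size h
h_size <- floor((n + p + 1) / 2)
breakdown <- (n - h_size) / n

# Classical versus robust distances for the planted points
cbind(md = md[1:3], rd = rd[1:3])
```

In this sketch the planted points pull the classical mean and covariance toward themselves, so their Mahalanobis distances are smaller than their robust distances. That masking effect is the reason the plot uses the robust distance on the horizontal axis.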