reverseR (version 0.2)

pcomp: Calculates linear regression p-values from a variety of robust regression methods

Description

This function calculates p-values from a variety of methods, specifically:
1) standard linear model
2) standard linear model with highest p-influencer removed
3) robust regression with MM-estimators
4) Theil-Sen regression
5) least absolute deviations regression
6) quantile regression
7) weighted regression with isolation forest scores as inverse weights
8) bootstrap linear model, see bootLM
9) jackknife linear model, see jackLM

Usage

pcomp(x, y = NULL, R = 1000, alpha = 0.05, ...)

Value

A vector of p-values from the methods listed above, in that order.

Arguments

x

either a linear model of class lm or the regression's x-values.

y

the optional y-values.

R

the number of bootstrap resamples, see bootLM.

alpha

the \(\alpha\)-level for lmInfl.

...

further arguments to be passed to downstream methods.
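The two ways of supplying data described above can be sketched as follows. The calls only use the arguments shown under 'Usage'; the `requireNamespace` guard and the names `res_xy` and `res_lm` are additions for illustration, so that the snippet does nothing when reverseR is not installed.

```r
set.seed(123)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(20, 0, 1)
fit <- lm(b ~ a)

# Guard (an addition for this sketch): only call pcomp if reverseR is available
if (requireNamespace("reverseR", quietly = TRUE)) {
  # Form 1: pass x- and y-values directly
  res_xy <- reverseR::pcomp(a, b, R = 1000, alpha = 0.05)

  # Form 2: pass a fitted model of class lm as 'x' and leave 'y' as NULL
  res_lm <- reverseR::pcomp(fit)
}
```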

Author

Andrej-Nikolai Spiess

Details

This function is meant to provide a quick overview of how sensitive the p-value is to the choice of (mostly robust) linear regression method. This sensitivity correlates to a large extent with the presence of influential or outlying data points, see 'Examples'.
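The link between influential points and p-value sensitivity can be illustrated with a short base-R sketch. It refits the model with each observation left out in turn and records the slope p-value. This is a simplified leave-one-out illustration under assumed names (`slope_p`, `p_loo`), not the definition used by lmInfl or jackLM.

```r
set.seed(123)
x <- 1:20
y <- 5 + 0.08 * x + rnorm(20, 0, 1)

# Helper (hypothetical name): slope p-value of a simple linear model
slope_p <- function(x, y) {
  summary(lm(y ~ x))$coefficients[2, "Pr(>|t|)"]
}

# p-value using all observations
p_full <- slope_p(x, y)

# p-values after removing each observation in turn
p_loo <- vapply(seq_along(x),
                function(i) slope_p(x[-i], y[-i]),
                numeric(1))

# Spread of the leave-one-out p-values, and the observations
# whose removal lowers or raises the p-value the most
range(p_loo)
which.min(p_loo)
which.max(p_loo)
```

A wide range in `p_loo` relative to `p_full` suggests that single observations drive the significance of the fit.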

References

Robust Regression and Outlier Detection.
Rousseeuw PJ & Leroy AM.
1st ed. (1987), Wiley (NJ, USA).

A rank-invariant method of linear and polynomial regression analysis.
Theil H.
I. Nederl. Akad. Wetensch. Proc, 53, 1950, 386-392.

Estimates of the regression coefficient based on Kendall's tau.
Sen PK.
J Am Stat Assoc, 63, 1968, 1379-1389.

Least absolute deviations estimation via the EM algorithm.
Phillips RF.
Statistics and Computing, 12, 2002, 281-285.

Quantile Regression.
Koenker R.
Cambridge University Press, Cambridge, New York (2005).

Isolation-based anomaly detection.
Liu FT, Ting KM, Zhou ZH.
ACM Transactions on Knowledge Discovery from Data, 6.1, 2012, 3.

Examples

## Example with influencer
## => a few methods indicate a significant
## drop in the p-value
set.seed(123)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(20, 0, 1)
pcomp(a, b)