outlierclassifier: Detect and classify compositional outliers.

Description

Detects outliers and classifies them according to different possible explanations.

Usage

OutlierClassifier1(X,...)
## S3 method for class 'acomp':
OutlierClassifier1(X,...,alpha=0.05,type=c("best","all","type","outlier","grade"),goodOnly=NULL,corrected=TRUE,RedCorrected=FALSE,robust=TRUE)

Arguments

the dataset as an acomp object

...

further arguments to MahalanobisDist/gsi.mahOutlier

alpha

The confidence level for identifying outliers.

type

What type of classification should be used: best: Which single component would best explain the outlier. all: Give a binary coding specifying all components, which could explain the outlier. type: Is it a a normal observation "ok",

goodOnly

an integer vector. Only the specified index of the dataset should be used for estimation of the outlier criteria. This parameter if only a small portion of the dataset is reliable.

corrected

logical. Literatur often proposed to compare the Mahalanobis distances with Chisq-Approximations of there distributions. However this does not correct for multiple testing. If corrected is true a correction for multiple testing is used. In any cas

RedCorrected

logical. If an outlier is detected we can try to find out wether a single component would be sufficient to drop the outlier under the outlier detection limit. Since in this second case we only check a few outliers no second correction step applies a

robust

A robustness description as define in var.acomp

Value

A factor classifying the observations in the dataset as "ok" or some type of outlier.

Details

See outliersInCompositions for a comprehensive introduction into the outlier treatment in compositions. See ClusterFinder1 for an alternative method to classify observations in the context of outliers.

Examples

Run this code

tmp<-set.seed(1400)
A <- matrix(c(0.1,0.2,0.3,0.1),nrow=2)
Mvar <- 0.1*ilrvar2clr(A%*%t(A))
Mcenter <- acomp(c(1,2,1))
data(SimulatedAmounts)
datas <- list(data1=sa.outliers1,data2=sa.outliers2,data3=sa.outliers3,
              data4=sa.outliers4,data5=sa.outliers5,data6=sa.outliers6)
opar<-par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))  
tmp<-mapply(function(x,y) {
outlierplot(x,type="scatter",class.type="grade");
  title(y)
},datas,names(datas))


par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))  
tmp<-mapply(function(x,y) {
  myCls2 <- OutlierClassifier1(x,alpha=0.05,type="all",corrected=TRUE)
  outlierplot(x,type="scatter",classifier=OutlierClassifier1,class.type="best",
  Legend=legend(1,1,levels(myCls),xjust=1,col=colcode,pch=pchcode),
  pch=as.numeric(myCls2));
  legend(0,1,legend=levels(myCls2),pch=1:length(levels(myCls2)))
  title(y)
},datas,names(datas))

par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))  
for( i in 1:length(datas) ) 
  outlierplot(datas[[i]],type="ecdf",main=names(datas)[i])
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))  
for( i in 1:length(datas) ) 
  outlierplot(datas[[i]],type="portion",main=names(datas)[i])
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))  
for( i in 1:length(datas) ) 
  outlierplot(datas[[i]],type="nout",main=names(datas)[i])
par(opar)