outlierplot: plot various graphics to analyse outliers of compositional datasets

Usage

outlierplot(X, ...)

## S3 method for class 'acomp':
outlierplot(X, colcode = colorsForOutliers1, pchcode = pchForOutliers1,
            type = c("scatter", "biplot", "dendrogram", "ecdf", "portion", "nout"),
            legend.position, pch = 19, ..., clusterMethod = "ward",
            myCls = classifier(X, alpha = alpha, type = class.type, corrected = corrected),
            classifier = OutlierClassifier1,
            alpha = 0.05,
            class.type = "best",
            Legend, pow = 1,
            main = paste(deparse(substitute(X))),
            corrected = TRUE, robust = TRUE, princomp.robust = FALSE,
            mahRange = exp(c(-5, 5))^pow,
            flagColor = "red",
            meanColor = "blue",
            grayColor = "gray40",
            goodColor = "green",
            mahalanobisLabel = "Mahalanobis Distance")
Arguments

X: the dataset, given as an acomp object.
colcode: color coding for the groups given by the myCls object, or a function creating such colors from the factor. Use colorsForOutliers2 if class.type="all" is used.
pchcode: plotting-character coding for the groups given by the myCls object, or a function creating the characters from the factor.
type: the type of plot to be produced (see Details).
legend.position: the position of the legend.
pch: the default plotting character.
...: further graphical parameters.
clusterMethod: the agglomeration method used in the call to hclust for the hclust-based outlier grouping.
myCls: a factor classifying the observations as outliers; by default it is created from the data by classifier.
classifier: the function used to create myCls when it is not given explicitly.
alpha: the confidence level used in the outlier tests and bounds.
class.type: the type of classification performed by the classifier.
Legend: an expression drawing the legend.
pow: a power used to transform the Mahalanobis distances (see mahRange).
main: the main title of the plot.
corrected: logical: correct the outlier tests for multiple testing.
robust: logical, or a description of the robust estimation method to use; see robustnessInCompositions.
princomp.robust: logical: should a robust principal component analysis be used.
mahRange: the range of Mahalanobis distances displayed.
flagColor, meanColor, grayColor, goodColor: the colors used for flagged outliers, the expected curve, auxiliary lines, and confidence bounds, respectively (see Details).
mahalanobisLabel: the axis label used for the Mahalanobis distances.
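For example, a classification can be computed once with the default classifier and then passed explicitly, instead of letting each call recompute it through the default myCls = classifier(X, ...). A minimal sketch, using only arguments shown in the usage above:

library(compositions)
data(SimulatedAmounts)
x   <- acomp(sa.outliers1)
cls <- OutlierClassifier1(x, alpha = 0.05, type = "best", corrected = TRUE)
outlierplot(x, type = "scatter", myCls = cls)  # reuses cls, skips the internal call
outlierplot(x, type = "biplot",  myCls = cls)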
type="scatter"
type="biplot"
coloredBiplot
is used rather than the usual one.
} type="dendrogram"
type="ecdf"
meanColor
. The
alpha
-quantile -- i.e. a lower prediction bound -- for the
cdf is given in goodColor. A line in grayColor
show the
minium portion of observations above some limit to be
outliers, based on the portion of observations necessary to move
down to make the empirical distribution function get above its lower
prediction limit under the assumption of normality.
This plot shows the basic construction for the minimal number of
outlier computation done in type="portion"
.
}
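The ecdf construction can be imitated by hand. The following sketch makes two simplifying assumptions that the real function does not necessarily share: it uses classical (non-robust) estimates, and it takes a chi-squared distribution as the reference for the squared Mahalanobis distances of the ilr-transformed data.

library(compositions)                      # acomp(), ilr(), SimulatedAmounts
data(SimulatedAmounts)
z  <- unclass(ilr(acomp(sa.outliers5)))    # compositions mapped to R^(D-1)
d2 <- mahalanobis(z, colMeans(z), cov(z))  # squared Mahalanobis distances
plot(ecdf(sqrt(d2)), main = "ecdf of Mahalanobis distances")
curve(pchisq(x^2, df = ncol(z)), add = TRUE, col = "blue")  # cdf under normality

An empirical curve falling clearly below the theoretical (blue) curve is the signature of outliers that the lower prediction bound formalizes.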
type="portion"
meanColor
we see a curve of an estimated
number of outliers above some limit, generated by estimating the
portion of outliers with a Mahalanobis distance over the given
limit by max(0,1-ecdf/cdf). The minimum
number of outliers is computed by replacing cdf by its lower
confidence limit and displayed in goodColor
. The
Mahalanobis distances of the individual data points are added as a
stacked stripchart
, such that the influence of
individual observations can be seen.
The true problem of outlier detection is to detect "near"
outliers. Near outliers are outliers so near to the dataset that
they could well be extrem observation. These near outliers would
provide no problem unless they are not many showing up in
groups. Graphic allows at least to count them and to show there
probable Mahalanobis distance such, however it still does not
allow to conclude that an individual observation is an
outlier. However still the outlier candidates can be identified
comparing their mahalanobis distance (returned by the plot
as$mahalanobis
) with a cutoff inferred from this graphic.
}
type="nout"
See Also

OutlierClassifier1, ClusterFinder1
Examples

data(SimulatedAmounts)
outlierplot(acomp(sa.outliers5))

datas <- list(data1=sa.outliers1, data2=sa.outliers2, data3=sa.outliers3,
              data4=sa.outliers4, data5=sa.outliers5, data6=sa.outliers6)
opar <- par(mfrow=c(2,3), pch=19, mar=c(3,2,2,1))
tmp <- mapply(function(x, y) {
  outlierplot(x, type="scatter", class.type="grade")
  title(y)
}, datas, names(datas))
par(mfrow=c(2,3), pch=19, mar=c(3,2,2,1))
tmp <- mapply(function(x, y) {
  myCls2 <- OutlierClassifier1(x, alpha=0.05, type="all", corrected=TRUE)
  # Legend is evaluated lazily inside outlierplot, where myCls, colcode
  # and pchcode are defined:
  outlierplot(x, type="scatter", classifier=OutlierClassifier1, class.type="best",
              Legend=legend(1,1,levels(myCls),xjust=1,col=colcode,pch=pchcode),
              pch=as.numeric(myCls2))
  legend(0,1,legend=levels(myCls2),pch=1:length(levels(myCls2)))
  title(y)
}, datas, names(datas))
# Too slow
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
for( i in 1:length(datas) )
outlierplot(datas[[i]],type="ecdf",main=names(datas)[i])
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
for( i in 1:length(datas) )
outlierplot(datas[[i]],type="portion",main=names(datas)[i])
par(mfrow=c(2,3),pch=19,mar=c(3,2,2,1))
for( i in 1:length(datas) )
outlierplot(datas[[i]],type="nout",main=names(datas)[i])
par(opar)