FactoInvestigate (version 1.7)

outliers: Outliers detection

Description

Detection of singular individuals that concentrate too much inertia.

Usage

outliers(res, file = "", Vselec = "cos2", Vcoef = 1, nmax = 10, 
         figure.title = "Figure", graph = TRUE, cex = 0.7, options = NULL)

Arguments

res

an object of class PCA or MCA.

file

the path of the file in which the description is written in R Markdown (for example "PCA.Rmd"). If not specified, the description is written to the console.

Vselec

the variables to select; see the Details section.

Vcoef

a numerical coefficient to adjust the variable selection rule; see the Details section.

nmax

an integer giving the maximum number of variables to illustrate each outlier (by default 10).

figure.title

the text label to add before graph title.

graph

a boolean: if TRUE, the graphs are plotted.

cex

an optional argument for the generic plot functions, used to adjust the size of the elements plotted.

options

a character string that gives the output options for the figures. If NULL, options = "r, echo = FALSE, fig.align = 'center', fig.height = 3.5, fig.width = 5.5" on Linux and Mac, and options = "r, echo = FALSE, fig.height = 3.5, fig.width = 5.5" on Windows.

Value

new.res

the res object without the outliers (they are completely eliminated).

res.out

the res object with the outliers as supplementary individuals.

memory

the original res object.

N

the number of outliers.

ID

the labels of the outliers.

Details

The algorithm flags an individual as an outlier if its contribution to the plane is higher than 3 standard deviations.
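As a rough illustration of a 3-standard-deviation rule, the base R sketch below flags the values of a contribution vector that lie more than 3 standard deviations above its mean. The data, the flag_outliers helper, and the use of the mean as the reference point are assumptions made for illustration only; the exact computation performed by the package may differ.

```r
# Hypothetical helper (not part of FactoInvestigate): return the names of the
# entries lying more than `k` standard deviations above the mean.
flag_outliers <- function(contrib, k = 3) {
  threshold <- mean(contrib) + k * sd(contrib)
  names(contrib)[contrib > threshold]
}

# Made-up contributions (in %) of 20 individuals to a plane:
# 19 individuals contribute 2% each, one contributes 62%.
contrib <- c(rep(2, 19), 62)
names(contrib) <- paste0("ind", seq_along(contrib))

flagged <- flag_outliers(contrib)
print(flagged)
```

With so few individuals, a single extreme value inflates the standard deviation, so a rule of this kind only flags individuals that stand far apart from the rest.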

The Vselec argument selects the subset of variables that are drawn and described. For example:
- Vselec = 1:5: the variables numbered 1 to 5 are drawn.
- Vselec = c("name1","name5"): the variables named name1 and name5 are drawn.
- Vselec = "contrib 10": the 10 active or illustrative variables that have the highest contribution on the 2 dimensions of the plane are drawn.
- Vselec = "contrib": the optimal number of active or illustrative variables that have the highest contribution on the 2 dimensions of the plane are drawn.
- Vselec = "cos2 5": the 5 active or illustrative variables that have the highest cos2 on the 2 dimensions of the plane are drawn.
- Vselec = "cos2 0.8": the active or illustrative variables that have a cos2 higher than 0.8 on the plane are drawn.
- Vselec = "cos2": the optimal number of active or illustrative variables that have the highest cos2 on the 2 dimensions of the plane are drawn.
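The difference between the "cos2 5" and "cos2 0.8" forms can be sketched in base R on a named vector of cos2 values. The select_by_cos2 helper and the data are hypothetical, and reading a value of 1 or more as a count and a value below 1 as a threshold is an assumption about how the two forms are told apart, not the package's code.

```r
# Hypothetical helper (not part of FactoInvestigate): mimic the "cos2 n" and
# "cos2 x" forms of Vselec on a named vector of cos2 values.
select_by_cos2 <- function(cos2, rule) {
  value <- as.numeric(sub("^cos2 ", "", rule))
  ranked <- sort(cos2, decreasing = TRUE)
  if (value >= 1) {
    # Count form: keep the `value` variables with the highest cos2
    names(head(ranked, value))
  } else {
    # Threshold form: keep the variables whose cos2 is higher than `value`
    names(ranked[ranked > value])
  }
}

# Made-up cos2 values of five variables on a plane
cos2 <- c(v1 = 0.91, v2 = 0.35, v3 = 0.84, v4 = 0.60, v5 = 0.12)

top_two   <- select_by_cos2(cos2, "cos2 2")    # the 2 best-represented variables
above_0.5 <- select_by_cos2(cos2, "cos2 0.5")  # variables with cos2 above 0.5
```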

The Vcoef argument adjusts the selection of the variables when it is based on Vselec = "contrib" or Vselec = "cos2". For example:
- if Vcoef = 2, the threshold is 2 times higher, and thus 2 times more restrictive.
- if Vcoef = 0.5, the threshold is 2 times lower, and thus 2 times less restrictive.
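The effect of Vcoef can be sketched as a multiplier applied to a selection threshold. The base threshold, the data, and the select_with_coef helper below are made up for illustration; how the package derives its "optimal" threshold is not described here, so this is only a sketch of the scaling idea.

```r
# Made-up cos2 values and a made-up base threshold
cos2 <- c(v1 = 0.91, v2 = 0.35, v3 = 0.84, v4 = 0.55, v5 = 0.20)
base_threshold <- 0.3

# Hypothetical helper (not part of FactoInvestigate): keep the variables
# whose cos2 is higher than Vcoef times the base threshold.
select_with_coef <- function(cos2, threshold, Vcoef = 1) {
  names(cos2)[cos2 > Vcoef * threshold]
}

default_sel <- select_with_coef(cos2, base_threshold, Vcoef = 1)    # threshold 0.3
strict_sel  <- select_with_coef(cos2, base_threshold, Vcoef = 2)    # threshold 0.6
loose_sel   <- select_with_coef(cos2, base_threshold, Vcoef = 0.5)  # threshold 0.15
```

A larger Vcoef keeps fewer variables and a smaller one keeps more, which matches the "more restrictive" and "less restrictive" wording above.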

Examples

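# NOT RUN {
library(FactoMineR)        # provides PCA() and the decathlon data set
library(FactoInvestigate)  # provides outliers()
data(decathlon)
# PCA with columns 11-12 as supplementary quantitative variables
# and column 13 as a supplementary qualitative variable
res.pca <- PCA(decathlon, quanti.sup = 11:12, quali.sup = 13, graph = FALSE)
# Detect the outliers and write their description to PCA.Rmd
out <- outliers(res.pca, file = "PCA.Rmd")
out$N   # number of outliers
out$ID  # labels of the outliers
# }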