Anthropometry (version 1.1)

archetypesBoundary: Archetypal analysis in multivariate accommodation problem

Description

This function reproduces the results shown in sections 2.2.2 and 3.1 of Epifanio et al. (2013). The results shown in sections 3.2 and 3.3 of the same paper can also be reproduced from its output (see the Examples section below).

Usage

archetypesBoundary(data,numArchet,verbose,nrep)

Arguments

data
USAF 1967 database (see dataUSAF). Each row corresponds to an observation, and each column corresponds to a variable. All variables are numeric.
numArchet
Number of archetypes.
verbose
Logical value. If TRUE, some details of the execution progress are shown. This is the same argument as in the stepArchetypes function of the archetypes R package (Eugster and Leisch (2009)).
nrep
Number of times the archetypes algorithm is run for each number of archetypes. This is the same argument as in the stepArchetypes function of archetypes.

Value

A list with numArchet elements. Each element is a list with class attribute stepArchetypes and nrep elements.
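
As an illustration of how such a nested result can be navigated, here is a minimal sketch on toy data. It assumes that the returned object has the same shape as the output of archetypes::stepArchetypes called with k = 1:numArchet, which is consistent with the way the examples below index res[[3]] and res[[7]]. The names toy, num_archet and n_rep are hypothetical and exist only for this sketch.

```r
# Sketch only: toy data and archetypes::stepArchetypes stand in for
# archetypesBoundary and the USAF database (an assumption, not the package code).
set.seed(2010)
toy <- matrix(rnorm(60 * 3), ncol = 3)

num_archet <- 3  # plays the role of numArchet
n_rep <- 2       # plays the role of nrep

res <- archetypes::stepArchetypes(data = toy, k = 1:num_archet,
                                  verbose = FALSE, nrep = n_rep)

length(res)       # one element per number of archetypes
length(res[[3]])  # one fitted model per repetition

# Best of the n_rep runs with 3 archetypes, and the archetype coordinates.
a3 <- archetypes::bestModel(res[[3]])
dim(archetypes::parameters(a3))
```

The first index selects the number of archetypes; bestModel then picks one run among the repetitions stored in that element.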

Details

Before using this function, the most extreme (100 - percAcomm*100)% of the observations must be removed by means of the accommodation function. It is recommended that you use the Mahalanobis distance for this purpose. The alternative depth procedure has the disadvantage that the analyst cannot control the accommodation percentage exactly, so the percentage actually achieved may not coincide with the one requested.
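
The idea behind Mahalanobis-based trimming can be sketched in base R as follows. This is not the code of the accommodation function; it only illustrates, on simulated data, why the retained percentage is under direct control with this criterion. The names toy and perc_accomm are hypothetical.

```r
# Sketch only: simulated data standing in for the anthropometric measurements.
set.seed(1)
toy <- matrix(rnorm(200 * 3), ncol = 3,
              dimnames = list(NULL, c("v1", "v2", "v3")))

perc_accomm <- 0.95  # hypothetical name for the desired accommodation proportion

# Squared Mahalanobis distance of every row to the sample centroid.
d2 <- mahalanobis(toy, center = colMeans(toy), cov = cov(toy))

# Keep the perc_accomm * 100% of rows closest to the centroid,
# preserving their original order.
n_keep <- round(perc_accomm * nrow(toy))
keep <- order(d2)[seq_len(n_keep)]
trimmed <- toy[sort(keep), , drop = FALSE]

nrow(trimmed) / nrow(toy)  # proportion of observations retained
```

Because the rows are ranked by a single distance, the number of retained observations can be fixed in advance, which is the control the analyst lacks with the depth procedure.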

References

Epifanio, I., Vinue, G., and Alemany, S., (2013). Archetypal analysis: contributions for estimating boundary cases in multivariate accommodation problem, Computers & Industrial Engineering 64, 757--765.

Eugster, M. J., and Leisch, F., (2009). From Spider-Man to Hero - Archetypal Analysis in R, Journal of Statistical Software 30, 1--23, http://www.jstatsoft.org/.

Zehner, G. F., Meindl, R. S., and Hudson, J. A., (1993). A multivariate anthropometric method for crew station design: abridged. Tech. rep. Ohio: Human Engineering Division, Armstrong Laboratory, Wright-Patterson Air Force Base.

See Also

archetypes, stepArchetypes, stepArchetypesMod, dataUSAF, indivNearest, accommodation

Examples

#The following R code reproduces the results of the paper Epifanio et al. (2013).
#First, the USAF 1967 database is read and preprocessed (Zehner et al. (1993)).
m <- dataUSAF
#Variable selection:
sel <- c(48,40,39,33,34,36)
#Changing to inches: 
mpulg <- m[,sel] / (10 * 2.54)

#Data preprocessing:
preproc <- accommodation(mpulg,TRUE,0.95,TRUE) 

#Procedure and results shown in section 2.2.2 and section 3.1:
res <- archetypesBoundary(preproc$data,15,FALSE,3)

#Results shown in section 3.2 (figure 3):
screeplot(res) 

#3 archetypes:
a3 <- archetypes::bestModel(res[[3]])
archetypes::parameters(a3)
#7 archetypes:
a7 <- archetypes::bestModel(res[[7]])
archetypes::parameters(a7) 
#Plotting the percentiles of each archetype:
#Figure 2 (b):
barplot(a3,preproc$data,percentiles=TRUE,which="beside")
#Figure 2 (f):
barplot(a7,preproc$data,percentiles=TRUE,which="beside")

#Results shown in section 3.3 related with PCA.
pznueva <- prcomp(preproc$data,scale=TRUE,retx=TRUE)
#Table 3:
summary(pznueva)
pznueva
#PCA scores for 3 archetypes:
p3 <- predict(pznueva,archetypes::parameters(a3)) 
#PCA scores for 7 archetypes:
p7 <- predict(pznueva,archetypes::parameters(a7))
#Representing the scores:
#Figure 4 (a):
xyplotPCA(p3[,1:2],pznueva$x[,1:2],data.col=gray(0.7),atypes.col=1,atypes.pch=15)
#Figure 4 (b):
xyplotPCA(p7[,1:2],pznueva$x[,1:2],data.col=gray(0.7),atypes.col=1,atypes.pch=15)

#Percentiles for 7 archetypes (table 5):
Fn <- ecdf(preproc$data)
round(Fn(archetypes::parameters(a7)) * 100)

#Which are the nearest individuals to archetypes?:
#Example for three archetypes:
ras <- rbind(archetypes::parameters(a3),preproc$data)
dras <- dist(ras,method="euclidean",diag=FALSE,upper=TRUE,p=2)
mdras <- as.matrix(dras)
#A large value on the diagonal prevents a point from being its own nearest neighbour:
diag(mdras) <- 1e+11
i <- 3
nearest <- sapply(1:i,indivNearest,i,mdras)

#In addition, we can convert the standardized values back to the original variables.
#Note that m is reused here and no longer holds the database.
p <- archetypes::parameters(a7)
m <- sapply(mpulg,mean)
s <- sapply(mpulg,sd)
d <- p
for(i in seq_len(ncol(p))){
 d[,i] <- p[,i] * s[i] + m[i]
}
#Table 7:
t(d)
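
The back-transformation from standardized values to the original units can also be written without an explicit loop. The following minimal sketch uses toy data and checks that the round trip recovers the original values. The names raw, mu, sigma and the choice of the first three rows as stand-ins for archetypes are assumptions made only for illustration.

```r
# Sketch only: toy measurements standing in for the selected variables.
set.seed(1)
raw <- data.frame(a = rnorm(20, mean = 70, sd = 3),
                  b = rnorm(20, mean = 30, sd = 2))

mu <- sapply(raw, mean)   # column means
sigma <- sapply(raw, sd)  # column standard deviations

# Standardize, then take a few rows as stand-ins for archetypes.
z <- scale(raw)
p <- z[1:3, , drop = FALSE]

# Undo the standardization column by column: x = z * sd + mean.
d <- sweep(sweep(p, 2, sigma, "*"), 2, mu, "+")

# The back-transformed rows match the original rows.
all.equal(unname(d), unname(as.matrix(raw[1:3, ])))
```

Using sweep avoids hard-coding the number of variables, so the same lines work for any number of columns.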
