CDpca performs a clustering and disjoint principal component analysis (CDPCA) on the given numeric data matrix and returns a list of results. Given an (IxJ) real data matrix X = [xij], the CDPCA methodology clusters the I objects into P nonempty and nonoverlapping clusters Cp, p = 1,...,P, which are identified by their centroids, and simultaneously partitions the J attributes into Q disjoint components PCq, q = 1,...,Q. The CDpca function fits this model to X, estimating its parameters with an Alternating Least Squares (ALS) procedure originally proposed by Vichi and Saporta (2009) and described in two steps by Macedo and Freitas (2015).
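The two partitions described above can be pictured as binary membership matrices. The following is a minimal sketch in base R, not the code used inside CDpca: the toy dimensions, the two label vectors and the names obj_cluster, var_component and Xbar are assumptions made only for illustration. It builds an I x P matrix U and a J x Q matrix V, checks that the partitions are nonoverlapping and nonempty, and computes the cluster centroids in the original attribute space.

```r
# Toy dimensions (hypothetical): I objects, J attributes,
# P clusters of objects, Q disjoint components
set.seed(1)
I <- 6; J <- 4; P <- 2; Q <- 2
X <- scale(matrix(rnorm(I * J), nrow = I, ncol = J))

# Hypothetical partitions: one cluster label per object,
# one component label per attribute
obj_cluster   <- c(1, 1, 2, 2, 1, 2)
var_component <- c(1, 2, 1, 2)

# Binary membership matrices: U is I x P, V is J x Q
U <- diag(P)[obj_cluster, , drop = FALSE]
V <- diag(Q)[var_component, , drop = FALSE]

# Each row holds a single 1 (nonoverlapping),
# each column holds at least one 1 (nonempty)
stopifnot(all(rowSums(U) == 1), all(colSums(U) >= 1))
stopifnot(all(rowSums(V) == 1), all(colSums(V) >= 1))

# Cluster centroids in the original attribute space: (U'U)^{-1} U'X, a P x J matrix
Xbar <- solve(crossprod(U)) %*% crossprod(U, X)
```

Row p of Xbar is the mean of the rows of X assigned to cluster p, which is the sense in which each cluster is identified by its centroid.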
CDpca(data, class=NULL, P, Q, SDPinitial=FALSE, tol=10^(-5), maxit, r, cdpcaplot=TRUE)

CDpca returns a list of results containing the following components:
The total number of iterations used in the best loop for computing the best solution
The best loop number
The computation time on the best loop
The computation time for all loops
The component score matrix
The object centroids matrix in the reduced space
The component loading matrix
The partition of objects
The partition of variables
The value of the objective function to maximize
The between cluster deviance
The between cluster deviance over the total variability
The cdpca classification
The pseudo confusion matrix concerning the true (given by class) and cdpca classifications
The error norm for the obtained cdpca model
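As an illustration of the pseudo confusion matrix listed above, the sketch below cross-tabulates a true classification against an estimated one using base R. The two label vectors are invented for the example, and this is not necessarily how CDpca builds its own table. Because cluster labels produced by an unsupervised method are arbitrary, agreement between the two classifications need not fall on the diagonal.

```r
# Hypothetical labels for 8 objects: 'true_class' plays the role of the
# 'class' argument, 'est_class' the role of the classification found by CDPCA
true_class <- c(1, 1, 1, 2, 2, 2, 3, 3)
est_class  <- c(2, 2, 2, 1, 1, 3, 3, 3)

# Cross-tabulate true versus estimated labels
cm <- table(true = true_class, estimated = est_class)
print(cm)
```

In this toy case all three objects of true class 1 end up in estimated cluster 2, so the clusters match the classes up to a relabelling except for one object of true class 2.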
data: A numeric matrix or data frame which provides the data for the CDPCA
class: A numeric vector containing the real classification of the objects in the data, or NULL if the class of objects is unknown
P: An integer value indicating the number of clusters of objects
Q: An integer value indicating the number of clusters of variables
SDPinitial: A logical value indicating whether the initial assignment matrices U and V are randomly generated (FALSE, the default) or obtained from an algorithmic framework based on a semidefinite programming approach (TRUE)
tol: A small positive value giving the convergence tolerance, that is, the maximum difference between two consecutive values of the objective function. The default is 10^(-5)
maxit: The maximum number of iterations of one run of the ALS algorithm
r: Number of runs of the ALS algorithm for the final solution
cdpcaplot: A logical value indicating whether an additional graphic is created (showing the data projected on the first two CDPCA principal components)
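A call following the usage above might look like the sketch below. The simulated data and the values chosen for P, Q, maxit and r are illustrative assumptions, not recommended settings. The call is guarded so that the code also runs in a session where CDpca has not been loaded.

```r
# Small simulated data set (hypothetical): 30 objects, 6 attributes, 3 true groups
set.seed(123)
true_class <- rep(1:3, each = 10)
# Adding the label vector shifts every row by its group label (column-wise recycling)
X <- matrix(rnorm(30 * 6), nrow = 30, ncol = 6) + true_class

# Run only if a function named CDpca is available in the session
if (exists("CDpca", mode = "function")) {
  res <- CDpca(data = X, class = true_class, P = 3, Q = 2,
               SDPinitial = FALSE, tol = 10^(-5), maxit = 100, r = 10,
               cdpcaplot = FALSE)
  str(res, max.level = 1)  # list the returned components
} else {
  message("CDpca not found; load the package that provides it first.")
}
```

Setting cdpcaplot = FALSE suppresses the additional graphic, which is convenient when the function is called from a script.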
Eloisa Macedo macedo@ua.pt, Adelaide Freitas adelaide@ua.pt, Maurizio Vichi maurizio.vichi@uniroma1.it
Vichi, M. and Saporta, G. (2009). Clustering and disjoint principal component analysis. Computational Statistics and Data Analysis, 53, 3194-3208.
Macedo, E. and Freitas, A. (2015). The alternating least-squares algorithm for CDPCA. Communications in Computer and Information Science (CCIS), Springer Verlag, pp. 173-191.