Learn R Programming

fungible (version 2.2)

rPCA: Generate a Correlation Matrix from a Truncated PCA Loadings Matrix.

Description

This function generates a random (or possibly unique) correlation matrix (R) from an unrotated or orthogonally rotated PCA loadings matrix via a modified alternating projections algorithm.

Usage

rPCA(
  F,
  epsMax = 1e-18,
  maxit = 2000,
  Seed = NULL,
  InitP2 = 2,
  Eigs = NULL,
  PrintLevel = 1
)

Arguments

F

(Matrix) A p (variables) by k (components) PCA loadings matrix. F can equal either an unrotated or an orthogonally rotated loadings matrix.

epsMax

(Scalar) A small number used to evaluate function convergence. Default (epsMax = 1E-18).

maxit

(Integer) An integer that specifies the maximum number of iterations of the modified alternating projections algorithm (APA).

Seed

(Integer) A user-defined starting seed for the random number generator. If Seed = NULL then rPCA will generate a random starting seed. Setting Seed to a positive integer will generate reproducible results. Default (Seed = NULL)

InitP2

(Integer) The method used to initiate the remaining columns of the truncated principal components solution. If InitP2 = 1 then the starting P2 will be a random semi-orthogonal matrix. If If InitP2 = 2 then the starting P2 will be a semi-orthogonal matrix that is in the left null space of P1. Default (InitP2 = 2). Of the two options, InitP2 = 2 generally converges to a single feasible solution in less time. InitP2 = 1 can be used to generate different solutions from different starting seeds.

Eigs

(Vector) Under some conditions, rPCA can generate (or reproduce) a unique correlation matrix with known (i.e., user-specified) eigenvalues from a truncated PC loadings matrix, F, even when the rank of F is less than p (the number of observed variables). Eigs is an optional p-length vector of eigenvalues for R. Default (Eigs = NULL).

PrintLevel

(Integer) If PrintLevel = 0 no output will be printed (choose this option for Monte Carlo simulations). If PrintLevel = 1 the program will print the APA convergence status and the number of iterations used to achieve convergence. If PrintLevel = 2 then rPCA will print the iteration convergence history of the modified APA algorithm. Default (PrintLevel = 1).

Value

  • R (Matrix) A p by p correlation matrix that generates the desired PCA loadings.

  • Tmat (Matrix) A k by k orthogonal rotation matrix that will rotate the unrotated PCA loadings matrix, P1, to F (if F is an orthogonally rotated loadings matrix).

  • P1 (Matrix) The p by k unrotated PCA loadings matrix that is associated with F.

  • Fhat (Matrix) The p by k estimated (and possibly rotated) PCA loadings matrix from the simulated matrix R.

  • error (Logical) A logical that indicates whether F is a legitimate PCA loadings matrix.

  • Lambda (Vector) The sorted eigenvalues of R.

  • iterHx (Vector) Criterion (i.e., fit) values for for each iteration of the modified APA algorithm.

  • converged (Logical) A logical that signifies function convergence.

  • Seed (Integer) Either a user-defined or function generated starting seed for the random number generator.

References

Escalante, R. and Raydan, M. (2011). Alternating projection methods. Society for Industrial and Applied Mathematics.

ten Berge, J. M. and Kiers, H. A. (1999). Retrieving the correlation matrix from a truncated PCA solution: The inverse principal component problem. Psychometrika, 64(3), 317--324.

Examples

Run this code
# NOT RUN {
# External PCA function ---
# used to check results
 
PCA <- function(R, k = NULL){
  if(is.null(k)) k <- ncol(R)
  VLV <- eigen(R)
 V <- VLV$vectors
 L <- VLV$values

 
 if( k > 1){
   P <-  V[, 1:k] %*% diag(L[1:k]^.5)
 }
 else{
   P <- as.matrix(V[, 1], drop=False) * L[1]^.5
 }
  Psign <- sign(apply(P, 2, sum))
  if(k > 1) Psign = diag(Psign)
  P <- P %*%  Psign
 P
}#END PCA  

  
## Generate Desired Population rotated PCA loadings matrix
## Example = 1
 k = 2
 F <- matrix(0, 8, 2) 
 F[1:4, 1] <- seq(.75, .72, length= 4)  
 F[5:8, 2] <- seq(.65, .62, length= 4)  
 F[1,2] <- .1234
 F[8,1] <- .4321
 colnames(F) <-   paste0("F", 1:k) 
 (F)
 
 ## Run Example 1
 pout <- rPCA(F, 
              maxit = 5000, 
              Seed = 1, 
              epsMax = 1E-18,
              PrintLevel = 1)
pout$converged
eigen(pout$R)$values
if(pout$error == FALSE & pout$converged){ 
    Fhat <- pout$Fhat
    cat("\nPCA Loadings\n")
    ( round( cbind(F,Fhat ), 5) )
 }
 
 ## Example = 2      
 ## Single component example from Widaman 2018

 k = 1
 F <- matrix(rep(c(.8,.6, .4), each = 3 ), nrow = 9, ncol = 1)
 colnames(F) <-   paste0("F", 1:k) 
 (F)

 ## Run Example 2
 pout <- rPCA(F, 
              maxit = 5000, 
              Seed = 1, 
              epsMax = 1E-18,
              PrintLevel = 1)
 pout$converged
 pout$Fhat
 eigen(pout$R)$values
if(pout$error == FALSE & pout$converged){ 
    Fhat <- pout$Fhat
    cat("\nPCA Loadings\n")
    ( round( cbind(F,Fhat ), 5) )
 }
  
## Example 3 ----
## 2 Component example from Goldberg and Velicer (2006).
 k = 2
 F = matrix(c( .18, .75,
               .65, .19,
               .12, .69,
               .74, .06,
               .19, .80,
               .80, .14,
              -.05, .65,
               .71, .02), 8, 2, byrow=TRUE)
 colnames(F) <-   paste0("F", 1:k) 
 (F)

## Run Example 3
pout <- rPCA(F, 
            maxit = 5000, 
            Seed = 1, 
            epsMax = 1E-18,
            PrintLevel = 1)
pout$converged

eigen(pout$R)$values

if(pout$error == FALSE & pout$converged){ 
  Fhat <- pout$Fhat
  cat("\nPCA Loadings\n")
  ( round( cbind(F,Fhat ), 5) )
#
#
## Example 4
#
SEED = 4321
set.seed(SEED)
k= 3
## Generate eigenvalues for example R matrix
L7 <- eigGen(nDimensions = 7,
             nMaj = 3,
             PrcntMajor = .85,
             threshold = .8)

## Scree Plot
plot(1:7, L7, 
    type = "b", 
    ylim = c(0,4),
    main = "Scree Plot for R",
    ylab = "Eigenvalues",
    xlab = "Dimensions")

## Generate R
R <- rGivens(eigs=L7, Seed = SEED)$R
print( R, digits = 4)

#Extract loadings for 3 principal components
F <- PCA(R, k = k)

# rotate loadings with varimax to examine underlying structure
print( round(varimax(F)$loadings[], 3) )

## run rPCA with user-defined eigenvalues
rout <- rPCA(F,
            epsMax = 1e-20, 
            maxit = 25000, 
            Seed = SEED,   
            InitP2 = 1,
            Eigs = L7,
            PrintLevel = 1) 

## Compute PCA on generated R

Fhat <- PCA(rout$R, k = 3)
#
## align factors
Fhat <- fungible::faAlign(F, Fhat)$F2

## Compare solutions
print( round( cbind(F, Fhat), 5) )

## Compare Eigenvalues
print( cbind(L7, eigen(rout$R)$values ), digits=8) 
#
## Compare R matrices: 8 digit accuracy
print( round(R - rout$R, 8) )

}
# }

Run the code above in your browser using DataLab