Performs PCA on panel data with bootstrap-based significance testing for factor loadings. Identifies which variables load significantly on each principal component using a null distribution constructed via block bootstrapping.
pca_bootstrap(
X,
n_components = NULL,
center = TRUE,
scale = TRUE,
n_boot = 200,
block_length = NULL,
alpha = 0.05,
use_fdr = FALSE,
rotation = c("varimax", "none", "oblimin"),
verbose = FALSE
)
A list of class "signaly_pca" containing:
Matrix of factor loadings (rotated if specified)
Matrix of component scores
Vector of eigenvalues
Proportion of variance explained by each component
Cumulative proportion of variance explained
Matrix of logical values indicating significance
Matrix of bootstrap p-values for loadings
Cutoff values for significance by component
Shannon entropy of loadings for each component
Data frame summarizing each component
Data frame mapping variables to their dominant component
Matrix or data frame where rows are observations (time points) and columns are variables.
Number of principal components to extract. If NULL, determined by eigenvalue threshold or explained variance.
Logical. Center variables before PCA. Default TRUE.
Logical. Scale variables to unit variance. Default TRUE.
Number of bootstrap replications for significance testing. Default 200.
Block length for block bootstrap. If NULL, defaults to ceiling(sqrt(nrow(X))).
Significance level for loading tests. Default 0.05.
Logical. Apply Benjamini-Hochberg FDR correction. Default FALSE.
Character string specifying rotation method: "none", "varimax", or "oblimin". Default "varimax".
Logical. Print progress messages. Default FALSE.
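Two of the arguments above can be illustrated with base R alone. The following is a minimal sketch, assuming that use_fdr corresponds to a Benjamini-Hochberg adjustment as performed by p.adjust(method = "BH"); how pca_bootstrap groups the p-values before adjusting (all loadings at once or per component) is not stated here. The p-values are made up for illustration.

```r
# Illustrative only: base R equivalents of two argument behaviours.
X <- matrix(0, nrow = 100, ncol = 10)   # placeholder data with 100 time points

# Default block length when block_length = NULL
block_length <- ceiling(sqrt(nrow(X)))

# Hypothetical raw p-values for five loadings (invented for this example)
p_raw <- c(0.001, 0.012, 0.035, 0.045, 0.200)
alpha <- 0.05

# Benjamini-Hochberg adjustment (assumed equivalent of use_fdr = TRUE)
p_bh <- p.adjust(p_raw, method = "BH")

sig_raw <- p_raw < alpha   # significance without correction
sig_bh  <- p_bh < alpha    # significance after FDR correction

print(data.frame(p_raw, p_bh, sig_raw, sig_bh))
```

With the correction applied, fewer loadings pass the alpha threshold, which is the expected trade-off when testing many loadings at once.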
High PC1 entropy: "Maximum entropy systemic stochasticity" - the dominant factor captures undifferentiated movement, suggesting noise rather than latent structure.
Low PC1 entropy: "Differentiated latent structure" - specific variables dominate, indicating meaningful groupings.
Significant loadings: Variables with p < alpha after bootstrap testing reliably load on that component.
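The contrast between high and low entropy can be made concrete with a small calculation. This is a sketch, assuming entropy is the Shannon entropy (natural log) of squared loadings normalised to sum to one; the helper loading_entropy is hypothetical, and the package may use a different log base or normalisation.

```r
# Hypothetical helper: Shannon entropy of normalised squared loadings.
loading_entropy <- function(loadings) {
  w <- loadings^2 / sum(loadings^2)  # squared loadings as weights summing to 1
  w <- w[w > 0]                      # drop zeros so that 0 * log(0) is skipped
  -sum(w * log(w))
}

p <- 10

# Diffuse component: every variable loads equally
diffuse <- rep(1 / sqrt(p), p)

# Concentrated component: one variable dominates
concentrated <- c(0.99, rep(sqrt((1 - 0.99^2) / (p - 1)), p - 1))

h_diffuse <- loading_entropy(diffuse)       # equals log(p), the maximum
h_conc    <- loading_entropy(concentrated)  # much closer to zero

print(c(diffuse = h_diffuse, concentrated = h_conc, maximum = log(p)))
```

Under this definition, equal loadings reach the maximum log(p), while a component dominated by one variable falls well below it.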
The analysis proceeds in several stages:
1. Standard PCA: Eigendecomposition of the correlation (if scaled) or covariance matrix to extract principal components.
2. Rotation (optional): Varimax rotation maximizes the variance of squared loadings within components, producing cleaner simple structure. Oblimin allows correlated factors.
3. Bootstrap Significance Testing: For each bootstrap replicate:
   - Resample rows using block bootstrap (preserving temporal dependence)
   - Perform PCA on resampled data
   - Apply Procrustes rotation to align with original
   - Record absolute loadings
The empirical p-value for each loading is the proportion of bootstrap replicates in which the absolute loading exceeds the absolute value of the original loading.
4. Entropy Calculation: Shannon entropy of squared loadings indicates whether explanatory power is concentrated (low entropy) or diffuse (high entropy). High entropy on PC1 suggests systemic co-movement rather than differentiated structure.
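The four stages above can be sketched in base R. This is not the package's implementation: the helpers pca_loadings, block_indices, and procrustes_align are hypothetical, a moving block bootstrap is assumed, and the p-value follows the wording above literally. How pca_bootstrap actually builds its null distribution may differ.

```r
set.seed(1)
n <- 120; p <- 6; k <- 2
X <- matrix(rnorm(n * p), ncol = p)
X[, 1:3] <- X[, 1:3] + rnorm(n)   # same shock added to columns 1-3

# Stage 1: PCA via eigendecomposition of the correlation matrix
pca_loadings <- function(X, k) {
  e <- eigen(cor(X), symmetric = TRUE)
  # loadings = eigenvectors scaled by the square root of the eigenvalues
  e$vectors[, 1:k, drop = FALSE] %*% diag(sqrt(e$values[1:k]), k)
}

# Stage 2: varimax rotation of the original loadings
L0 <- unclass(varimax(pca_loadings(X, k))$loadings)

# Stage 3a: row indices for a moving block bootstrap (assumed scheme)
block_indices <- function(n, block_length) {
  n_blocks <- ceiling(n / block_length)
  starts <- sample.int(n - block_length + 1, n_blocks, replace = TRUE)
  idx <- unlist(lapply(starts, function(s) s:(s + block_length - 1)))
  idx[seq_len(n)]   # trim to the original sample size
}

# Stage 3b: orthogonal Procrustes rotation of A towards a target matrix
procrustes_align <- function(A, target) {
  s <- svd(crossprod(A, target))
  A %*% s$u %*% t(s$v)
}

# Stage 3c: count how often bootstrap |loading| exceeds original |loading|
n_boot <- 50
block_length <- ceiling(sqrt(n))
exceed <- matrix(0, p, k)
for (b in seq_len(n_boot)) {
  Xb <- X[block_indices(n, block_length), , drop = FALSE]
  Lb <- procrustes_align(pca_loadings(Xb, k), L0)
  exceed <- exceed + (abs(Lb) > abs(L0))
}
p_values <- exceed / n_boot

# Stage 4: Shannon entropy of normalised squared loadings per component
entropy <- apply(L0, 2, function(l) {
  w <- l^2 / sum(l^2)
  -sum(w[w > 0] * log(w[w > 0]))
})

print(round(p_values, 2))
print(round(entropy, 3))
```

Procrustes alignment is needed because the sign and order of components can flip between replicates; without it, bootstrap loadings would not be comparable with the original ones.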
Jolliffe, I. T. (2002). Principal Component Analysis (2nd ed.). Springer.
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23(3), 187-200.
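# Simulate 100 time points of 10 independent standard-normal variables
set.seed(123)
n <- 100
p <- 10
X <- matrix(rnorm(n * p), ncol = p)
colnames(X) <- paste0("V", 1:p)

# Extract 3 components; n_boot is kept small so the example runs quickly
result <- pca_bootstrap(X, n_components = 3, n_boot = 50)

# Per-component summary
print(result$summary_by_component)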