This function performs comprehensive analysis of variable importance in Principal Component Analysis, generating multiple visualization types including loading biplots, importance rankings, PC comparisons, and heatmaps. It extracts variable contributions to specified principal components and creates publication-ready plots with detailed statistical summaries.
analyze_pca_variable_importance_general(
  pca_result = NULL,
  output_dir = tempdir(),
  experiment_name = "PCA_Analysis",
  pc_x = "PC1",
  pc_y = "PC2",
  color_scheme = "default",
  top_n = 15,
  min_loading_threshold = 0.1,
  save_plots = TRUE,
  show_labels = TRUE,
  verbose = TRUE
)
A list containing:
Named list of ggplot objects: 'biplot', 'importance_bar', 'pc_comparison', 'heatmap'
Data frame with comprehensive variable importance metrics for all variables
Data frame containing the top N most important variables with detailed statistics
List with key analysis metrics and variance explained information
List documenting all parameters used in the analysis
pca_result: A PCA result object. Can be either a prcomp object directly, or a list containing a PCA object in a field named 'pca_result', 'pca', 'result', or 'prcomp'.
output_dir: Character string specifying the directory for saving plots and results (default: tempdir()).
experiment_name: Character string used as a prefix for output files and plot titles (default: "PCA_Analysis").
pc_x: Character string specifying the principal component for x-axis analysis (default: "PC1").
pc_y: Character string specifying the principal component for y-axis analysis (default: "PC2").
color_scheme: Character string specifying the color palette. Options: "default", "viridis", "colorbrewer" (default: "default").
top_n: Numeric value specifying the number of top variables to focus on in detailed analyses (default: 15).
min_loading_threshold: Numeric value specifying the minimum loading threshold for importance filtering (default: 0.1).
save_plots: Logical indicating whether to save plots and results to disk (default: TRUE).
show_labels: Logical indicating whether to show variable labels on the biplot (default: TRUE).
verbose: Logical indicating whether to print detailed progress messages (default: TRUE).
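As a rough illustration of the two input forms accepted for pca_result, the sketch below builds a prcomp object from the built-in iris data and then unwraps the same object from a list. The helper extract_prcomp is hypothetical: it only mirrors the field names given in the argument description, and the function's actual validation may differ.

```r
# Hypothetical helper, not part of the package: illustrates how a prcomp
# object could be located in either of the documented input forms.
extract_prcomp <- function(x) {
  # Form 1: the input is already a prcomp object
  if (inherits(x, "prcomp")) {
    return(x)
  }
  # Form 2: the input is a list holding a prcomp object in a known field
  if (is.list(x)) {
    for (field in c("pca_result", "pca", "result", "prcomp")) {
      if (inherits(x[[field]], "prcomp")) {
        return(x[[field]])
      }
    }
  }
  stop("No prcomp object found in input")
}

pca_direct  <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
pca_wrapped <- list(pca = pca_direct, note = "wrapped in a list")

# Both forms resolve to the same prcomp object
same_object <- identical(extract_prcomp(pca_direct), extract_prcomp(pca_wrapped))
```

Checking for the prcomp class first matters because a prcomp object is itself a list, so the list branch alone would not distinguish the two cases.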
When save_plots = TRUE, the function creates the following files in the specified
output directory (default: tempdir(), which keeps examples CRAN compliant because
nothing is written outside the session's temporary directory):
PNG files for each visualization type
CSV file with complete variable importance rankings
CSV file with selected top variables and detailed metrics
CSV file with analysis summary and metadata
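The output pattern described above can be sketched with base R alone. This is only an illustration of writing a rankings CSV under tempdir(); the file name and column names below are assumptions and are not the names the function actually uses.

```r
# Write results under the session's temporary directory (CRAN-friendly)
output_dir <- file.path(tempdir(), "pca_plots")
dir.create(output_dir, recursive = TRUE, showWarnings = FALSE)

experiment_name <- "PCA_Analysis"
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Minimal stand-in for the variable importance table
rankings <- data.frame(
  variable = rownames(pca$rotation),
  PC1 = pca$rotation[, "PC1"],
  PC2 = pca$rotation[, "PC2"],
  row.names = NULL
)

# Hypothetical file name built from the experiment prefix
csv_path <- file.path(output_dir, paste0(experiment_name, "_variable_importance.csv"))
write.csv(rankings, csv_path, row.names = FALSE)
```

Files written this way disappear when the R session ends, so pass a persistent output_dir if the results need to be kept.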
The function calculates multiple importance metrics for each variable:
PC loadings: Direct loading values for specified principal components
Combined importance: Euclidean norm of the two PC loadings, i.e. the variable's distance from the origin in the loading plane
Contribution percentages: Percent contribution to each PC's total variance
Ranking: Variables ranked by combined importance score
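The metrics listed above can be approximated directly from a prcomp object. The sketch below is an assumption about how they might be computed, not the function's confirmed implementation: combined importance is taken as the Euclidean norm of the two loadings, and the contribution percentage as each variable's share of the squared loadings within a PC. All column names are hypothetical.

```r
pca  <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
pc_x <- "PC1"
pc_y <- "PC2"

# PC loadings for the two selected components
loadings <- pca$rotation[, c(pc_x, pc_y)]

importance <- data.frame(
  variable  = rownames(loadings),
  loading_x = loadings[, pc_x],
  loading_y = loadings[, pc_y],
  row.names = NULL
)

# Assumed formula: Euclidean norm of the two loadings
importance$combined_importance <-
  sqrt(importance$loading_x^2 + importance$loading_y^2)

# Assumed formula: share of squared loadings within each PC, in percent
importance$contrib_x <- 100 * importance$loading_x^2 / sum(importance$loading_x^2)
importance$contrib_y <- 100 * importance$loading_y^2 / sum(importance$loading_y^2)

# Rank variables by combined importance, largest first
importance <- importance[order(-importance$combined_importance), ]
importance$rank <- seq_len(nrow(importance))

top_n <- 3
top_variables <- head(importance, top_n)
```

Under these assumptions the contribution percentages for each PC sum to 100, because each column of pca$rotation is a unit vector.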
Four visualization types are generated:
Loading Biplot: Scatter plot showing variable loadings on both PCs with size indicating importance
Importance Bar Chart: Ranked bar chart of top variables by combined importance
PC Comparison: Side-by-side comparison of absolute loadings for both PCs
Loading Heatmap: Color-coded matrix showing loading values and directions
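To give a feel for the first plot type, here is a minimal ggplot2 sketch of a loading biplot with point size mapped to importance. It is not the function's own plotting code: the importance measure (Euclidean norm of the two loadings), the colors, and the theme are all assumptions chosen for illustration.

```r
library(ggplot2)

pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# One row per variable with its loadings on the two PCs
load_df <- data.frame(
  variable = rownames(pca$rotation),
  PC1 = pca$rotation[, "PC1"],
  PC2 = pca$rotation[, "PC2"],
  row.names = NULL
)

# Assumed importance measure: Euclidean norm of the two loadings
load_df$importance <- sqrt(load_df$PC1^2 + load_df$PC2^2)

biplot_sketch <- ggplot(load_df, aes(x = PC1, y = PC2)) +
  geom_hline(yintercept = 0, linetype = "dashed", colour = "grey60") +
  geom_vline(xintercept = 0, linetype = "dashed", colour = "grey60") +
  geom_point(aes(size = importance), colour = "steelblue") +
  geom_text(aes(label = variable), vjust = -1) +
  labs(title = "Variable loadings", x = "PC1", y = "PC2", size = "Importance") +
  theme_minimal()
```

Printing biplot_sketch draws the plot; dropping the geom_text layer gives the unlabeled variant that show_labels = FALSE describes.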
The function automatically:
Validates input PCA objects from various sources
Calculates variance explained by each principal component
Creates publication-ready plots with consistent theming
Exports detailed CSV files with variable rankings and analysis summaries
Provides comprehensive statistical summaries
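For reference, the variance explained by each principal component can be derived from a prcomp object as shown below. This is the standard calculation from the component standard deviations; whether the function reports it in exactly this form is an assumption.

```r
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Proportion of total variance carried by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
names(var_explained) <- colnames(pca$rotation)

# Percentages for the two components used on the plot axes
axis_variance <- round(100 * var_explained[c("PC1", "PC2")], 1)
```

The same proportions appear in the "Proportion of Variance" row of summary(pca).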
Color schemes provide different aesthetic options:
default: Blue/red palette suitable for most publications
viridis: Colorblind-friendly viridis color scale
colorbrewer: ColorBrewer palettes optimized for scientific visualization
View top variables using head(results$selected_variables)